Merge remote-tracking branch 'acme/perf/urgent' into perf/core

author Arnaldo Carvalho de Melo <acme@redhat.com>

Wed, 4 Mar 2020 13:29:19 +0000 (10:29 -0300)

committer Arnaldo Carvalho de Melo <acme@redhat.com>

Wed, 4 Mar 2020 13:29:19 +0000 (10:29 -0300)
diff --git a/.gitignore b/.gitignore
index 72ef86a5570d28015d0ccb95ccd212bf8820c1c2..2763fce8766c09417fa6a4b9885e04a66c409b03 100644
--- a/.gitignore
+++ b/.gitignore
@@ -100,6 +100,10 @@ modules.order
  /include/ksym/
  /arch/*/include/generated/
  
+# Generated lkdtm tests
+/tools/testing/selftests/lkdtm/*.sh
+!/tools/testing/selftests/lkdtm/run.sh
+
  # stgit generated dirs
  patches-*
  
diff --git a/COPYING b/COPYING
index da4cb28febe66172a9fdf1a235525ae6c00cde1d..a635a38ef9405fdfcfe97f3a435393c1e9cae971 100644
--- a/COPYING
+++ b/COPYING
@@ -16,3 +16,5 @@ In addition, other licenses may also apply. Please see:
         Documentation/process/license-rules.rst
  
  for more details.
+
+All contributions to the Linux Kernel are subject to this COPYING file.
diff --git a/CREDITS b/CREDITS
index a97d3280a627b3665e68c94a574409f71cee1da6..032b5994f4760a13770d1629b8b057195d9266c2 100644
--- a/CREDITS
+++ b/CREDITS
@@ -567,6 +567,11 @@ D: Original author of Amiga FFS filesystem
  S: Orlando, Florida
  S: USA
  
+N: Paul Burton
+E: paulburton@kernel.org
+W: https://pburton.com
+D: MIPS maintainer 2018-2020
+
  N: Lennert Buytenhek
  E: kernel@wantstofly.org
  D: Original (2.4) rewrite of the ethernet bridging code
diff --git a/Documentation/admin-guide/bootconfig.rst b/Documentation/admin-guide/bootconfig.rst
index b342a679639277c41100b13d9cdf6b88d70040b8..cf2edcd09183bef328294cbe67b28983149d283f 100644
--- a/Documentation/admin-guide/bootconfig.rst
+++ b/Documentation/admin-guide/bootconfig.rst
@@ -62,6 +62,30 @@ Or more shorter, written as following::
  In both styles, same key words are automatically merged when parsing it
  at boot time. So you can append similar trees or key-values.
  
+Same-key Values
+---------------
+
+It is prohibited that two or more values or arrays share a same-key.
+For example,::
+
+ foo = bar, baz
+ foo = qux  # !ERROR! we can not re-define same key
+
+If you want to append the value to existing key as an array member,
+you can use ``+=`` operator. For example::
+
+ foo = bar, baz
+ foo += qux
+
+In this case, the key ``foo`` has ``bar``, ``baz`` and ``qux``.
+
+However, a sub-key and a value can not co-exist under a parent key.
+For example, following config is NOT allowed.::
+
+ foo = value1
+ foo.bar = value2 # !ERROR! subkey "bar" and value "value1" can NOT co-exist
+
+
  Comments
  --------
  
@@ -102,9 +126,13 @@ Boot Kernel With a Boot Config
  ==============================
  
  Since the boot configuration file is loaded with initrd, it will be added
-to the end of the initrd (initramfs) image file. The Linux kernel decodes
-the last part of the initrd image in memory to get the boot configuration
-data.
+to the end of the initrd (initramfs) image file with size, checksum and
+12-byte magic word as below.
+
+[initrd][bootconfig][size(u32)][checksum(u32)][#BOOTCONFIG\n]
+
+The Linux kernel decodes the last part of the initrd image in memory to
+get the boot configuration data.
  Because of this "piggyback" method, there is no need to change or
  update the boot loader and the kernel image itself.
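
As an illustration of the trailer layout documented in the hunk above (not part of this commit), the following is a minimal Python sketch that appends a boot config to an initrd image and reads it back. The helper names are hypothetical. The little-endian encoding of the two u32 fields and the byte-sum checksum are assumptions that the text above does not state, and any padding or terminator handling done by the real tools is ignored.

```python
import struct
from typing import Optional

# 12-byte magic word taken from the documented layout:
# [initrd][bootconfig][size(u32)][checksum(u32)][#BOOTCONFIG\n]
BOOTCONFIG_MAGIC = b"#BOOTCONFIG\n"
FOOTER_LEN = 4 + 4 + len(BOOTCONFIG_MAGIC)  # size + checksum + magic


def byte_sum(data: bytes) -> int:
    """Assumed checksum: sum of all bytes, truncated to 32 bits."""
    return sum(data) & 0xFFFFFFFF


def append_bootconfig(initrd: bytes, bootconfig: bytes) -> bytes:
    """Return initrd with the bootconfig, size, checksum and magic appended."""
    # "<II" = two little-endian u32 values (endianness is an assumption).
    footer = struct.pack("<II", len(bootconfig), byte_sum(bootconfig))
    return initrd + bootconfig + footer + BOOTCONFIG_MAGIC


def extract_bootconfig(image: bytes) -> Optional[bytes]:
    """Return the embedded bootconfig, or None if the magic word is absent."""
    if len(image) < FOOTER_LEN or not image.endswith(BOOTCONFIG_MAGIC):
        return None
    size, checksum = struct.unpack_from("<II", image, len(image) - FOOTER_LEN)
    start = len(image) - FOOTER_LEN - size
    if start < 0:
        raise ValueError("size field is larger than the image")
    data = image[start:start + size]
    if byte_sum(data) != checksum:
        raise ValueError("checksum mismatch")
    return data
```

Because the reader only looks at the tail of the image, an initrd without the magic word is simply treated as having no boot config, which matches the "piggyback" description in the text.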
  
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index dbc22d68462751d2bb59ab35784c1c61c84bbb0a..c07815d230bcd4bde32a93c0615326f37a68824c 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -136,6 +136,10 @@
                         dynamic table installation which will install SSDT
                         tables to /sys/firmware/acpi/tables/dynamic.
  
+       acpi_no_watchdog        [HW,ACPI,WDT]
+                       Ignore the ACPI-based watchdog interface (WDAT) and let
+                       a native driver control the watchdog device instead.
+
         acpi_rsdp=      [ACPI,EFI,KEXEC]
                         Pass the RSDP address to the kernel, mostly used
                         on machines running EFI runtime service to boot the
diff --git a/Documentation/arm64/memory.rst b/Documentation/arm64/memory.rst
index 02e02175e6f56a75da568cd0112bf71769b8f938..cf03b3290800c25ea7c5d8cba093b0189e3d727a 100644
--- a/Documentation/arm64/memory.rst
+++ b/Documentation/arm64/memory.rst
@@ -129,7 +129,7 @@ this logic.
  
  As a single binary will need to support both 48-bit and 52-bit VA
  spaces, the VMEMMAP must be sized large enough for 52-bit VAs and
-also must be sized large enought to accommodate a fixed PAGE_OFFSET.
+also must be sized large enough to accommodate a fixed PAGE_OFFSET.
  
  Most code in the kernel should not need to consider the VA_BITS, for
  code that does need to know the VA size the variables are
diff --git a/Documentation/arm64/tagged-address-abi.rst b/Documentation/arm64/tagged-address-abi.rst
index d4a85d535bf99e09cfe5c3dc41652eaa3b64f21b..4a9d9c794ee5d889638d1a3af008dec06d11feaf 100644
--- a/Documentation/arm64/tagged-address-abi.rst
+++ b/Documentation/arm64/tagged-address-abi.rst
@@ -44,8 +44,15 @@ The AArch64 Tagged Address ABI has two stages of relaxation depending
  how the user addresses are used by the kernel:
  
  1. User addresses not accessed by the kernel but used for address space
-   management (e.g. ``mmap()``, ``mprotect()``, ``madvise()``). The use
-   of valid tagged pointers in this context is always allowed.
+   management (e.g. ``mprotect()``, ``madvise()``). The use of valid
+   tagged pointers in this context is allowed with the exception of
+   ``brk()``, ``mmap()`` and the ``new_address`` argument to
+   ``mremap()`` as these have the potential to alias with existing
+   user addresses.
+
+   NOTE: This behaviour changed in v5.6 and so some earlier kernels may
+   incorrectly accept valid tagged pointers for the ``brk()``,
+   ``mmap()`` and ``mremap()`` system calls.
  
  2. User addresses accessed by the kernel (e.g. ``write()``). This ABI
     relaxation is disabled by default and the application thread needs to
diff --git a/Documentation/dev-tools/kunit/usage.rst b/Documentation/dev-tools/kunit/usage.rst
index 7cd56a1993b14ad75e5d21c5b3c23c29bee8f1cb..607758a66a99cc04b6abb71b3f3d610a3ccaf26c 100644
--- a/Documentation/dev-tools/kunit/usage.rst
+++ b/Documentation/dev-tools/kunit/usage.rst
@@ -551,6 +551,7 @@ options to your ``.config``:
  Once the kernel is built and installed, a simple
  
  .. code-block:: bash
+
         modprobe example-test
  
  ...will run the tests.
diff --git a/Documentation/devicetree/bindings/display/allwinner,sun4i-a10-tcon.yaml b/Documentation/devicetree/bindings/display/allwinner,sun4i-a10-tcon.yaml
index 86ad617d2327d459bd6c11c95705248f35b9df0d..5ff9cf26ca380b4886a8848ee308e3f8a9838f6a 100644
--- a/Documentation/devicetree/bindings/display/allwinner,sun4i-a10-tcon.yaml
+++ b/Documentation/devicetree/bindings/display/allwinner,sun4i-a10-tcon.yaml
@@ -43,9 +43,13 @@ properties:
          - enum:
            - allwinner,sun8i-h3-tcon-tv
            - allwinner,sun50i-a64-tcon-tv
-          - allwinner,sun50i-h6-tcon-tv
          - const: allwinner,sun8i-a83t-tcon-tv
  
+      - items:
+        - enum:
+          - allwinner,sun50i-h6-tcon-tv
+        - const: allwinner,sun8i-r40-tcon-tv
+
    reg:
      maxItems: 1
  
diff --git a/Documentation/devicetree/bindings/input/ilitek,ili2xxx.txt b/Documentation/devicetree/bindings/input/ilitek,ili2xxx.txt
index dc194b2c151ac710b64dcc0c36709656f7b103ae..cdcaa3f52d25367074161e4f80049f66402dc972 100644
--- a/Documentation/devicetree/bindings/input/ilitek,ili2xxx.txt
+++ b/Documentation/devicetree/bindings/input/ilitek,ili2xxx.txt
@@ -1,9 +1,10 @@
-Ilitek ILI210x/ILI2117/ILI251x touchscreen controller
+Ilitek ILI210x/ILI2117/ILI2120/ILI251x touchscreen controller
  
  Required properties:
  - compatible:
      ilitek,ili210x for ILI210x
      ilitek,ili2117 for ILI2117
+    ilitek,ili2120 for ILI2120
      ilitek,ili251x for ILI251x
  
  - reg: The I2C address of the device
diff --git a/Documentation/devicetree/bindings/media/allwinner,sun4i-a10-csi.yaml b/Documentation/devicetree/bindings/media/allwinner,sun4i-a10-csi.yaml
index 9af873b43acd8742ade8ed944e2cfea4ec045369..8453ee340b9fb0abffb19f48c95391bb0e93e9c9 100644
--- a/Documentation/devicetree/bindings/media/allwinner,sun4i-a10-csi.yaml
+++ b/Documentation/devicetree/bindings/media/allwinner,sun4i-a10-csi.yaml
@@ -33,24 +33,40 @@ properties:
      maxItems: 1
  
    clocks:
-    minItems: 2
-    maxItems: 3
-    items:
-      - description: The CSI interface clock
-      - description: The CSI ISP clock
-      - description: The CSI DRAM clock
+    oneOf:
+      - items:
+        - description: The CSI interface clock
+        - description: The CSI DRAM clock
+
+      - items:
+        - description: The CSI interface clock
+        - description: The CSI ISP clock
+        - description: The CSI DRAM clock
  
    clock-names:
-    minItems: 2
-    maxItems: 3
-    items:
-      - const: bus
-      - const: isp
-      - const: ram
+    oneOf:
+      - items:
+        - const: bus
+        - const: ram
+
+      - items:
+        - const: bus
+        - const: isp
+        - const: ram
  
    resets:
      maxItems: 1
  
+  # FIXME: This should be made required eventually once every SoC will
+  # have the MBUS declared.
+  interconnects:
+    maxItems: 1
+
+  # FIXME: This should be made required eventually once every SoC will
+  # have the MBUS declared.
+  interconnect-names:
+    const: dma-mem
+
    # See ./video-interfaces.txt for details
    port:
      type: object
diff --git a/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra124-emc.yaml b/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra124-emc.yaml
index dd1843489ad15075fa2b31c48fa00e18d40ae032..3e0a8a92d6529ef6f2ed0901b9264e1496f7d3b8 100644
--- a/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra124-emc.yaml
+++ b/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra124-emc.yaml
@@ -347,6 +347,7 @@ examples:
          interrupts = <GIC_SPI 77 IRQ_TYPE_LEVEL_HIGH>;
  
          #iommu-cells = <1>;
+        #reset-cells = <1>;
      };
  
      external-memory-controller@7001b000 {
@@ -363,20 +364,23 @@ examples:
              timing-0 {
                  clock-frequency = <12750000>;
  
-                nvidia,emc-zcal-cnt-long = <0x00000042>;
-                nvidia,emc-auto-cal-interval = <0x001fffff>;
-                nvidia,emc-ctt-term-ctrl = <0x00000802>;
-                nvidia,emc-cfg = <0x73240000>;
-                nvidia,emc-cfg-2 = <0x000008c5>;
-                nvidia,emc-sel-dpd-ctrl = <0x00040128>;
-                nvidia,emc-bgbias-ctl0 = <0x00000008>;
                  nvidia,emc-auto-cal-config = <0xa1430000>;
                  nvidia,emc-auto-cal-config2 = <0x00000000>;
                  nvidia,emc-auto-cal-config3 = <0x00000000>;
-                nvidia,emc-mode-reset = <0x80001221>;
+                nvidia,emc-auto-cal-interval = <0x001fffff>;
+                nvidia,emc-bgbias-ctl0 = <0x00000008>;
+                nvidia,emc-cfg = <0x73240000>;
+                nvidia,emc-cfg-2 = <0x000008c5>;
+                nvidia,emc-ctt-term-ctrl = <0x00000802>;
                  nvidia,emc-mode-1 = <0x80100003>;
                  nvidia,emc-mode-2 = <0x80200008>;
                  nvidia,emc-mode-4 = <0x00000000>;
+                nvidia,emc-mode-reset = <0x80001221>;
+                nvidia,emc-mrs-wait-cnt = <0x000e000e>;
+                nvidia,emc-sel-dpd-ctrl = <0x00040128>;
+                nvidia,emc-xm2dqspadctrl2 = <0x0130b118>;
+                nvidia,emc-zcal-cnt-long = <0x00000042>;
+                nvidia,emc-zcal-interval = <0x00000000>;
  
                  nvidia,emc-configuration = <
                      0x00000000 /* EMC_RC */
diff --git a/Documentation/devicetree/bindings/mmc/ti-omap-hsmmc.txt b/Documentation/devicetree/bindings/mmc/ti-omap-hsmmc.txt
index 19f5508a75696b722624c550700ee74f72b2362f..4a9145ef15d6b4a4485649eb85757445aa0bffb6 100644
--- a/Documentation/devicetree/bindings/mmc/ti-omap-hsmmc.txt
+++ b/Documentation/devicetree/bindings/mmc/ti-omap-hsmmc.txt
@@ -124,7 +124,7 @@ not every application needs SDIO irq, e.g. MMC cards.
                 pinctrl-1 = <&mmc1_idle>;
                 pinctrl-2 = <&mmc1_sleep>;
                 ...
-               interrupts-extended = <&intc 64 &gpio2 28 GPIO_ACTIVE_LOW>;
+               interrupts-extended = <&intc 64 &gpio2 28 IRQ_TYPE_LEVEL_LOW>;
         };
  
         mmc1_idle : pinmux_cirq_pin {
diff --git a/Documentation/devicetree/bindings/net/mdio.yaml b/Documentation/devicetree/bindings/net/mdio.yaml
index 5d08d2ffd4ebcc0af0d542fa2d00700f5382b555..50c3397a82bc4cea8efc364827c4fefb0e0838f8 100644
--- a/Documentation/devicetree/bindings/net/mdio.yaml
+++ b/Documentation/devicetree/bindings/net/mdio.yaml
@@ -56,7 +56,6 @@ patternProperties:
  examples:
    - |
      davinci_mdio: mdio@5c030000 {
-        compatible = "ti,davinci_mdio";
          reg = <0x5c030000 0x1000>;
          #address-cells = <1>;
          #size-cells = <0>;
diff --git a/Documentation/driver-api/ipmb.rst b/Documentation/driver-api/ipmb.rst
index 3ec3baed84c4490a724e4aef92f098664e689913..209c49e051163f2c411e5cbe2fa135438333c56b 100644
--- a/Documentation/driver-api/ipmb.rst
+++ b/Documentation/driver-api/ipmb.rst
@@ -71,9 +71,13 @@ b) Example for device tree::
              ipmb@10 {
                      compatible = "ipmb-dev";
                      reg = <0x10>;
+                    i2c-protocol;
              };
       };
  
+If xmit of data to be done using raw i2c block vs smbus
+then "i2c-protocol" needs to be defined as above.
+
  2) Manually from Linux::
  
       modprobe ipmb-dev-int
diff --git a/Documentation/filesystems/zonefs.txt b/Documentation/filesystems/zonefs.txt
index 935bf22031ca1ce8ea2df2cfc8cd6cb6a33d672b..d54fa98ac1582866ecb7abc4cf19d0790ea3673d 100644
--- a/Documentation/filesystems/zonefs.txt
+++ b/Documentation/filesystems/zonefs.txt
@@ -134,7 +134,7 @@ Sequential zone files can only be written sequentially, starting from the file
  end, that is, write operations can only be append writes. Zonefs makes no
  attempt at accepting random writes and will fail any write request that has a
  start offset not corresponding to the end of the file, or to the end of the last
-write issued and still in-flight (for asynchrnous I/O operations).
+write issued and still in-flight (for asynchronous I/O operations).
  
  Since dirty page writeback by the page cache does not guarantee a sequential
  write pattern, zonefs prevents buffered writes and writeable shared mappings
@@ -142,7 +142,7 @@ on sequential files. Only direct I/O writes are accepted for these files.
  zonefs relies on the sequential delivery of write I/O requests to the device
  implemented by the block layer elevator. An elevator implementing the sequential
  write feature for zoned block device (ELEVATOR_F_ZBD_SEQ_WRITE elevator feature)
-must be used. This type of elevator (e.g. mq-deadline) is the set by default
+must be used. This type of elevator (e.g. mq-deadline) is set by default
  for zoned block devices on device initialization.
  
  There are no restrictions on the type of I/O used for read operations in
@@ -196,7 +196,7 @@ additional conditions that result in I/O errors.
    may still happen in the case of a partial failure of a very large direct I/O
    operation split into multiple BIOs/requests or asynchronous I/O operations.
    If one of the write request within the set of sequential write requests
-  issued to the device fails, all write requests after queued after it will
+  issued to the device fails, all write requests queued after it will
    become unaligned and fail.
  
  * Delayed write errors: similarly to regular block devices, if the device side
@@ -207,7 +207,7 @@ additional conditions that result in I/O errors.
    causing all data to be dropped after the sector that caused the error.
  
  All I/O errors detected by zonefs are notified to the user with an error code
-return for the system call that trigered or detected the error. The recovery
+return for the system call that triggered or detected the error. The recovery
  actions taken by zonefs in response to I/O errors depend on the I/O type (read
  vs write) and on the reason for the error (bad sector, unaligned writes or zone
  condition change).
@@ -222,7 +222,7 @@ condition change).
  * A zone condition change to read-only or offline also always triggers zonefs
    I/O error recovery.
  
-Zonefs minimal I/O error recovery may change a file size and a file access
+Zonefs minimal I/O error recovery may change a file size and file access
  permissions.
  
  * File size changes:
@@ -237,7 +237,7 @@ permissions.
    A file size may also be reduced to reflect a delayed write error detected on
    fsync(): in this case, the amount of data effectively written in the zone may
    be less than originally indicated by the file inode size. After such I/O
-  error, zonefs always fixes a file inode size to reflect the amount of data
+  error, zonefs always fixes the file inode size to reflect the amount of data
    persistently stored in the file zone.
  
  * Access permission changes:
@@ -281,11 +281,11 @@ Further notes:
    permissions to read-only applies to all files. The file system is remounted
    read-only.
  * Access permission and file size changes due to the device transitioning zones
-  to the offline condition are permanent. Remounting or reformating the device
+  to the offline condition are permanent. Remounting or reformatting the device
    with mkfs.zonefs (mkzonefs) will not change back offline zone files to a good
    state.
  * File access permission changes to read-only due to the device transitioning
-  zones to the read-only condition are permanent. Remounting or reformating
+  zones to the read-only condition are permanent. Remounting or reformatting
    the device will not re-enable file write access.
  * File access permission changes implied by the remount-ro, zone-ro and
    zone-offline mount options are temporary for zones in a good condition.
@@ -301,13 +301,13 @@ Mount options
  
  zonefs define the "errors=<behavior>" mount option to allow the user to specify
  zonefs behavior in response to I/O errors, inode size inconsistencies or zone
-condition chages. The defined behaviors are as follow:
+condition changes. The defined behaviors are as follow:
  * remount-ro (default)
  * zone-ro
  * zone-offline
  * repair
  
-The I/O error actions defined for each behavior is detailed in the previous
+The I/O error actions defined for each behavior are detailed in the previous
  section.
  
  Zonefs User Space Tools
diff --git a/Documentation/hwmon/xdpe12284.rst b/Documentation/hwmon/xdpe12284.rst
index 6b7ae98cc536f6382a947436f3ede87a3b112fdc..67d1f87808e57981a18c55c4877d6ab80659f5ce 100644
--- a/Documentation/hwmon/xdpe12284.rst
+++ b/Documentation/hwmon/xdpe12284.rst
@@ -24,6 +24,7 @@ This driver implements support for Infineon Multi-phase XDPE122 family
  dual loop voltage regulators.
  The family includes XDPE12284 and XDPE12254 devices.
  The devices from this family complaint with:
+
  - Intel VR13 and VR13HC rev 1.3, IMVP8 rev 1.2 and IMPVP9 rev 1.3 DC-DC
    converter specification.
  - Intel SVID rev 1.9. protocol.
diff --git a/Documentation/kbuild/makefiles.rst b/Documentation/kbuild/makefiles.rst
index 0e0eb2c8da7d541847ab2af34208f38c276898d5..6bc126a14b3d24cc9ba7f24a0975caecb1470153 100644
--- a/Documentation/kbuild/makefiles.rst
+++ b/Documentation/kbuild/makefiles.rst
@@ -765,7 +765,7 @@ is not sufficient this sometimes needs to be explicit.
         Example::
  
                 #arch/x86/boot/Makefile
-               subdir- := compressed/
+               subdir- := compressed
  
  The above assignment instructs kbuild to descend down in the
  directory compressed/ when "make clean" is executed.
@@ -1379,9 +1379,6 @@ See subsequent chapter for the syntax of the Kbuild file.
         in arch/$(ARCH)/include/(uapi/)/asm, Kbuild will automatically generate
         a wrapper of the asm-generic one.
  
-       The convention is to list one subdir per line and
-       preferably in alphabetic order.
-
  8 Kbuild Variables
  ==================
  
diff --git a/Documentation/networking/phy.rst b/Documentation/networking/phy.rst
index 1e4735cc055351ff81bee0541dc221d974547bb1..256106054c8cb0b9d7fdc0a01f35100a388e6f08 100644
--- a/Documentation/networking/phy.rst
+++ b/Documentation/networking/phy.rst
@@ -487,8 +487,9 @@ phy_register_fixup_for_id()::
  The stubs set one of the two matching criteria, and set the other one to
  match anything.
  
-When phy_register_fixup() or \*_for_uid()/\*_for_id() is called at module,
-unregister fixup and free allocate memory are required.
+When phy_register_fixup() or \*_for_uid()/\*_for_id() is called at module load
+time, the module needs to unregister the fixup and free allocated memory when
+it's unloaded.
  
  Call one of following function before unloading module::
  
diff --git a/Documentation/power/index.rst b/Documentation/power/index.rst
index 002e42745263a6416de98bb93532786ab444b5cf..ced8a80074348ec744d7bedff3ce681a0162d94b 100644
--- a/Documentation/power/index.rst
+++ b/Documentation/power/index.rst
@@ -13,7 +13,6 @@ Power Management
      drivers-testing
      energy-model
      freezing-of-tasks
-    interface
      opp
      pci
      pm_qos_interface
diff --git a/Documentation/process/embargoed-hardware-issues.rst b/Documentation/process/embargoed-hardware-issues.rst
index 33edae6545994830942c317a1aa2f8ee8666f407..a19d084f9b2cdefd2fd90810bb9b900b64c0f68b 100644
--- a/Documentation/process/embargoed-hardware-issues.rst
+++ b/Documentation/process/embargoed-hardware-issues.rst
@@ -244,23 +244,23 @@ disclosure of a particular issue, unless requested by a response team or by
  an involved disclosed party. The current ambassadors list:
  
    ============= ========================================================
-  ARM
+  ARM           Grant Likely <grant.likely@arm.com>
    AMD          Tom Lendacky <tom.lendacky@amd.com>
    IBM
    Intel                Tony Luck <tony.luck@intel.com>
    Qualcomm     Trilok Soni <tsoni@codeaurora.org>
  
-  Microsoft    Sasha Levin <sashal@kernel.org>
+  Microsoft    James Morris <jamorris@linux.microsoft.com>
    VMware
    Xen          Andrew Cooper <andrew.cooper3@citrix.com>
  
-  Canonical    Tyler Hicks <tyhicks@canonical.com>
+  Canonical    John Johansen <john.johansen@canonical.com>
    Debian       Ben Hutchings <ben@decadent.org.uk>
    Oracle       Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    Red Hat      Josh Poimboeuf <jpoimboe@redhat.com>
    SUSE         Jiri Kosina <jkosina@suse.cz>
  
-  Amazon       Peter Bowen <pzb@amzn.com>
+  Amazon
    Google       Kees Cook <keescook@chromium.org>
    ============= ========================================================
  
diff --git a/Documentation/sphinx/parallel-wrapper.sh b/Documentation/sphinx/parallel-wrapper.sh
index 7daf5133bdd3144df4f9ae7c624effabaa9e3c01..e54c44ce117d51835f7aeaba820ca48256681915 100644
--- a/Documentation/sphinx/parallel-wrapper.sh
+++ b/Documentation/sphinx/parallel-wrapper.sh
@@ -30,4 +30,4 @@ if [ -n "$parallel" ] ; then
         parallel="-j$parallel"
  fi
  
-exec "$sphinx" "$parallel" "$@"
+exec "$sphinx" $parallel "$@"
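
To illustrate what the quoting change in the hunk above affects (not part of this commit), the Python sketch below asks a POSIX `sh` how many arguments each form produces when `$parallel` is empty. The `count_args` helper and the `-b html` arguments are hypothetical stand-ins for the real sphinx-build command line, and the sketch assumes an `sh` binary is available on the PATH.

```python
import subprocess


def count_args(script: str) -> int:
    """Run a small sh script that prints $# and return that number."""
    result = subprocess.run(
        ["sh", "-c", script], capture_output=True, text=True, check=True
    )
    return int(result.stdout.strip())


# Quoted: an empty $parallel still becomes one (empty) argument.
QUOTED = 'parallel=""; set -- "$parallel" -b html; echo $#'

# Unquoted: an empty $parallel expands to nothing and is dropped.
UNQUOTED = 'parallel=""; set -- $parallel -b html; echo $#'

# With a non-empty value both forms pass it through as one argument.
QUOTED_SET = 'parallel="-j4"; set -- "$parallel" -b html; echo $#'
UNQUOTED_SET = 'parallel="-j4"; set -- $parallel -b html; echo $#'

if __name__ == "__main__":
    for name in ("QUOTED", "UNQUOTED", "QUOTED_SET", "UNQUOTED_SET"):
        print(name, count_args(globals()[name]))
```

With the quoted form the program being exec'd receives an extra empty-string argument whenever `$parallel` is unset, which is presumably what the unquoted expansion avoids.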
diff --git a/Documentation/translations/zh_CN/process/embargoed-hardware-issues.rst b/Documentation/translations/zh_CN/process/embargoed-hardware-issues.rst
index b93f1af6826131fe9f1996f8ea1e8fa35d785871..88273ebe7823d4512fdd59e1823876cbfba0a66c 100644
--- a/Documentation/translations/zh_CN/process/embargoed-hardware-issues.rst
+++ b/Documentation/translations/zh_CN/process/embargoed-hardware-issues.rst
@@ -183,7 +183,7 @@ CVE分配
    VMware
    Xen          Andrew Cooper <andrew.cooper3@citrix.com>
  
-  Canonical    Tyler Hicks <tyhicks@canonical.com>
+  Canonical    John Johansen <john.johansen@canonical.com>
    Debian       Ben Hutchings <ben@decadent.org.uk>
    Oracle       Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    Red Hat      Josh Poimboeuf <jpoimboe@redhat.com>
diff --git a/Documentation/virtual/guest-halt-polling.txt b/Documentation/virt/guest-halt-polling.rst
similarity index 91%
rename from Documentation/virtual/guest-halt-polling.txt
rename to Documentation/virt/guest-halt-polling.rst
index b3a2a294532da0729a101b9ebb2751f6676ddf4b..b4e747942417ad3b94e83fdf5e264ffba71e98ba 100644
--- a/Documentation/virtual/guest-halt-polling.txt
+++ b/Documentation/virt/guest-halt-polling.rst
@@ -1,9 +1,11 @@
+==================
  Guest halt polling
  ==================
  
  The cpuidle_haltpoll driver, with the haltpoll governor, allows
  the guest vcpus to poll for a specified amount of time before
  halting.
+
  This provides the following benefits to host side polling:
  
         1) The POLL flag is set while polling is performed, which allows
@@ -29,18 +31,21 @@ Module Parameters
  The haltpoll governor has 5 tunable module parameters:
  
  1) guest_halt_poll_ns:
+
  Maximum amount of time, in nanoseconds, that polling is
  performed before halting.
  
  Default: 200000
  
  2) guest_halt_poll_shrink:
+
  Division factor used to shrink per-cpu guest_halt_poll_ns when
  wakeup event occurs after the global guest_halt_poll_ns.
  
  Default: 2
  
  3) guest_halt_poll_grow:
+
  Multiplication factor used to grow per-cpu guest_halt_poll_ns
  when event occurs after per-cpu guest_halt_poll_ns
  but before global guest_halt_poll_ns.
@@ -48,6 +53,7 @@ but before global guest_halt_poll_ns.
  Default: 2
  
  4) guest_halt_poll_grow_start:
+
  The per-cpu guest_halt_poll_ns eventually reaches zero
  in case of an idle system. This value sets the initial
  per-cpu guest_halt_poll_ns when growing. This can
@@ -66,7 +72,7 @@ high once achieves global guest_halt_poll_ns value).
  
  Default: Y
  
-The module parameters can be set from the debugfs files in:
+The module parameters can be set from the debugfs files in::
  
         /sys/module/haltpoll/parameters/
  
@@ -74,5 +80,5 @@ Further Notes
  =============
  
  - Care should be taken when setting the guest_halt_poll_ns parameter as a
-large value has the potential to drive the cpu usage to 100% on a machine which
-would be almost entirely idle otherwise.
+  large value has the potential to drive the cpu usage to 100% on a machine
+  which would be almost entirely idle otherwise.
diff --git a/Documentation/virt/index.rst b/Documentation/virt/index.rst
index 062ffb5270438740f007479c33b022e5287580ce..de1ab81df95802b2a8a5b34e6b05ef98d422f6a0 100644
--- a/Documentation/virt/index.rst
+++ b/Documentation/virt/index.rst
@@ -8,7 +8,9 @@ Linux Virtualization Support
     :maxdepth: 2
  
     kvm/index
+   uml/user_mode_linux
     paravirt_ops
+   guest-halt-polling
  
  .. only:: html and subproject
  
diff --git a/Documentation/virt/kvm/api.txt b/Documentation/virt/kvm/api.rst
similarity index 71%
rename from Documentation/virt/kvm/api.txt
rename to Documentation/virt/kvm/api.rst
index c6e1ce5d40de992df7acd2c3782d619fd0838721..ebd383fba9399d02b7acea47b1ccbbf333949dad 100644
--- a/Documentation/virt/kvm/api.txt
+++ b/Documentation/virt/kvm/api.rst
@@ -1,8 +1,11 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===================================================================
  The Definitive KVM (Kernel-based Virtual Machine) API Documentation
  ===================================================================
  
  1. General description
-----------------------
+======================
  
  The kvm API is a set of ioctls that are issued to control various aspects
  of a virtual machine.  The ioctls belong to the following classes:
@@ -33,7 +36,7 @@ of a virtual machine.  The ioctls belong to the following classes:
     was used to create the VM.
  
  2. File descriptors
--------------------
+===================
  
  The kvm API is centered around file descriptors.  An initial
  open("/dev/kvm") obtains a handle to the kvm subsystem; this handle
@@ -70,7 +73,7 @@ the VM is shut down.
  
  
  3. Extensions
--------------
+=============
  
  As of Linux 2.6.22, the KVM ABI has been stabilized: no backward
  incompatible change are allowed.  However, there is an extension
@@ -84,13 +87,14 @@ set of ioctls is available for application use.
  
  
  4. API description
-------------------
+==================
  
  This section describes ioctls that can be used to control kvm guests.
  For each ioctl, the following information is provided along with a
  description:
  
-  Capability: which KVM extension provides this ioctl.  Can be 'basic',
+  Capability:
+      which KVM extension provides this ioctl.  Can be 'basic',
        which means that is will be provided by any kernel that supports
        API version 12 (see section 4.1), a KVM_CAP_xyz constant, which
        means availability needs to be checked with KVM_CHECK_EXTENSION
@@ -99,24 +103,29 @@ description:
        availability: for kernels that don't support the ioctl,
        the ioctl returns -ENOTTY.
  
-  Architectures: which instruction set architectures provide this ioctl.
+  Architectures:
+      which instruction set architectures provide this ioctl.
        x86 includes both i386 and x86_64.
  
-  Type: system, vm, or vcpu.
+  Type:
+      system, vm, or vcpu.
  
-  Parameters: what parameters are accepted by the ioctl.
+  Parameters:
+      what parameters are accepted by the ioctl.
  
-  Returns: the return value.  General error numbers (EBADF, ENOMEM, EINVAL)
+  Returns:
+      the return value.  General error numbers (EBADF, ENOMEM, EINVAL)
        are not detailed, but errors with specific meanings are.
  
  
  4.1 KVM_GET_API_VERSION
+-----------------------
  
-Capability: basic
-Architectures: all
-Type: system ioctl
-Parameters: none
-Returns: the constant KVM_API_VERSION (=12)
+:Capability: basic
+:Architectures: all
+:Type: system ioctl
+:Parameters: none
+:Returns: the constant KVM_API_VERSION (=12)
  
  This identifies the API version as the stable kvm API. It is not
  expected that this number will change.  However, Linux 2.6.20 and
@@ -127,12 +136,13 @@ described as 'basic' will be available.
  
  
  4.2 KVM_CREATE_VM
+-----------------
  
-Capability: basic
-Architectures: all
-Type: system ioctl
-Parameters: machine type identifier (KVM_VM_*)
-Returns: a VM fd that can be used to control the new virtual machine.
+:Capability: basic
+:Architectures: all
+:Type: system ioctl
+:Parameters: machine type identifier (KVM_VM_*)
+:Returns: a VM fd that can be used to control the new virtual machine.
  
  The new VM has no virtual cpus and no memory.
  You probably want to use 0 as machine type.
@@ -155,17 +165,17 @@ identifier, where IPA_Bits is the maximum width of any physical
  address used by the VM. The IPA_Bits is encoded in bits[7-0] of the
  machine type identifier.
  
-e.g, to configure a guest to use 48bit physical address size :
+e.g, to configure a guest to use 48bit physical address size::
  
      vm_fd = ioctl(dev_fd, KVM_CREATE_VM, KVM_VM_TYPE_ARM_IPA_SIZE(48));
  
-The requested size (IPA_Bits) must be :
-  0 - Implies default size, 40bits (for backward compatibility)
+The requested size (IPA_Bits) must be:
  
-  or
-
-  N - Implies N bits, where N is a positive integer such that,
+ ==   =========================================================
+  0   Implies default size, 40bits (for backward compatibility)
+  N   Implies N bits, where N is a positive integer such that,
        32 <= N <= Host_IPA_Limit
+ ==   =========================================================
  
  Host_IPA_Limit is the maximum possible value for IPA_Bits on the host and
  is dependent on the CPU capability and the kernel configuration. The limit can
@@ -179,21 +189,28 @@ host physical address translations).
  
  
  4.3 KVM_GET_MSR_INDEX_LIST, KVM_GET_MSR_FEATURE_INDEX_LIST
+----------------------------------------------------------
+
+:Capability: basic, KVM_CAP_GET_MSR_FEATURES for KVM_GET_MSR_FEATURE_INDEX_LIST
+:Architectures: x86
+:Type: system ioctl
+:Parameters: struct kvm_msr_list (in/out)
+:Returns: 0 on success; -1 on error
  
-Capability: basic, KVM_CAP_GET_MSR_FEATURES for KVM_GET_MSR_FEATURE_INDEX_LIST
-Architectures: x86
-Type: system ioctl
-Parameters: struct kvm_msr_list (in/out)
-Returns: 0 on success; -1 on error
  Errors:
-  EFAULT:    the msr index list cannot be read from or written to
-  E2BIG:     the msr index list is to be to fit in the array specified by
+
+  ======     ============================================================
+  EFAULT     the msr index list cannot be read from or written to
+  E2BIG      the msr index list is too big to fit in the array specified by
               the user.
+  ======     ============================================================
  
-struct kvm_msr_list {
+::
+
+  struct kvm_msr_list {
         __u32 nmsrs; /* number of msrs in entries */
         __u32 indices[0];
-};
+  };
  
  The user fills in the size of the indices array in nmsrs, and in return
  kvm adjusts nmsrs to reflect the actual number of msrs and fills in the
@@ -214,12 +231,13 @@ otherwise.
  
  
  4.4 KVM_CHECK_EXTENSION
+-----------------------
  
-Capability: basic, KVM_CAP_CHECK_EXTENSION_VM for vm ioctl
-Architectures: all
-Type: system ioctl, vm ioctl
-Parameters: extension identifier (KVM_CAP_*)
-Returns: 0 if unsupported; 1 (or some other positive integer) if supported
+:Capability: basic, KVM_CAP_CHECK_EXTENSION_VM for vm ioctl
+:Architectures: all
+:Type: system ioctl, vm ioctl
+:Parameters: extension identifier (KVM_CAP_*)
+:Returns: 0 if unsupported; 1 (or some other positive integer) if supported
  
  The API allows the application to query about extensions to the core
  kvm API.  Userspace passes an extension identifier (an integer) and
@@ -232,12 +250,13 @@ It is thus encouraged to use the vm ioctl to query for capabilities (available
  with KVM_CAP_CHECK_EXTENSION_VM on the vm fd)
  
  4.5 KVM_GET_VCPU_MMAP_SIZE
+--------------------------
  
-Capability: basic
-Architectures: all
-Type: system ioctl
-Parameters: none
-Returns: size of vcpu mmap area, in bytes
+:Capability: basic
+:Architectures: all
+:Type: system ioctl
+:Parameters: none
+:Returns: size of vcpu mmap area, in bytes
  
  The KVM_RUN ioctl (cf.) communicates with userspace via a shared
  memory region.  This ioctl returns the size of that region.  See the
@@ -245,23 +264,25 @@ KVM_RUN documentation for details.
  
  
  4.6 KVM_SET_MEMORY_REGION
+-------------------------
  
-Capability: basic
-Architectures: all
-Type: vm ioctl
-Parameters: struct kvm_memory_region (in)
-Returns: 0 on success, -1 on error
+:Capability: basic
+:Architectures: all
+:Type: vm ioctl
+:Parameters: struct kvm_memory_region (in)
+:Returns: 0 on success, -1 on error
  
  This ioctl is obsolete and has been removed.
  
  
  4.7 KVM_CREATE_VCPU
+-------------------
  
-Capability: basic
-Architectures: all
-Type: vm ioctl
-Parameters: vcpu id (apic id on x86)
-Returns: vcpu fd on success, -1 on error
+:Capability: basic
+:Architectures: all
+:Type: vm ioctl
+:Parameters: vcpu id (apic id on x86)
+:Returns: vcpu fd on success, -1 on error
  
  This API adds a vcpu to a virtual machine. No more than max_vcpus may be added.
  The vcpu id is an integer in the range [0, max_vcpu_id).
@@ -302,22 +323,25 @@ cpu's hardware control block.
  
  
  4.8 KVM_GET_DIRTY_LOG (vm ioctl)
+--------------------------------
  
-Capability: basic
-Architectures: all
-Type: vm ioctl
-Parameters: struct kvm_dirty_log (in/out)
-Returns: 0 on success, -1 on error
+:Capability: basic
+:Architectures: all
+:Type: vm ioctl
+:Parameters: struct kvm_dirty_log (in/out)
+:Returns: 0 on success, -1 on error
  
-/* for KVM_GET_DIRTY_LOG */
-struct kvm_dirty_log {
+::
+
+  /* for KVM_GET_DIRTY_LOG */
+  struct kvm_dirty_log {
         __u32 slot;
         __u32 padding;
         union {
                 void __user *dirty_bitmap; /* one bit per page */
                 __u64 padding;
         };
-};
+  };
  
  Given a memory slot, return a bitmap containing any pages dirtied
  since the last call to this ioctl.  Bit 0 is the first page in the
@@ -334,25 +358,31 @@ KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 is enabled.  For more information,
  see the description of the capability.
  
  4.9 KVM_SET_MEMORY_ALIAS
+------------------------
  
-Capability: basic
-Architectures: x86
-Type: vm ioctl
-Parameters: struct kvm_memory_alias (in)
-Returns: 0 (success), -1 (error)
+:Capability: basic
+:Architectures: x86
+:Type: vm ioctl
+:Parameters: struct kvm_memory_alias (in)
+:Returns: 0 (success), -1 (error)
  
  This ioctl is obsolete and has been removed.
  
  
  4.10 KVM_RUN
+------------
+
+:Capability: basic
+:Architectures: all
+:Type: vcpu ioctl
+:Parameters: none
+:Returns: 0 on success, -1 on error
  
-Capability: basic
-Architectures: all
-Type: vcpu ioctl
-Parameters: none
-Returns: 0 on success, -1 on error
  Errors:
-  EINTR:     an unmasked signal is pending
+
+  =====      =============================
+  EINTR      an unmasked signal is pending
+  =====      =============================
  
  This ioctl is used to run a guest virtual cpu.  While there are no
  explicit parameters, there is an implicit parameter block that can be
@@ -362,42 +392,46 @@ kvm_run' (see below).
  
  
  4.11 KVM_GET_REGS
+-----------------
  
-Capability: basic
-Architectures: all except ARM, arm64
-Type: vcpu ioctl
-Parameters: struct kvm_regs (out)
-Returns: 0 on success, -1 on error
+:Capability: basic
+:Architectures: all except ARM, arm64
+:Type: vcpu ioctl
+:Parameters: struct kvm_regs (out)
+:Returns: 0 on success, -1 on error
  
  Reads the general purpose registers from the vcpu.
  
-/* x86 */
-struct kvm_regs {
+::
+
+  /* x86 */
+  struct kvm_regs {
         /* out (KVM_GET_REGS) / in (KVM_SET_REGS) */
         __u64 rax, rbx, rcx, rdx;
         __u64 rsi, rdi, rsp, rbp;
         __u64 r8,  r9,  r10, r11;
         __u64 r12, r13, r14, r15;
         __u64 rip, rflags;
-};
+  };
  
-/* mips */
-struct kvm_regs {
+  /* mips */
+  struct kvm_regs {
         /* out (KVM_GET_REGS) / in (KVM_SET_REGS) */
         __u64 gpr[32];
         __u64 hi;
         __u64 lo;
         __u64 pc;
-};
+  };
  
  
  4.12 KVM_SET_REGS
+-----------------
  
-Capability: basic
-Architectures: all except ARM, arm64
-Type: vcpu ioctl
-Parameters: struct kvm_regs (in)
-Returns: 0 on success, -1 on error
+:Capability: basic
+:Architectures: all except ARM, arm64
+:Type: vcpu ioctl
+:Parameters: struct kvm_regs (in)
+:Returns: 0 on success, -1 on error
  
  Writes the general purpose registers into the vcpu.
  
@@ -405,17 +439,20 @@ See KVM_GET_REGS for the data structure.
  
  
  4.13 KVM_GET_SREGS
+------------------
  
-Capability: basic
-Architectures: x86, ppc
-Type: vcpu ioctl
-Parameters: struct kvm_sregs (out)
-Returns: 0 on success, -1 on error
+:Capability: basic
+:Architectures: x86, ppc
+:Type: vcpu ioctl
+:Parameters: struct kvm_sregs (out)
+:Returns: 0 on success, -1 on error
  
  Reads special registers from the vcpu.
  
-/* x86 */
-struct kvm_sregs {
+::
+
+  /* x86 */
+  struct kvm_sregs {
         struct kvm_segment cs, ds, es, fs, gs, ss;
         struct kvm_segment tr, ldt;
         struct kvm_dtable gdt, idt;
@@ -423,9 +460,9 @@ struct kvm_sregs {
         __u64 efer;
         __u64 apic_base;
         __u64 interrupt_bitmap[(KVM_NR_INTERRUPTS + 63) / 64];
-};
+  };
  
-/* ppc -- see arch/powerpc/include/uapi/asm/kvm.h */
+  /* ppc -- see arch/powerpc/include/uapi/asm/kvm.h */
  
  interrupt_bitmap is a bitmap of pending external interrupts.  At most
  one bit may be set.  This interrupt has been acknowledged by the APIC
@@ -433,29 +470,33 @@ but not yet injected into the cpu core.
  
  
  4.14 KVM_SET_SREGS
+------------------
  
-Capability: basic
-Architectures: x86, ppc
-Type: vcpu ioctl
-Parameters: struct kvm_sregs (in)
-Returns: 0 on success, -1 on error
+:Capability: basic
+:Architectures: x86, ppc
+:Type: vcpu ioctl
+:Parameters: struct kvm_sregs (in)
+:Returns: 0 on success, -1 on error
  
  Writes special registers into the vcpu.  See KVM_GET_SREGS for the
  data structures.
  
  
  4.15 KVM_TRANSLATE
+------------------
  
-Capability: basic
-Architectures: x86
-Type: vcpu ioctl
-Parameters: struct kvm_translation (in/out)
-Returns: 0 on success, -1 on error
+:Capability: basic
+:Architectures: x86
+:Type: vcpu ioctl
+:Parameters: struct kvm_translation (in/out)
+:Returns: 0 on success, -1 on error
  
  Translates a virtual address according to the vcpu's current address
  translation mode.
  
-struct kvm_translation {
+::
+
+  struct kvm_translation {
         /* in */
         __u64 linear_address;
  
@@ -465,59 +506,68 @@ struct kvm_translation {
         __u8  writeable;
         __u8  usermode;
         __u8  pad[5];
-};
+  };
  
  
  4.16 KVM_INTERRUPT
+------------------
  
-Capability: basic
-Architectures: x86, ppc, mips
-Type: vcpu ioctl
-Parameters: struct kvm_interrupt (in)
-Returns: 0 on success, negative on failure.
+:Capability: basic
+:Architectures: x86, ppc, mips
+:Type: vcpu ioctl
+:Parameters: struct kvm_interrupt (in)
+:Returns: 0 on success, negative on failure.
  
  Queues a hardware interrupt vector to be injected.
  
-/* for KVM_INTERRUPT */
-struct kvm_interrupt {
+::
+
+  /* for KVM_INTERRUPT */
+  struct kvm_interrupt {
         /* in */
         __u32 irq;
-};
+  };
  
  X86:
+^^^^
+
+:Returns:
  
-Returns: 0 on success,
-        -EEXIST if an interrupt is already enqueued
-        -EINVAL the the irq number is invalid
-        -ENXIO if the PIC is in the kernel
-        -EFAULT if the pointer is invalid
+       ========= ===================================
+         0       on success,
+        -EEXIST  if an interrupt is already enqueued
+        -EINVAL  if the irq number is invalid
+        -ENXIO   if the PIC is in the kernel
+        -EFAULT  if the pointer is invalid
+       ========= ===================================
  
  Note 'irq' is an interrupt vector, not an interrupt pin or line. This
  ioctl is useful if the in-kernel PIC is not used.
  
  PPC:
+^^^^
  
  Queues an external interrupt to be injected. This ioctl is overleaded
  with 3 different irq values:
  
  a) KVM_INTERRUPT_SET
  
-  This injects an edge type external interrupt into the guest once it's ready
-  to receive interrupts. When injected, the interrupt is done.
+   This injects an edge type external interrupt into the guest once it's ready
+   to receive interrupts. When injected, the interrupt is done.
  
  b) KVM_INTERRUPT_UNSET
  
-  This unsets any pending interrupt.
+   This unsets any pending interrupt.
  
-  Only available with KVM_CAP_PPC_UNSET_IRQ.
+   Only available with KVM_CAP_PPC_UNSET_IRQ.
  
  c) KVM_INTERRUPT_SET_LEVEL
  
-  This injects a level type external interrupt into the guest context. The
-  interrupt stays pending until a specific ioctl with KVM_INTERRUPT_UNSET
-  is triggered.
+   This injects a level type external interrupt into the guest context. The
+   interrupt stays pending until a specific ioctl with KVM_INTERRUPT_UNSET
+   is triggered.
  
-  Only available with KVM_CAP_PPC_IRQ_LEVEL.
+   Only available with KVM_CAP_PPC_IRQ_LEVEL.
  
  Note that any value for 'irq' other than the ones stated above is invalid
  and incurs unexpected behavior.
@@ -525,6 +575,7 @@ and incurs unexpected behavior.
  This is an asynchronous vcpu ioctl and can be invoked from any thread.
  
  MIPS:
+^^^^^
  
  Queues an external interrupt to be injected into the virtual CPU. A negative
  interrupt number dequeues the interrupt.
@@ -533,24 +584,26 @@ This is an asynchronous vcpu ioctl and can be invoked from any thread.
  
  
  4.17 KVM_DEBUG_GUEST
+--------------------
  
-Capability: basic
-Architectures: none
-Type: vcpu ioctl
-Parameters: none)
-Returns: -1 on error
+:Capability: basic
+:Architectures: none
+:Type: vcpu ioctl
+:Parameters: none
+:Returns: -1 on error
  
  Support for this has been removed.  Use KVM_SET_GUEST_DEBUG instead.
  
  
  4.18 KVM_GET_MSRS
+-----------------
  
-Capability: basic (vcpu), KVM_CAP_GET_MSR_FEATURES (system)
-Architectures: x86
-Type: system ioctl, vcpu ioctl
-Parameters: struct kvm_msrs (in/out)
-Returns: number of msrs successfully returned;
-        -1 on error
+:Capability: basic (vcpu), KVM_CAP_GET_MSR_FEATURES (system)
+:Architectures: x86
+:Type: system ioctl, vcpu ioctl
+:Parameters: struct kvm_msrs (in/out)
+:Returns: number of msrs successfully returned;
+          -1 on error
  
  When used as a system ioctl:
  Reads the values of MSR-based features that are available for the VM.  This
@@ -562,18 +615,20 @@ When used as a vcpu ioctl:
  Reads model-specific registers from the vcpu.  Supported msr indices can
  be obtained using KVM_GET_MSR_INDEX_LIST in a system ioctl.
  
-struct kvm_msrs {
+::
+
+  struct kvm_msrs {
         __u32 nmsrs; /* number of msrs in entries */
         __u32 pad;
  
         struct kvm_msr_entry entries[0];
-};
+  };
  
-struct kvm_msr_entry {
+  struct kvm_msr_entry {
         __u32 index;
         __u32 reserved;
         __u64 data;
-};
+  };
  
  Application code should set the 'nmsrs' member (which indicates the
  size of the entries array) and the 'index' member of each array entry.
@@ -581,12 +636,13 @@ kvm will fill in the 'data' member.
  
  
  4.19 KVM_SET_MSRS
+-----------------
  
-Capability: basic
-Architectures: x86
-Type: vcpu ioctl
-Parameters: struct kvm_msrs (in)
-Returns: number of msrs successfully set (see below), -1 on error
+:Capability: basic
+:Architectures: x86
+:Type: vcpu ioctl
+:Parameters: struct kvm_msrs (in)
+:Returns: number of msrs successfully set (see below), -1 on error
  
  Writes model-specific registers to the vcpu.  See KVM_GET_MSRS for the
  data structures.
@@ -602,41 +658,44 @@ MSRs that have been set successfully.
  
  
  4.20 KVM_SET_CPUID
+------------------
  
-Capability: basic
-Architectures: x86
-Type: vcpu ioctl
-Parameters: struct kvm_cpuid (in)
-Returns: 0 on success, -1 on error
+:Capability: basic
+:Architectures: x86
+:Type: vcpu ioctl
+:Parameters: struct kvm_cpuid (in)
+:Returns: 0 on success, -1 on error
  
  Defines the vcpu responses to the cpuid instruction.  Applications
  should use the KVM_SET_CPUID2 ioctl if available.
  
+::
  
-struct kvm_cpuid_entry {
+  struct kvm_cpuid_entry {
         __u32 function;
         __u32 eax;
         __u32 ebx;
         __u32 ecx;
         __u32 edx;
         __u32 padding;
-};
+  };
  
-/* for KVM_SET_CPUID */
-struct kvm_cpuid {
+  /* for KVM_SET_CPUID */
+  struct kvm_cpuid {
         __u32 nent;
         __u32 padding;
         struct kvm_cpuid_entry entries[0];
-};
+  };
  
  
  4.21 KVM_SET_SIGNAL_MASK
+------------------------
  
-Capability: basic
-Architectures: all
-Type: vcpu ioctl
-Parameters: struct kvm_signal_mask (in)
-Returns: 0 on success, -1 on error
+:Capability: basic
+:Architectures: all
+:Type: vcpu ioctl
+:Parameters: struct kvm_signal_mask (in)
+:Returns: 0 on success, -1 on error
  
  Defines which signals are blocked during execution of KVM_RUN.  This
  signal mask temporarily overrides the threads signal mask.  Any
@@ -646,25 +705,30 @@ their traditional behaviour) will cause KVM_RUN to return with -EINTR.
  Note the signal will only be delivered if not blocked by the original
  signal mask.
  
-/* for KVM_SET_SIGNAL_MASK */
-struct kvm_signal_mask {
+::
+
+  /* for KVM_SET_SIGNAL_MASK */
+  struct kvm_signal_mask {
         __u32 len;
         __u8  sigset[0];
-};
+  };
  
  
  4.22 KVM_GET_FPU
+----------------
  
-Capability: basic
-Architectures: x86
-Type: vcpu ioctl
-Parameters: struct kvm_fpu (out)
-Returns: 0 on success, -1 on error
+:Capability: basic
+:Architectures: x86
+:Type: vcpu ioctl
+:Parameters: struct kvm_fpu (out)
+:Returns: 0 on success, -1 on error
  
  Reads the floating point state from the vcpu.
  
-/* for KVM_GET_FPU and KVM_SET_FPU */
-struct kvm_fpu {
+::
+
+  /* for KVM_GET_FPU and KVM_SET_FPU */
+  struct kvm_fpu {
         __u8  fpr[8][16];
         __u16 fcw;
         __u16 fsw;
@@ -676,21 +740,24 @@ struct kvm_fpu {
         __u8  xmm[16][16];
         __u32 mxcsr;
         __u32 pad2;
-};
+  };
  
  
  4.23 KVM_SET_FPU
+----------------
  
-Capability: basic
-Architectures: x86
-Type: vcpu ioctl
-Parameters: struct kvm_fpu (in)
-Returns: 0 on success, -1 on error
+:Capability: basic
+:Architectures: x86
+:Type: vcpu ioctl
+:Parameters: struct kvm_fpu (in)
+:Returns: 0 on success, -1 on error
  
  Writes the floating point state to the vcpu.
  
-/* for KVM_GET_FPU and KVM_SET_FPU */
-struct kvm_fpu {
+::
+
+  /* for KVM_GET_FPU and KVM_SET_FPU */
+  struct kvm_fpu {
         __u8  fpr[8][16];
         __u16 fcw;
         __u16 fsw;
@@ -702,16 +769,17 @@ struct kvm_fpu {
         __u8  xmm[16][16];
         __u32 mxcsr;
         __u32 pad2;
-};
+  };
  
  
  4.24 KVM_CREATE_IRQCHIP
+-----------------------
  
-Capability: KVM_CAP_IRQCHIP, KVM_CAP_S390_IRQCHIP (s390)
-Architectures: x86, ARM, arm64, s390
-Type: vm ioctl
-Parameters: none
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_IRQCHIP, KVM_CAP_S390_IRQCHIP (s390)
+:Architectures: x86, ARM, arm64, s390
+:Type: vm ioctl
+:Parameters: none
+:Returns: 0 on success, -1 on error
  
  Creates an interrupt controller model in the kernel.
  On x86, creates a virtual ioapic, a virtual PIC (two PICs, nested), and sets up
@@ -727,12 +795,13 @@ before KVM_CREATE_IRQCHIP can be used.
  
  
  4.25 KVM_IRQ_LINE
+-----------------
  
-Capability: KVM_CAP_IRQCHIP
-Architectures: x86, arm, arm64
-Type: vm ioctl
-Parameters: struct kvm_irq_level
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_IRQCHIP
+:Architectures: x86, arm, arm64
+:Type: vm ioctl
+:Parameters: struct kvm_irq_level
+:Returns: 0 on success, -1 on error
  
  Sets the level of a GSI input to the interrupt controller model in the kernel.
  On some architectures it is required that an interrupt controller model has
@@ -756,16 +825,20 @@ of course).
  ARM/arm64 can signal an interrupt either at the CPU level, or at the
  in-kernel irqchip (GIC), and for in-kernel irqchip can tell the GIC to
  use PPIs designated for specific cpus.  The irq field is interpreted
-like this:
+like this::
  
    bits:  |  31 ... 28  | 27 ... 24 | 23  ... 16 | 15 ... 0 |
    field: | vcpu2_index | irq_type  | vcpu_index |  irq_id  |
  
  The irq_type field has the following values:
-- irq_type[0]: out-of-kernel GIC: irq_id 0 is IRQ, irq_id 1 is FIQ
-- irq_type[1]: in-kernel GIC: SPI, irq_id between 32 and 1019 (incl.)
+
+- irq_type[0]:
+              out-of-kernel GIC: irq_id 0 is IRQ, irq_id 1 is FIQ
+- irq_type[1]:
+              in-kernel GIC: SPI, irq_id between 32 and 1019 (incl.)
                 (the vcpu_index field is ignored)
-- irq_type[2]: in-kernel GIC: PPI, irq_id between 16 and 31 (incl.)
+- irq_type[2]:
+              in-kernel GIC: PPI, irq_id between 16 and 31 (incl.)
  
  (The irq_id field thus corresponds nicely to the IRQ ID in the ARM GIC specs)
  
@@ -779,27 +852,32 @@ Note that on arm/arm64, the KVM_CAP_IRQCHIP capability only conditions
  injection of interrupts for the in-kernel irqchip. KVM_IRQ_LINE can always
  be used for a userspace interrupt controller.
  
-struct kvm_irq_level {
+::
+
+  struct kvm_irq_level {
         union {
                 __u32 irq;     /* GSI */
                 __s32 status;  /* not used for KVM_IRQ_LEVEL */
         };
         __u32 level;           /* 0 or 1 */
-};
+  };
  
  
  4.26 KVM_GET_IRQCHIP
+--------------------
  
-Capability: KVM_CAP_IRQCHIP
-Architectures: x86
-Type: vm ioctl
-Parameters: struct kvm_irqchip (in/out)
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_IRQCHIP
+:Architectures: x86
+:Type: vm ioctl
+:Parameters: struct kvm_irqchip (in/out)
+:Returns: 0 on success, -1 on error
  
  Reads the state of a kernel interrupt controller created with
  KVM_CREATE_IRQCHIP into a buffer provided by the caller.
  
-struct kvm_irqchip {
+::
+
+  struct kvm_irqchip {
         __u32 chip_id;  /* 0 = PIC1, 1 = PIC2, 2 = IOAPIC */
         __u32 pad;
          union {
@@ -807,21 +885,24 @@ struct kvm_irqchip {
                 struct kvm_pic_state pic;
                 struct kvm_ioapic_state ioapic;
         } chip;
-};
+  };
  
  
  4.27 KVM_SET_IRQCHIP
+--------------------
  
-Capability: KVM_CAP_IRQCHIP
-Architectures: x86
-Type: vm ioctl
-Parameters: struct kvm_irqchip (in)
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_IRQCHIP
+:Architectures: x86
+:Type: vm ioctl
+:Parameters: struct kvm_irqchip (in)
+:Returns: 0 on success, -1 on error
  
  Sets the state of a kernel interrupt controller created with
  KVM_CREATE_IRQCHIP from a buffer provided by the caller.
  
-struct kvm_irqchip {
+::
+
+  struct kvm_irqchip {
         __u32 chip_id;  /* 0 = PIC1, 1 = PIC2, 2 = IOAPIC */
         __u32 pad;
          union {
@@ -829,16 +910,17 @@ struct kvm_irqchip {
                 struct kvm_pic_state pic;
                 struct kvm_ioapic_state ioapic;
         } chip;
-};
+  };
  
  
  4.28 KVM_XEN_HVM_CONFIG
+-----------------------
  
-Capability: KVM_CAP_XEN_HVM
-Architectures: x86
-Type: vm ioctl
-Parameters: struct kvm_xen_hvm_config (in)
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_XEN_HVM
+:Architectures: x86
+:Type: vm ioctl
+:Parameters: struct kvm_xen_hvm_config (in)
+:Returns: 0 on success, -1 on error
  
  Sets the MSR that the Xen HVM guest uses to initialize its hypercall
  page, and provides the starting address and size of the hypercall
@@ -846,7 +928,9 @@ blobs in userspace.  When the guest writes the MSR, kvm copies one
  page of a blob (32- or 64-bit, depending on the vcpu mode) to guest
  memory.
  
-struct kvm_xen_hvm_config {
+::
+
+  struct kvm_xen_hvm_config {
         __u32 flags;
         __u32 msr;
         __u64 blob_addr_32;
@@ -854,16 +938,17 @@ struct kvm_xen_hvm_config {
         __u8 blob_size_32;
         __u8 blob_size_64;
         __u8 pad2[30];
-};
+  };
  
  
  4.29 KVM_GET_CLOCK
+------------------
  
-Capability: KVM_CAP_ADJUST_CLOCK
-Architectures: x86
-Type: vm ioctl
-Parameters: struct kvm_clock_data (out)
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_ADJUST_CLOCK
+:Architectures: x86
+:Type: vm ioctl
+:Parameters: struct kvm_clock_data (out)
+:Returns: 0 on success, -1 on error
  
  Gets the current timestamp of kvmclock as seen by the current guest. In
  conjunction with KVM_SET_CLOCK, it is used to ensure monotonicity on scenarios
@@ -880,47 +965,56 @@ with KVM_SET_CLOCK.  KVM will try to make all VCPUs follow this clock,
  but the exact value read by each VCPU could differ, because the host
  TSC is not stable.
  
-struct kvm_clock_data {
+::
+
+  struct kvm_clock_data {
         __u64 clock;  /* kvmclock current value */
         __u32 flags;
         __u32 pad[9];
-};
+  };
  
  
  4.30 KVM_SET_CLOCK
+------------------
  
-Capability: KVM_CAP_ADJUST_CLOCK
-Architectures: x86
-Type: vm ioctl
-Parameters: struct kvm_clock_data (in)
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_ADJUST_CLOCK
+:Architectures: x86
+:Type: vm ioctl
+:Parameters: struct kvm_clock_data (in)
+:Returns: 0 on success, -1 on error
  
  Sets the current timestamp of kvmclock to the value specified in its parameter.
  In conjunction with KVM_GET_CLOCK, it is used to ensure monotonicity on scenarios
  such as migration.
  
-struct kvm_clock_data {
+::
+
+  struct kvm_clock_data {
         __u64 clock;  /* kvmclock current value */
         __u32 flags;
         __u32 pad[9];
-};
+  };
  
  
  4.31 KVM_GET_VCPU_EVENTS
+------------------------
  
-Capability: KVM_CAP_VCPU_EVENTS
-Extended by: KVM_CAP_INTR_SHADOW
-Architectures: x86, arm, arm64
-Type: vcpu ioctl
-Parameters: struct kvm_vcpu_event (out)
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_VCPU_EVENTS
+:Extended by: KVM_CAP_INTR_SHADOW
+:Architectures: x86, arm, arm64
+:Type: vcpu ioctl
+:Parameters: struct kvm_vcpu_events (out)
+:Returns: 0 on success, -1 on error
  
  X86:
+^^^^
  
  Gets currently pending exceptions, interrupts, and NMIs as well as related
  states of the vcpu.
  
-struct kvm_vcpu_events {
+::
+
+  struct kvm_vcpu_events {
         struct {
                 __u8 injected;
                 __u8 nr;
@@ -951,7 +1045,7 @@ struct kvm_vcpu_events {
         __u8 reserved[27];
         __u8 exception_has_payload;
         __u64 exception_payload;
-};
+  };
  
  The following bits are defined in the flags field:
  
@@ -967,6 +1061,7 @@ The following bits are defined in the flags field:
    KVM_CAP_EXCEPTION_PAYLOAD is enabled.
  
  ARM/ARM64:
+^^^^^^^^^^
  
  If the guest accesses a device that is being emulated by the host kernel in
  such a way that a real device would generate a physical SError, KVM may make
@@ -1006,8 +1101,9 @@ It is not possible to read back a pending external abort (injected via
  KVM_SET_VCPU_EVENTS or otherwise) because such an exception is always delivered
  directly to the virtual CPU).
  
+::
  
-struct kvm_vcpu_events {
+  struct kvm_vcpu_events {
         struct {
                 __u8 serror_pending;
                 __u8 serror_has_esr;
@@ -1017,18 +1113,20 @@ struct kvm_vcpu_events {
                 __u64 serror_esr;
         } exception;
         __u32 reserved[12];
-};
+  };
  
  4.32 KVM_SET_VCPU_EVENTS
+------------------------
  
-Capability: KVM_CAP_VCPU_EVENTS
-Extended by: KVM_CAP_INTR_SHADOW
-Architectures: x86, arm, arm64
-Type: vcpu ioctl
-Parameters: struct kvm_vcpu_event (in)
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_VCPU_EVENTS
+:Extended by: KVM_CAP_INTR_SHADOW
+:Architectures: x86, arm, arm64
+:Type: vcpu ioctl
+:Parameters: struct kvm_vcpu_events (in)
+:Returns: 0 on success, -1 on error
  
  X86:
+^^^^
  
  Set pending exceptions, interrupts, and NMIs as well as related states of the
  vcpu.
@@ -1040,9 +1138,11 @@ from the update. These fields are nmi.pending, sipi_vector, smi.smm,
  smi.pending. Keep the corresponding bits in the flags field cleared to
  suppress overwriting the current in-kernel state. The bits are:
  
-KVM_VCPUEVENT_VALID_NMI_PENDING - transfer nmi.pending to the kernel
-KVM_VCPUEVENT_VALID_SIPI_VECTOR - transfer sipi_vector
-KVM_VCPUEVENT_VALID_SMM         - transfer the smi sub-struct.
+===============================  ==================================
+KVM_VCPUEVENT_VALID_NMI_PENDING  transfer nmi.pending to the kernel
+KVM_VCPUEVENT_VALID_SIPI_VECTOR  transfer sipi_vector
+KVM_VCPUEVENT_VALID_SMM          transfer the smi sub-struct.
+===============================  ==================================
  
  If KVM_CAP_INTR_SHADOW is available, KVM_VCPUEVENT_VALID_SHADOW can be set in
  the flags field to signal that interrupt.shadow contains a valid state and
@@ -1056,6 +1156,7 @@ exception_has_payload, exception_payload, and exception.pending fields
  contain a valid state and shall be written into the VCPU.
  
  ARM/ARM64:
+^^^^^^^^^^
  
  User space may need to inject several types of events to the guest.
  
@@ -1078,31 +1179,35 @@ See KVM_GET_VCPU_EVENTS for the data structure.
  
  
  4.33 KVM_GET_DEBUGREGS
+----------------------
  
-Capability: KVM_CAP_DEBUGREGS
-Architectures: x86
-Type: vm ioctl
-Parameters: struct kvm_debugregs (out)
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_DEBUGREGS
+:Architectures: x86
+:Type: vm ioctl
+:Parameters: struct kvm_debugregs (out)
+:Returns: 0 on success, -1 on error
  
  Reads debug registers from the vcpu.
  
-struct kvm_debugregs {
+::
+
+  struct kvm_debugregs {
         __u64 db[4];
         __u64 dr6;
         __u64 dr7;
         __u64 flags;
         __u64 reserved[9];
-};
+  };
  
  
  4.34 KVM_SET_DEBUGREGS
+----------------------
  
-Capability: KVM_CAP_DEBUGREGS
-Architectures: x86
-Type: vm ioctl
-Parameters: struct kvm_debugregs (in)
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_DEBUGREGS
+:Architectures: x86
+:Type: vm ioctl
+:Parameters: struct kvm_debugregs (in)
+:Returns: 0 on success, -1 on error
  
  Writes debug registers into the vcpu.
  
@@ -1111,24 +1216,27 @@ yet and must be cleared on entry.
  
  
  4.35 KVM_SET_USER_MEMORY_REGION
+-------------------------------
  
-Capability: KVM_CAP_USER_MEMORY
-Architectures: all
-Type: vm ioctl
-Parameters: struct kvm_userspace_memory_region (in)
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_USER_MEMORY
+:Architectures: all
+:Type: vm ioctl
+:Parameters: struct kvm_userspace_memory_region (in)
+:Returns: 0 on success, -1 on error
  
-struct kvm_userspace_memory_region {
+::
+
+  struct kvm_userspace_memory_region {
         __u32 slot;
         __u32 flags;
         __u64 guest_phys_addr;
         __u64 memory_size; /* bytes */
         __u64 userspace_addr; /* start of the userspace allocated memory */
-};
+  };
  
-/* for kvm_memory_region::flags */
-#define KVM_MEM_LOG_DIRTY_PAGES        (1UL << 0)
-#define KVM_MEM_READONLY       (1UL << 1)
+  /* for kvm_memory_region::flags */
+  #define KVM_MEM_LOG_DIRTY_PAGES      (1UL << 0)
+  #define KVM_MEM_READONLY     (1UL << 1)
  
  This ioctl allows the user to create, modify or delete a guest physical
  memory slot.  Bits 0-15 of "slot" specify the slot id and this value
@@ -1174,12 +1282,13 @@ allocation and is deprecated.
  
  
  4.36 KVM_SET_TSS_ADDR
+---------------------
  
-Capability: KVM_CAP_SET_TSS_ADDR
-Architectures: x86
-Type: vm ioctl
-Parameters: unsigned long tss_address (in)
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_SET_TSS_ADDR
+:Architectures: x86
+:Type: vm ioctl
+:Parameters: unsigned long tss_address (in)
+:Returns: 0 on success, -1 on error
  
  This ioctl defines the physical address of a three-page region in the guest
  physical address space.  The region must be within the first 4GB of the
@@ -1193,21 +1302,24 @@ documentation when it pops into existence).
  
  
  4.37 KVM_ENABLE_CAP
+-------------------
+
+:Capability: KVM_CAP_ENABLE_CAP
+:Architectures: mips, ppc, s390
+:Type: vcpu ioctl
+:Parameters: struct kvm_enable_cap (in)
+:Returns: 0 on success; -1 on error
  
-Capability: KVM_CAP_ENABLE_CAP
-Architectures: mips, ppc, s390
-Type: vcpu ioctl
-Parameters: struct kvm_enable_cap (in)
-Returns: 0 on success; -1 on error
+:Capability: KVM_CAP_ENABLE_CAP_VM
+:Architectures: all
+:Type: vm ioctl
+:Parameters: struct kvm_enable_cap (in)
+:Returns: 0 on success; -1 on error
  
-Capability: KVM_CAP_ENABLE_CAP_VM
-Architectures: all
-Type: vcpu ioctl
-Parameters: struct kvm_enable_cap (in)
-Returns: 0 on success; -1 on error
+.. note::
  
-+Not all extensions are enabled by default. Using this ioctl the application
-can enable an extension, making it available to the guest.
+   Not all extensions are enabled by default. Using this ioctl the application
+   can enable an extension, making it available to the guest.
  
  On systems that do not support this ioctl, it always fails. On systems that
  do support it, it only works for extensions that are supported for enablement.
@@ -1215,76 +1327,91 @@ do support it, it only works for extensions that are supported for enablement.
  To check if a capability can be enabled, the KVM_CHECK_EXTENSION ioctl should
  be used.
  
-struct kvm_enable_cap {
+::
+
+  struct kvm_enable_cap {
         /* in */
         __u32 cap;
  
  The capability that is supposed to get enabled.
  
+::
+
         __u32 flags;
  
  A bitfield indicating future enhancements. Has to be 0 for now.
  
+::
+
         __u64 args[4];
  
  Arguments for enabling a feature. If a feature needs initial values to
  function properly, this is the place to put them.
  
+::
+
         __u8  pad[64];
-};
+  };
  
  The vcpu ioctl should be used for vcpu-specific capabilities, the vm ioctl
  for vm-wide capabilities.
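
As a rough illustration of the structure described above, the sketch below zeroes a struct kvm_enable_cap, fills in the capability number and one argument, and passes it to KVM_ENABLE_CAP. The helper names make_enable_cap() and enable_cap() are hypothetical and not part of the KVM API; the capability number and argument values are left to the caller.

```c
#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Hypothetical helper: build a request for capability `cap` with one
 * initial argument.  memset() zeroes flags (which has to be 0 for now),
 * the unused args[] slots and pad[]. */
static struct kvm_enable_cap make_enable_cap(__u32 cap, __u64 arg0)
{
	struct kvm_enable_cap req;

	memset(&req, 0, sizeof(req));
	req.cap = cap;
	req.args[0] = arg0;
	return req;
}

/* Hypothetical helper: issue KVM_ENABLE_CAP on `fd`, which is a vcpu fd
 * for vcpu-specific capabilities or a vm fd for vm-wide ones.
 * Returns 0 on success, -1 with errno set on error. */
static int enable_cap(int fd, __u32 cap, __u64 arg0)
{
	struct kvm_enable_cap req = make_enable_cap(cap, arg0);

	return ioctl(fd, KVM_ENABLE_CAP, &req);
}
```

A caller would normally check with KVM_CHECK_EXTENSION that the capability can be enabled before calling enable_cap().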
  
  4.38 KVM_GET_MP_STATE
+---------------------
+
+:Capability: KVM_CAP_MP_STATE
+:Architectures: x86, s390, arm, arm64
+:Type: vcpu ioctl
+:Parameters: struct kvm_mp_state (out)
+:Returns: 0 on success; -1 on error
  
-Capability: KVM_CAP_MP_STATE
-Architectures: x86, s390, arm, arm64
-Type: vcpu ioctl
-Parameters: struct kvm_mp_state (out)
-Returns: 0 on success; -1 on error
+::
  
-struct kvm_mp_state {
+  struct kvm_mp_state {
         __u32 mp_state;
-};
+  };
  
  Returns the vcpu's current "multiprocessing state" (though also valid on
  uniprocessor guests).
  
  Possible values are:
  
- - KVM_MP_STATE_RUNNABLE:        the vcpu is currently running [x86,arm/arm64]
- - KVM_MP_STATE_UNINITIALIZED:   the vcpu is an application processor (AP)
+   ==========================    ===============================================
+   KVM_MP_STATE_RUNNABLE         the vcpu is currently running [x86,arm/arm64]
+   KVM_MP_STATE_UNINITIALIZED    the vcpu is an application processor (AP)
                                   which has not yet received an INIT signal [x86]
- - KVM_MP_STATE_INIT_RECEIVED:   the vcpu has received an INIT signal, and is
+   KVM_MP_STATE_INIT_RECEIVED    the vcpu has received an INIT signal, and is
                                   now ready for a SIPI [x86]
- - KVM_MP_STATE_HALTED:          the vcpu has executed a HLT instruction and
+   KVM_MP_STATE_HALTED           the vcpu has executed a HLT instruction and
                                   is waiting for an interrupt [x86]
- - KVM_MP_STATE_SIPI_RECEIVED:   the vcpu has just received a SIPI (vector
+   KVM_MP_STATE_SIPI_RECEIVED    the vcpu has just received a SIPI (vector
                                   accessible via KVM_GET_VCPU_EVENTS) [x86]
- - KVM_MP_STATE_STOPPED:         the vcpu is stopped [s390,arm/arm64]
- - KVM_MP_STATE_CHECK_STOP:      the vcpu is in a special error state [s390]
- - KVM_MP_STATE_OPERATING:       the vcpu is operating (running or halted)
+   KVM_MP_STATE_STOPPED          the vcpu is stopped [s390,arm/arm64]
+   KVM_MP_STATE_CHECK_STOP       the vcpu is in a special error state [s390]
+   KVM_MP_STATE_OPERATING        the vcpu is operating (running or halted)
                                   [s390]
- - KVM_MP_STATE_LOAD:            the vcpu is in a special load/startup state
+   KVM_MP_STATE_LOAD             the vcpu is in a special load/startup state
                                   [s390]
+   ==========================    ===============================================
  
  On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an
  in-kernel irqchip, the multiprocessing state must be maintained by userspace on
  these architectures.
  
  For arm/arm64:
+^^^^^^^^^^^^^^
  
  The only states that are valid are KVM_MP_STATE_STOPPED and
  KVM_MP_STATE_RUNNABLE which reflect if the vcpu is paused or not.
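
For illustration only, the following sketch reads the state with KVM_GET_MP_STATE and maps it to a printable name. The helpers mp_state_name() and query_mp_state() are hypothetical; the constants come from <linux/kvm.h>, and newer headers may define further states that end up in the default branch here.

```c
#include <stddef.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Hypothetical helper: translate an mp_state value into a short name. */
static const char *mp_state_name(__u32 mp_state)
{
	switch (mp_state) {
	case KVM_MP_STATE_RUNNABLE:      return "runnable";
	case KVM_MP_STATE_UNINITIALIZED: return "uninitialized";
	case KVM_MP_STATE_INIT_RECEIVED: return "init-received";
	case KVM_MP_STATE_HALTED:        return "halted";
	case KVM_MP_STATE_SIPI_RECEIVED: return "sipi-received";
	case KVM_MP_STATE_STOPPED:       return "stopped";
	case KVM_MP_STATE_CHECK_STOP:    return "check-stop";
	case KVM_MP_STATE_OPERATING:     return "operating";
	case KVM_MP_STATE_LOAD:          return "load";
	default:                         return "unknown";
	}
}

/* Hypothetical helper: query the state of the vcpu behind `vcpu_fd`.
 * Returns NULL if the ioctl fails (errno is set by ioctl). */
static const char *query_mp_state(int vcpu_fd)
{
	struct kvm_mp_state st;

	if (ioctl(vcpu_fd, KVM_GET_MP_STATE, &st) < 0)
		return NULL;
	return mp_state_name(st.mp_state);
}
```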
  
  4.39 KVM_SET_MP_STATE
+---------------------
  
-Capability: KVM_CAP_MP_STATE
-Architectures: x86, s390, arm, arm64
-Type: vcpu ioctl
-Parameters: struct kvm_mp_state (in)
-Returns: 0 on success; -1 on error
+:Capability: KVM_CAP_MP_STATE
+:Architectures: x86, s390, arm, arm64
+:Type: vcpu ioctl
+:Parameters: struct kvm_mp_state (in)
+:Returns: 0 on success; -1 on error
  
  Sets the vcpu's current "multiprocessing state"; see KVM_GET_MP_STATE for
  arguments.
@@ -1294,17 +1421,19 @@ in-kernel irqchip, the multiprocessing state must be maintained by userspace on
  these architectures.
  
  For arm/arm64:
+^^^^^^^^^^^^^^
  
  The only states that are valid are KVM_MP_STATE_STOPPED and
  KVM_MP_STATE_RUNNABLE which reflect if the vcpu should be paused or not.
  
  4.40 KVM_SET_IDENTITY_MAP_ADDR
+------------------------------
  
-Capability: KVM_CAP_SET_IDENTITY_MAP_ADDR
-Architectures: x86
-Type: vm ioctl
-Parameters: unsigned long identity (in)
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_SET_IDENTITY_MAP_ADDR
+:Architectures: x86
+:Type: vm ioctl
+:Parameters: unsigned long identity (in)
+:Returns: 0 on success, -1 on error
  
  This ioctl defines the physical address of a one-page region in the guest
  physical address space.  The region must be within the first 4GB of the
@@ -1322,12 +1451,13 @@ documentation when it pops into existence).
  Fails if any VCPU has already been created.
  
  4.41 KVM_SET_BOOT_CPU_ID
+------------------------
  
-Capability: KVM_CAP_SET_BOOT_CPU_ID
-Architectures: x86
-Type: vm ioctl
-Parameters: unsigned long vcpu_id
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_SET_BOOT_CPU_ID
+:Architectures: x86
+:Type: vm ioctl
+:Parameters: unsigned long vcpu_id
+:Returns: 0 on success, -1 on error
  
  Define which vcpu is the Bootstrap Processor (BSP).  Values are the same
  as the vcpu id in KVM_CREATE_VCPU.  If this ioctl is not called, the default
@@ -1335,102 +1465,119 @@ is vcpu 0.
  
  
  4.42 KVM_GET_XSAVE
+------------------
  
-Capability: KVM_CAP_XSAVE
-Architectures: x86
-Type: vcpu ioctl
-Parameters: struct kvm_xsave (out)
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_XSAVE
+:Architectures: x86
+:Type: vcpu ioctl
+:Parameters: struct kvm_xsave (out)
+:Returns: 0 on success, -1 on error
  
-struct kvm_xsave {
+
+::
+
+  struct kvm_xsave {
         __u32 region[1024];
-};
+  };
  
  This ioctl copies the current vcpu's xsave struct to userspace.
  
  
  4.43 KVM_SET_XSAVE
+------------------
+
+:Capability: KVM_CAP_XSAVE
+:Architectures: x86
+:Type: vcpu ioctl
+:Parameters: struct kvm_xsave (in)
+:Returns: 0 on success, -1 on error
  
-Capability: KVM_CAP_XSAVE
-Architectures: x86
-Type: vcpu ioctl
-Parameters: struct kvm_xsave (in)
-Returns: 0 on success, -1 on error
+::
  
-struct kvm_xsave {
+
+  struct kvm_xsave {
         __u32 region[1024];
-};
+  };
  
  This ioctl copies userspace's xsave struct to the kernel.
  
  
  4.44 KVM_GET_XCRS
+-----------------
+
+:Capability: KVM_CAP_XCRS
+:Architectures: x86
+:Type: vcpu ioctl
+:Parameters: struct kvm_xcrs (out)
+:Returns: 0 on success, -1 on error
  
-Capability: KVM_CAP_XCRS
-Architectures: x86
-Type: vcpu ioctl
-Parameters: struct kvm_xcrs (out)
-Returns: 0 on success, -1 on error
+::
  
-struct kvm_xcr {
+  struct kvm_xcr {
         __u32 xcr;
         __u32 reserved;
         __u64 value;
-};
+  };
  
-struct kvm_xcrs {
+  struct kvm_xcrs {
         __u32 nr_xcrs;
         __u32 flags;
         struct kvm_xcr xcrs[KVM_MAX_XCRS];
         __u64 padding[16];
-};
+  };
  
  This ioctl copies the current vcpu's xcrs to userspace.
  
  
  4.45 KVM_SET_XCRS
+-----------------
+
+:Capability: KVM_CAP_XCRS
+:Architectures: x86
+:Type: vcpu ioctl
+:Parameters: struct kvm_xcrs (in)
+:Returns: 0 on success, -1 on error
  
-Capability: KVM_CAP_XCRS
-Architectures: x86
-Type: vcpu ioctl
-Parameters: struct kvm_xcrs (in)
-Returns: 0 on success, -1 on error
+::
  
-struct kvm_xcr {
+  struct kvm_xcr {
         __u32 xcr;
         __u32 reserved;
         __u64 value;
-};
+  };
  
-struct kvm_xcrs {
+  struct kvm_xcrs {
         __u32 nr_xcrs;
         __u32 flags;
         struct kvm_xcr xcrs[KVM_MAX_XCRS];
         __u64 padding[16];
-};
+  };
  
  This ioctl sets the vcpu's xcrs to the values userspace specified.
  
  
  4.46 KVM_GET_SUPPORTED_CPUID
+----------------------------
  
-Capability: KVM_CAP_EXT_CPUID
-Architectures: x86
-Type: system ioctl
-Parameters: struct kvm_cpuid2 (in/out)
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_EXT_CPUID
+:Architectures: x86
+:Type: system ioctl
+:Parameters: struct kvm_cpuid2 (in/out)
+:Returns: 0 on success, -1 on error
  
-struct kvm_cpuid2 {
+::
+
+  struct kvm_cpuid2 {
         __u32 nent;
         __u32 padding;
         struct kvm_cpuid_entry2 entries[0];
-};
+  };
  
-#define KVM_CPUID_FLAG_SIGNIFCANT_INDEX                BIT(0)
-#define KVM_CPUID_FLAG_STATEFUL_FUNC           BIT(1)
-#define KVM_CPUID_FLAG_STATE_READ_NEXT         BIT(2)
+  #define KVM_CPUID_FLAG_SIGNIFCANT_INDEX              BIT(0)
+  #define KVM_CPUID_FLAG_STATEFUL_FUNC         BIT(1)
+  #define KVM_CPUID_FLAG_STATE_READ_NEXT               BIT(2)
  
-struct kvm_cpuid_entry2 {
+  struct kvm_cpuid_entry2 {
         __u32 function;
         __u32 index;
         __u32 flags;
@@ -1439,7 +1586,7 @@ struct kvm_cpuid_entry2 {
         __u32 ecx;
         __u32 edx;
         __u32 padding[3];
-};
+  };
  
  This ioctl returns x86 cpuid features which are supported by both the
  hardware and kvm in its default configuration.  Userspace can use the
@@ -1467,10 +1614,16 @@ with unknown or unsupported features masked out.  Some features (for example,
  x2apic), may not be present in the host cpu, but are exposed by kvm if it can
  emulate them efficiently. The fields in each entry are defined as follows:
  
-  function: the eax value used to obtain the entry
-  index: the ecx value used to obtain the entry (for entries that are
+  function:
+         the eax value used to obtain the entry
+
+  index:
+         the ecx value used to obtain the entry (for entries that are
           affected by ecx)
-  flags: an OR of zero or more of the following:
+
+  flags:
+     an OR of zero or more of the following:
+
          KVM_CPUID_FLAG_SIGNIFCANT_INDEX:
             if the index field is valid
          KVM_CPUID_FLAG_STATEFUL_FUNC:
@@ -1480,12 +1633,14 @@ emulate them efficiently. The fields in each entry are defined as follows:
          KVM_CPUID_FLAG_STATE_READ_NEXT:
             for KVM_CPUID_FLAG_STATEFUL_FUNC entries, set if this entry is
             the first entry to be read by a cpu
-   eax, ebx, ecx, edx: the values returned by the cpuid instruction for
+
+   eax, ebx, ecx, edx:
+         the values returned by the cpuid instruction for
           this function/index combination
  
  The TSC deadline timer feature (CPUID leaf 1, ecx[24]) is always returned
  as false, since the feature depends on KVM_CREATE_IRQCHIP for local APIC
-support.  Instead it is reported via
+support.  Instead it is reported via::
  
    ioctl(KVM_CHECK_EXTENSION, KVM_CAP_TSC_DEADLINE_TIMER)
  
@@ -1494,18 +1649,21 @@ feature in userspace, then you can enable the feature for KVM_SET_CPUID2.
  
  
  4.47 KVM_PPC_GET_PVINFO
+-----------------------
  
-Capability: KVM_CAP_PPC_GET_PVINFO
-Architectures: ppc
-Type: vm ioctl
-Parameters: struct kvm_ppc_pvinfo (out)
-Returns: 0 on success, !0 on error
+:Capability: KVM_CAP_PPC_GET_PVINFO
+:Architectures: ppc
+:Type: vm ioctl
+:Parameters: struct kvm_ppc_pvinfo (out)
+:Returns: 0 on success, !0 on error
  
-struct kvm_ppc_pvinfo {
+::
+
+  struct kvm_ppc_pvinfo {
         __u32 flags;
         __u32 hcall[4];
         __u8  pad[108];
-};
+  };
  
  This ioctl fetches PV specific information that needs to be passed to the guest
  using the device tree or other means from vm context.
@@ -1515,33 +1673,39 @@ The hcall array defines 4 instructions that make up a hypercall.
  If any additional field gets added to this structure later on, a bit for that
  additional piece of information will be set in the flags bitmap.
  
-The flags bitmap is defined as:
+The flags bitmap is defined as::
  
     /* the host supports the ePAPR idle hcall */
     #define KVM_PPC_PVINFO_FLAGS_EV_IDLE   (1<<0)
  
  4.52 KVM_SET_GSI_ROUTING
+------------------------
  
-Capability: KVM_CAP_IRQ_ROUTING
-Architectures: x86 s390 arm arm64
-Type: vm ioctl
-Parameters: struct kvm_irq_routing (in)
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_IRQ_ROUTING
+:Architectures: x86 s390 arm arm64
+:Type: vm ioctl
+:Parameters: struct kvm_irq_routing (in)
+:Returns: 0 on success, -1 on error
  
  Sets the GSI routing table entries, overwriting any previously set entries.
  
  On arm/arm64, GSI routing has the following limitation:
+
  - GSI routing does not apply to KVM_IRQ_LINE but only to KVM_IRQFD.
  
-struct kvm_irq_routing {
+::
+
+  struct kvm_irq_routing {
         __u32 nr;
         __u32 flags;
         struct kvm_irq_routing_entry entries[0];
-};
+  };
  
  No flags are specified so far, the corresponding field must be set to zero.
  
-struct kvm_irq_routing_entry {
+::
+
+  struct kvm_irq_routing_entry {
         __u32 gsi;
         __u32 type;
         __u32 flags;
@@ -1553,15 +1717,16 @@ struct kvm_irq_routing_entry {
                 struct kvm_irq_routing_hv_sint hv_sint;
                 __u32 pad[8];
         } u;
-};
+  };
  
-/* gsi routing entry types */
-#define KVM_IRQ_ROUTING_IRQCHIP 1
-#define KVM_IRQ_ROUTING_MSI 2
-#define KVM_IRQ_ROUTING_S390_ADAPTER 3
-#define KVM_IRQ_ROUTING_HV_SINT 4
+  /* gsi routing entry types */
+  #define KVM_IRQ_ROUTING_IRQCHIP 1
+  #define KVM_IRQ_ROUTING_MSI 2
+  #define KVM_IRQ_ROUTING_S390_ADAPTER 3
+  #define KVM_IRQ_ROUTING_HV_SINT 4
  
  flags:
+
  - KVM_MSI_VALID_DEVID: used along with KVM_IRQ_ROUTING_MSI routing entry
    type, specifies that the devid field contains a valid value.  The per-VM
    KVM_CAP_MSI_DEVID capability advertises the requirement to provide
@@ -1569,12 +1734,14 @@ flags:
    never set the KVM_MSI_VALID_DEVID flag as the ioctl might fail.
  - zero otherwise
  
-struct kvm_irq_routing_irqchip {
+::
+
+  struct kvm_irq_routing_irqchip {
         __u32 irqchip;
         __u32 pin;
-};
+  };
  
-struct kvm_irq_routing_msi {
+  struct kvm_irq_routing_msi {
         __u32 address_lo;
         __u32 address_hi;
         __u32 data;
@@ -1582,7 +1749,7 @@ struct kvm_irq_routing_msi {
                 __u32 pad;
                 __u32 devid;
         };
-};
+  };
  
  If KVM_MSI_VALID_DEVID is set, devid contains a unique device identifier
  for the device that wrote the MSI message.  For PCI, this is usually a
@@ -1593,39 +1760,43 @@ feature of KVM_CAP_X2APIC_API capability is enabled.  If it is enabled,
  address_hi bits 31-8 provide bits 31-8 of the destination id.  Bits 7-0 of
  address_hi must be zero.
  
-struct kvm_irq_routing_s390_adapter {
+::
+
+  struct kvm_irq_routing_s390_adapter {
         __u64 ind_addr;
         __u64 summary_addr;
         __u64 ind_offset;
         __u32 summary_offset;
         __u32 adapter_id;
-};
+  };
  
-struct kvm_irq_routing_hv_sint {
+  struct kvm_irq_routing_hv_sint {
         __u32 vcpu;
         __u32 sint;
-};
+  };
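
The structures above can be combined as in the following sketch, which allocates a routing table with a variable number of entries and fills one MSI entry. The helpers alloc_routing(), set_msi_route() and apply_routing() are hypothetical names used only for this example; the address and data values are up to the caller.

```c
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Hypothetical helper: allocate a table with room for `nr` entries.
 * calloc() zeroes the header and all entries, so the table-level flags
 * field is 0 as required. */
static struct kvm_irq_routing *alloc_routing(__u32 nr)
{
	struct kvm_irq_routing *table;

	table = calloc(1, sizeof(*table) + nr * sizeof(table->entries[0]));
	if (table)
		table->nr = nr;
	return table;
}

/* Hypothetical helper: describe an MSI route for `gsi`.  flags stays 0,
 * i.e. KVM_MSI_VALID_DEVID is not set and devid is not used. */
static void set_msi_route(struct kvm_irq_routing_entry *e, __u32 gsi,
			  __u64 addr, __u32 data)
{
	e->gsi = gsi;
	e->type = KVM_IRQ_ROUTING_MSI;
	e->flags = 0;
	e->u.msi.address_lo = (__u32)addr;
	e->u.msi.address_hi = (__u32)(addr >> 32);
	e->u.msi.data = data;
}

/* Hypothetical helper: install the table, replacing any previous one.
 * Returns 0 on success, -1 with errno set on error. */
static int apply_routing(int vm_fd, struct kvm_irq_routing *table)
{
	return ioctl(vm_fd, KVM_SET_GSI_ROUTING, table);
}
```

Because the ioctl overwrites all previously set entries, a caller that wants to add one route has to resubmit the complete table.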
  
  
  4.55 KVM_SET_TSC_KHZ
+--------------------
  
-Capability: KVM_CAP_TSC_CONTROL
-Architectures: x86
-Type: vcpu ioctl
-Parameters: virtual tsc_khz
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_TSC_CONTROL
+:Architectures: x86
+:Type: vcpu ioctl
+:Parameters: virtual tsc_khz
+:Returns: 0 on success, -1 on error
  
  Specifies the tsc frequency for the virtual machine. The unit of the
  frequency is KHz.
  
  
  4.56 KVM_GET_TSC_KHZ
+--------------------
  
-Capability: KVM_CAP_GET_TSC_KHZ
-Architectures: x86
-Type: vcpu ioctl
-Parameters: none
-Returns: virtual tsc-khz on success, negative value on error
+:Capability: KVM_CAP_GET_TSC_KHZ
+:Architectures: x86
+:Type: vcpu ioctl
+:Parameters: none
+:Returns: virtual tsc-khz on success, negative value on error
  
  Returns the tsc frequency of the guest. The unit of the return value is
  KHz. If the host has unstable tsc this ioctl returns -EIO instead as an
@@ -1633,17 +1804,20 @@ error.
  
  
  4.57 KVM_GET_LAPIC
+------------------
  
-Capability: KVM_CAP_IRQCHIP
-Architectures: x86
-Type: vcpu ioctl
-Parameters: struct kvm_lapic_state (out)
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_IRQCHIP
+:Architectures: x86
+:Type: vcpu ioctl
+:Parameters: struct kvm_lapic_state (out)
+:Returns: 0 on success, -1 on error
  
-#define KVM_APIC_REG_SIZE 0x400
-struct kvm_lapic_state {
+::
+
+  #define KVM_APIC_REG_SIZE 0x400
+  struct kvm_lapic_state {
         char regs[KVM_APIC_REG_SIZE];
-};
+  };
  
  Reads the Local APIC registers and copies them into the input argument.  The
  data format and layout are the same as documented in the architecture manual.
@@ -1661,17 +1835,20 @@ always uses xAPIC format.
  
  
  4.58 KVM_SET_LAPIC
+------------------
  
-Capability: KVM_CAP_IRQCHIP
-Architectures: x86
-Type: vcpu ioctl
-Parameters: struct kvm_lapic_state (in)
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_IRQCHIP
+:Architectures: x86
+:Type: vcpu ioctl
+:Parameters: struct kvm_lapic_state (in)
+:Returns: 0 on success, -1 on error
  
-#define KVM_APIC_REG_SIZE 0x400
-struct kvm_lapic_state {
+::
+
+  #define KVM_APIC_REG_SIZE 0x400
+  struct kvm_lapic_state {
         char regs[KVM_APIC_REG_SIZE];
-};
+  };
  
  Copies the input argument into the Local APIC registers.  The data format
  and layout are the same as documented in the architecture manual.
@@ -1682,35 +1859,38 @@ See the note in KVM_GET_LAPIC.
  
  
  4.59 KVM_IOEVENTFD
+------------------
  
-Capability: KVM_CAP_IOEVENTFD
-Architectures: all
-Type: vm ioctl
-Parameters: struct kvm_ioeventfd (in)
-Returns: 0 on success, !0 on error
+:Capability: KVM_CAP_IOEVENTFD
+:Architectures: all
+:Type: vm ioctl
+:Parameters: struct kvm_ioeventfd (in)
+:Returns: 0 on success, !0 on error
  
  This ioctl attaches or detaches an ioeventfd to a legal pio/mmio address
  within the guest.  A guest write to the registered address will signal the
  provided event instead of triggering an exit.
  
-struct kvm_ioeventfd {
+::
+
+  struct kvm_ioeventfd {
         __u64 datamatch;
         __u64 addr;        /* legal pio/mmio address */
         __u32 len;         /* 0, 1, 2, 4, or 8 bytes    */
         __s32 fd;
         __u32 flags;
         __u8  pad[36];
-};
+  };
  
  For the special case of virtio-ccw devices on s390, the ioevent is matched
  to a subchannel/virtqueue tuple instead.
  
-The following flags are defined:
+The following flags are defined::
  
-#define KVM_IOEVENTFD_FLAG_DATAMATCH (1 << kvm_ioeventfd_flag_nr_datamatch)
-#define KVM_IOEVENTFD_FLAG_PIO       (1 << kvm_ioeventfd_flag_nr_pio)
-#define KVM_IOEVENTFD_FLAG_DEASSIGN  (1 << kvm_ioeventfd_flag_nr_deassign)
-#define KVM_IOEVENTFD_FLAG_VIRTIO_CCW_NOTIFY \
+  #define KVM_IOEVENTFD_FLAG_DATAMATCH (1 << kvm_ioeventfd_flag_nr_datamatch)
+  #define KVM_IOEVENTFD_FLAG_PIO       (1 << kvm_ioeventfd_flag_nr_pio)
+  #define KVM_IOEVENTFD_FLAG_DEASSIGN  (1 << kvm_ioeventfd_flag_nr_deassign)
+  #define KVM_IOEVENTFD_FLAG_VIRTIO_CCW_NOTIFY \
         (1 << kvm_ioeventfd_flag_nr_virtio_ccw_notify)
  
  If datamatch flag is set, the event will be signaled only if the written value
@@ -1725,17 +1905,20 @@ The speedup may only apply to specific architectures, but the ioeventfd will
  work anyway.
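
A minimal sketch of preparing the request for an MMIO address follows. The helper make_mmio_ioeventfd() is a hypothetical name; the event file descriptor is assumed to come from eventfd(2), and the result would be passed to the KVM_IOEVENTFD ioctl on the vm file descriptor.

```c
#include <string.h>
#include <linux/kvm.h>

/* Hypothetical helper: describe an ioeventfd on MMIO address `addr`.
 * If `use_match` is non-zero, the event is only signaled for writes of
 * the value `match`; otherwise any write to the address signals it.
 * KVM_IOEVENTFD_FLAG_PIO is left clear, so `addr` is an MMIO address. */
static struct kvm_ioeventfd make_mmio_ioeventfd(__u64 addr, __u32 len,
						int efd, int use_match,
						__u64 match)
{
	struct kvm_ioeventfd req;

	memset(&req, 0, sizeof(req));	/* also clears pad[] */
	req.addr = addr;
	req.len = len;			/* 0, 1, 2, 4, or 8 bytes */
	req.fd = efd;
	if (use_match) {
		req.datamatch = match;
		req.flags |= KVM_IOEVENTFD_FLAG_DATAMATCH;
	}
	return req;
}
```

Detaching the same ioeventfd later would use an identical request with KVM_IOEVENTFD_FLAG_DEASSIGN also set in flags.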
  
  4.60 KVM_DIRTY_TLB
+------------------
+
+:Capability: KVM_CAP_SW_TLB
+:Architectures: ppc
+:Type: vcpu ioctl
+:Parameters: struct kvm_dirty_tlb (in)
+:Returns: 0 on success, -1 on error
  
-Capability: KVM_CAP_SW_TLB
-Architectures: ppc
-Type: vcpu ioctl
-Parameters: struct kvm_dirty_tlb (in)
-Returns: 0 on success, -1 on error
+::
  
-struct kvm_dirty_tlb {
+  struct kvm_dirty_tlb {
         __u64 bitmap;
         __u32 num_dirty;
-};
+  };
  
  This must be called whenever userspace has changed an entry in the shared
  TLB, prior to calling KVM_RUN on the associated vcpu.
@@ -1758,23 +1941,26 @@ be set to the number of set bits in the bitmap.
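
The requirement that num_dirty equal the number of set bits can be met by counting them when the request is built, as in this sketch. It assumes that the bitmap field carries the userspace address of an array of 64-bit words; make_dirty_tlb() and count_bits() are hypothetical helper names.

```c
#include <stddef.h>
#include <stdint.h>
#include <linux/kvm.h>

/* Count the set bits in one 64-bit word (clears the lowest set bit on
 * each iteration). */
static __u32 count_bits(__u64 word)
{
	__u32 n = 0;

	while (word) {
		word &= word - 1;
		n++;
	}
	return n;
}

/* Hypothetical helper: build a KVM_DIRTY_TLB request for a bitmap of
 * `nwords` 64-bit words, setting num_dirty to the number of set bits. */
static struct kvm_dirty_tlb make_dirty_tlb(const __u64 *bitmap, size_t nwords)
{
	struct kvm_dirty_tlb req;
	size_t i;

	req.bitmap = (__u64)(uintptr_t)bitmap;
	req.num_dirty = 0;
	for (i = 0; i < nwords; i++)
		req.num_dirty += count_bits(bitmap[i]);
	return req;
}
```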
  
  
  4.62 KVM_CREATE_SPAPR_TCE
+-------------------------
  
-Capability: KVM_CAP_SPAPR_TCE
-Architectures: powerpc
-Type: vm ioctl
-Parameters: struct kvm_create_spapr_tce (in)
-Returns: file descriptor for manipulating the created TCE table
+:Capability: KVM_CAP_SPAPR_TCE
+:Architectures: powerpc
+:Type: vm ioctl
+:Parameters: struct kvm_create_spapr_tce (in)
+:Returns: file descriptor for manipulating the created TCE table
  
  This creates a virtual TCE (translation control entry) table, which
  is an IOMMU for PAPR-style virtual I/O.  It is used to translate
  logical addresses used in virtual I/O into guest physical addresses,
  and provides a scatter/gather capability for PAPR virtual I/O.
  
-/* for KVM_CAP_SPAPR_TCE */
-struct kvm_create_spapr_tce {
+::
+
+  /* for KVM_CAP_SPAPR_TCE */
+  struct kvm_create_spapr_tce {
         __u64 liobn;
         __u32 window_size;
-};
+  };
  
  The liobn field gives the logical IO bus number for which to create a
  TCE table.  The window_size field specifies the size of the DMA window
@@ -1794,12 +1980,13 @@ circumstances.
  
  
  4.63 KVM_ALLOCATE_RMA
+---------------------
  
-Capability: KVM_CAP_PPC_RMA
-Architectures: powerpc
-Type: vm ioctl
-Parameters: struct kvm_allocate_rma (out)
-Returns: file descriptor for mapping the allocated RMA
+:Capability: KVM_CAP_PPC_RMA
+:Architectures: powerpc
+:Type: vm ioctl
+:Parameters: struct kvm_allocate_rma (out)
+:Returns: file descriptor for mapping the allocated RMA
  
  This allocates a Real Mode Area (RMA) from the pool allocated at boot
  time by the kernel.  An RMA is a physically-contiguous, aligned region
@@ -1808,10 +1995,12 @@ will be accessed by real-mode (MMU off) accesses in a KVM guest.
  POWER processors support a set of sizes for the RMA that usually
  includes 64MB, 128MB, 256MB and some larger powers of two.
  
-/* for KVM_ALLOCATE_RMA */
-struct kvm_allocate_rma {
+::
+
+  /* for KVM_ALLOCATE_RMA */
+  struct kvm_allocate_rma {
         __u64 rma_size;
-};
+  };
  
  The return value is a file descriptor which can be passed to mmap(2)
  to map the allocated RMA into userspace.  The mapped area can then be
@@ -1827,12 +2016,13 @@ because it supports the Virtual RMA (VRMA) facility.
  
  
  4.64 KVM_NMI
+------------
  
-Capability: KVM_CAP_USER_NMI
-Architectures: x86
-Type: vcpu ioctl
-Parameters: none
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_USER_NMI
+:Architectures: x86
+:Type: vcpu ioctl
+:Parameters: none
+:Returns: 0 on success, -1 on error
  
  Queues an NMI on the thread's vcpu.  Note this is well defined only
  when KVM_CREATE_IRQCHIP has not been called, since this is an interface
@@ -1853,14 +2043,16 @@ debugging.
  
  
  4.65 KVM_S390_UCAS_MAP
+----------------------
  
-Capability: KVM_CAP_S390_UCONTROL
-Architectures: s390
-Type: vcpu ioctl
-Parameters: struct kvm_s390_ucas_mapping (in)
-Returns: 0 in case of success
+:Capability: KVM_CAP_S390_UCONTROL
+:Architectures: s390
+:Type: vcpu ioctl
+:Parameters: struct kvm_s390_ucas_mapping (in)
+:Returns: 0 in case of success
+
+The parameter is defined like this::
  
-The parameter is defined like this:
         struct kvm_s390_ucas_mapping {
                 __u64 user_addr;
                 __u64 vcpu_addr;
@@ -1873,14 +2065,16 @@ be aligned by 1 megabyte.
  
  
  4.66 KVM_S390_UCAS_UNMAP
+------------------------
  
-Capability: KVM_CAP_S390_UCONTROL
-Architectures: s390
-Type: vcpu ioctl
-Parameters: struct kvm_s390_ucas_mapping (in)
-Returns: 0 in case of success
+:Capability: KVM_CAP_S390_UCONTROL
+:Architectures: s390
+:Type: vcpu ioctl
+:Parameters: struct kvm_s390_ucas_mapping (in)
+:Returns: 0 in case of success
+
+The parameter is defined like this::
  
-The parameter is defined like this:
         struct kvm_s390_ucas_mapping {
                 __u64 user_addr;
                 __u64 vcpu_addr;
@@ -1893,12 +2087,13 @@ All parameters need to be aligned by 1 megabyte.
  
  
  4.67 KVM_S390_VCPU_FAULT
+------------------------
  
-Capability: KVM_CAP_S390_UCONTROL
-Architectures: s390
-Type: vcpu ioctl
-Parameters: vcpu absolute address (in)
-Returns: 0 in case of success
+:Capability: KVM_CAP_S390_UCONTROL
+:Architectures: s390
+:Type: vcpu ioctl
+:Parameters: vcpu absolute address (in)
+:Returns: 0 in case of success
  
  This call creates a page table entry on the virtual cpu's address space
  (for user controlled virtual machines) or the virtual machine's address
@@ -1910,23 +2105,31 @@ prior to calling the KVM_RUN ioctl.
  
  
  4.68 KVM_SET_ONE_REG
+--------------------
+
+:Capability: KVM_CAP_ONE_REG
+:Architectures: all
+:Type: vcpu ioctl
+:Parameters: struct kvm_one_reg (in)
+:Returns: 0 on success, negative value on failure
  
-Capability: KVM_CAP_ONE_REG
-Architectures: all
-Type: vcpu ioctl
-Parameters: struct kvm_one_reg (in)
-Returns: 0 on success, negative value on failure
  Errors:
-  ENOENT:   no such register
-  EINVAL:   invalid register ID, or no such register
-  EPERM:    (arm64) register access not allowed before vcpu finalization
+
+  ======   ============================================================
+  ENOENT   no such register
+  EINVAL   invalid register ID, or no such register
+  EPERM    (arm64) register access not allowed before vcpu finalization
+  ======   ============================================================
+
  (These error codes are indicative only: do not rely on a specific error
  code being returned in a specific situation.)
  
-struct kvm_one_reg {
+::
+
+  struct kvm_one_reg {
         __u64 id;
         __u64 addr;
-};
+  };
  
  Using this ioctl, a single vcpu register can be set to a specific value
  defined by user space with the passed in struct kvm_one_reg, where id
@@ -1936,217 +2139,226 @@ and architecture specific registers. Each have their own range of operation
  and their own constants and width. To keep track of the implemented
  registers, find a list below:
  
-  Arch  |           Register            | Width (bits)
-        |                               |
-  PPC   | KVM_REG_PPC_HIOR              | 64
-  PPC   | KVM_REG_PPC_IAC1              | 64
-  PPC   | KVM_REG_PPC_IAC2              | 64
-  PPC   | KVM_REG_PPC_IAC3              | 64
-  PPC   | KVM_REG_PPC_IAC4              | 64
-  PPC   | KVM_REG_PPC_DAC1              | 64
-  PPC   | KVM_REG_PPC_DAC2              | 64
-  PPC   | KVM_REG_PPC_DABR              | 64
-  PPC   | KVM_REG_PPC_DSCR              | 64
-  PPC   | KVM_REG_PPC_PURR              | 64
-  PPC   | KVM_REG_PPC_SPURR             | 64
-  PPC   | KVM_REG_PPC_DAR               | 64
-  PPC   | KVM_REG_PPC_DSISR             | 32
-  PPC   | KVM_REG_PPC_AMR               | 64
-  PPC   | KVM_REG_PPC_UAMOR             | 64
-  PPC   | KVM_REG_PPC_MMCR0             | 64
-  PPC   | KVM_REG_PPC_MMCR1             | 64
-  PPC   | KVM_REG_PPC_MMCRA             | 64
-  PPC   | KVM_REG_PPC_MMCR2             | 64
-  PPC   | KVM_REG_PPC_MMCRS             | 64
-  PPC   | KVM_REG_PPC_SIAR              | 64
-  PPC   | KVM_REG_PPC_SDAR              | 64
-  PPC   | KVM_REG_PPC_SIER              | 64
-  PPC   | KVM_REG_PPC_PMC1              | 32
-  PPC   | KVM_REG_PPC_PMC2              | 32
-  PPC   | KVM_REG_PPC_PMC3              | 32
-  PPC   | KVM_REG_PPC_PMC4              | 32
-  PPC   | KVM_REG_PPC_PMC5              | 32
-  PPC   | KVM_REG_PPC_PMC6              | 32
-  PPC   | KVM_REG_PPC_PMC7              | 32
-  PPC   | KVM_REG_PPC_PMC8              | 32
-  PPC   | KVM_REG_PPC_FPR0              | 64
-          ...
-  PPC   | KVM_REG_PPC_FPR31             | 64
-  PPC   | KVM_REG_PPC_VR0               | 128
-          ...
-  PPC   | KVM_REG_PPC_VR31              | 128
-  PPC   | KVM_REG_PPC_VSR0              | 128
-          ...
-  PPC   | KVM_REG_PPC_VSR31             | 128
-  PPC   | KVM_REG_PPC_FPSCR             | 64
-  PPC   | KVM_REG_PPC_VSCR              | 32
-  PPC   | KVM_REG_PPC_VPA_ADDR          | 64
-  PPC   | KVM_REG_PPC_VPA_SLB           | 128
-  PPC   | KVM_REG_PPC_VPA_DTL           | 128
-  PPC   | KVM_REG_PPC_EPCR              | 32
-  PPC   | KVM_REG_PPC_EPR               | 32
-  PPC   | KVM_REG_PPC_TCR               | 32
-  PPC   | KVM_REG_PPC_TSR               | 32
-  PPC   | KVM_REG_PPC_OR_TSR            | 32
-  PPC   | KVM_REG_PPC_CLEAR_TSR         | 32
-  PPC   | KVM_REG_PPC_MAS0              | 32
-  PPC   | KVM_REG_PPC_MAS1              | 32
-  PPC   | KVM_REG_PPC_MAS2              | 64
-  PPC   | KVM_REG_PPC_MAS7_3            | 64
-  PPC   | KVM_REG_PPC_MAS4              | 32
-  PPC   | KVM_REG_PPC_MAS6              | 32
-  PPC   | KVM_REG_PPC_MMUCFG            | 32
-  PPC   | KVM_REG_PPC_TLB0CFG           | 32
-  PPC   | KVM_REG_PPC_TLB1CFG           | 32
-  PPC   | KVM_REG_PPC_TLB2CFG           | 32
-  PPC   | KVM_REG_PPC_TLB3CFG           | 32
-  PPC   | KVM_REG_PPC_TLB0PS            | 32
-  PPC   | KVM_REG_PPC_TLB1PS            | 32
-  PPC   | KVM_REG_PPC_TLB2PS            | 32
-  PPC   | KVM_REG_PPC_TLB3PS            | 32
-  PPC   | KVM_REG_PPC_EPTCFG            | 32
-  PPC   | KVM_REG_PPC_ICP_STATE         | 64
-  PPC   | KVM_REG_PPC_VP_STATE          | 128
-  PPC   | KVM_REG_PPC_TB_OFFSET         | 64
-  PPC   | KVM_REG_PPC_SPMC1             | 32
-  PPC   | KVM_REG_PPC_SPMC2             | 32
-  PPC   | KVM_REG_PPC_IAMR              | 64
-  PPC   | KVM_REG_PPC_TFHAR             | 64
-  PPC   | KVM_REG_PPC_TFIAR             | 64
-  PPC   | KVM_REG_PPC_TEXASR            | 64
-  PPC   | KVM_REG_PPC_FSCR              | 64
-  PPC   | KVM_REG_PPC_PSPB              | 32
-  PPC   | KVM_REG_PPC_EBBHR             | 64
-  PPC   | KVM_REG_PPC_EBBRR             | 64
-  PPC   | KVM_REG_PPC_BESCR             | 64
-  PPC   | KVM_REG_PPC_TAR               | 64
-  PPC   | KVM_REG_PPC_DPDES             | 64
-  PPC   | KVM_REG_PPC_DAWR              | 64
-  PPC   | KVM_REG_PPC_DAWRX             | 64
-  PPC   | KVM_REG_PPC_CIABR             | 64
-  PPC   | KVM_REG_PPC_IC                | 64
-  PPC   | KVM_REG_PPC_VTB               | 64
-  PPC   | KVM_REG_PPC_CSIGR             | 64
-  PPC   | KVM_REG_PPC_TACR              | 64
-  PPC   | KVM_REG_PPC_TCSCR             | 64
-  PPC   | KVM_REG_PPC_PID               | 64
-  PPC   | KVM_REG_PPC_ACOP              | 64
-  PPC   | KVM_REG_PPC_VRSAVE            | 32
-  PPC   | KVM_REG_PPC_LPCR              | 32
-  PPC   | KVM_REG_PPC_LPCR_64           | 64
-  PPC   | KVM_REG_PPC_PPR               | 64
-  PPC   | KVM_REG_PPC_ARCH_COMPAT       | 32
-  PPC   | KVM_REG_PPC_DABRX             | 32
-  PPC   | KVM_REG_PPC_WORT              | 64
-  PPC  | KVM_REG_PPC_SPRG9             | 64
-  PPC  | KVM_REG_PPC_DBSR              | 32
-  PPC   | KVM_REG_PPC_TIDR              | 64
-  PPC   | KVM_REG_PPC_PSSCR             | 64
-  PPC   | KVM_REG_PPC_DEC_EXPIRY        | 64
-  PPC   | KVM_REG_PPC_PTCR              | 64
-  PPC   | KVM_REG_PPC_TM_GPR0           | 64
-          ...
-  PPC   | KVM_REG_PPC_TM_GPR31          | 64
-  PPC   | KVM_REG_PPC_TM_VSR0           | 128
-          ...
-  PPC   | KVM_REG_PPC_TM_VSR63          | 128
-  PPC   | KVM_REG_PPC_TM_CR             | 64
-  PPC   | KVM_REG_PPC_TM_LR             | 64
-  PPC   | KVM_REG_PPC_TM_CTR            | 64
-  PPC   | KVM_REG_PPC_TM_FPSCR          | 64
-  PPC   | KVM_REG_PPC_TM_AMR            | 64
-  PPC   | KVM_REG_PPC_TM_PPR            | 64
-  PPC   | KVM_REG_PPC_TM_VRSAVE         | 64
-  PPC   | KVM_REG_PPC_TM_VSCR           | 32
-  PPC   | KVM_REG_PPC_TM_DSCR           | 64
-  PPC   | KVM_REG_PPC_TM_TAR            | 64
-  PPC   | KVM_REG_PPC_TM_XER            | 64
-        |                               |
-  MIPS  | KVM_REG_MIPS_R0               | 64
-          ...
-  MIPS  | KVM_REG_MIPS_R31              | 64
-  MIPS  | KVM_REG_MIPS_HI               | 64
-  MIPS  | KVM_REG_MIPS_LO               | 64
-  MIPS  | KVM_REG_MIPS_PC               | 64
-  MIPS  | KVM_REG_MIPS_CP0_INDEX        | 32
-  MIPS  | KVM_REG_MIPS_CP0_ENTRYLO0     | 64
-  MIPS  | KVM_REG_MIPS_CP0_ENTRYLO1     | 64
-  MIPS  | KVM_REG_MIPS_CP0_CONTEXT      | 64
-  MIPS  | KVM_REG_MIPS_CP0_CONTEXTCONFIG| 32
-  MIPS  | KVM_REG_MIPS_CP0_USERLOCAL    | 64
-  MIPS  | KVM_REG_MIPS_CP0_XCONTEXTCONFIG| 64
-  MIPS  | KVM_REG_MIPS_CP0_PAGEMASK     | 32
-  MIPS  | KVM_REG_MIPS_CP0_PAGEGRAIN    | 32
-  MIPS  | KVM_REG_MIPS_CP0_SEGCTL0      | 64
-  MIPS  | KVM_REG_MIPS_CP0_SEGCTL1      | 64
-  MIPS  | KVM_REG_MIPS_CP0_SEGCTL2      | 64
-  MIPS  | KVM_REG_MIPS_CP0_PWBASE       | 64
-  MIPS  | KVM_REG_MIPS_CP0_PWFIELD      | 64
-  MIPS  | KVM_REG_MIPS_CP0_PWSIZE       | 64
-  MIPS  | KVM_REG_MIPS_CP0_WIRED        | 32
-  MIPS  | KVM_REG_MIPS_CP0_PWCTL        | 32
-  MIPS  | KVM_REG_MIPS_CP0_HWRENA       | 32
-  MIPS  | KVM_REG_MIPS_CP0_BADVADDR     | 64
-  MIPS  | KVM_REG_MIPS_CP0_BADINSTR     | 32
-  MIPS  | KVM_REG_MIPS_CP0_BADINSTRP    | 32
-  MIPS  | KVM_REG_MIPS_CP0_COUNT        | 32
-  MIPS  | KVM_REG_MIPS_CP0_ENTRYHI      | 64
-  MIPS  | KVM_REG_MIPS_CP0_COMPARE      | 32
-  MIPS  | KVM_REG_MIPS_CP0_STATUS       | 32
-  MIPS  | KVM_REG_MIPS_CP0_INTCTL       | 32
-  MIPS  | KVM_REG_MIPS_CP0_CAUSE        | 32
-  MIPS  | KVM_REG_MIPS_CP0_EPC          | 64
-  MIPS  | KVM_REG_MIPS_CP0_PRID         | 32
-  MIPS  | KVM_REG_MIPS_CP0_EBASE        | 64
-  MIPS  | KVM_REG_MIPS_CP0_CONFIG       | 32
-  MIPS  | KVM_REG_MIPS_CP0_CONFIG1      | 32
-  MIPS  | KVM_REG_MIPS_CP0_CONFIG2      | 32
-  MIPS  | KVM_REG_MIPS_CP0_CONFIG3      | 32
-  MIPS  | KVM_REG_MIPS_CP0_CONFIG4      | 32
-  MIPS  | KVM_REG_MIPS_CP0_CONFIG5      | 32
-  MIPS  | KVM_REG_MIPS_CP0_CONFIG7      | 32
-  MIPS  | KVM_REG_MIPS_CP0_XCONTEXT     | 64
-  MIPS  | KVM_REG_MIPS_CP0_ERROREPC     | 64
-  MIPS  | KVM_REG_MIPS_CP0_KSCRATCH1    | 64
-  MIPS  | KVM_REG_MIPS_CP0_KSCRATCH2    | 64
-  MIPS  | KVM_REG_MIPS_CP0_KSCRATCH3    | 64
-  MIPS  | KVM_REG_MIPS_CP0_KSCRATCH4    | 64
-  MIPS  | KVM_REG_MIPS_CP0_KSCRATCH5    | 64
-  MIPS  | KVM_REG_MIPS_CP0_KSCRATCH6    | 64
-  MIPS  | KVM_REG_MIPS_CP0_MAAR(0..63)  | 64
-  MIPS  | KVM_REG_MIPS_COUNT_CTL        | 64
-  MIPS  | KVM_REG_MIPS_COUNT_RESUME     | 64
-  MIPS  | KVM_REG_MIPS_COUNT_HZ         | 64
-  MIPS  | KVM_REG_MIPS_FPR_32(0..31)    | 32
-  MIPS  | KVM_REG_MIPS_FPR_64(0..31)    | 64
-  MIPS  | KVM_REG_MIPS_VEC_128(0..31)   | 128
-  MIPS  | KVM_REG_MIPS_FCR_IR           | 32
-  MIPS  | KVM_REG_MIPS_FCR_CSR          | 32
-  MIPS  | KVM_REG_MIPS_MSA_IR           | 32
-  MIPS  | KVM_REG_MIPS_MSA_CSR          | 32
+  ======= =============================== ============
+  Arch              Register              Width (bits)
+  ======= =============================== ============
+  PPC     KVM_REG_PPC_HIOR                64
+  PPC     KVM_REG_PPC_IAC1                64
+  PPC     KVM_REG_PPC_IAC2                64
+  PPC     KVM_REG_PPC_IAC3                64
+  PPC     KVM_REG_PPC_IAC4                64
+  PPC     KVM_REG_PPC_DAC1                64
+  PPC     KVM_REG_PPC_DAC2                64
+  PPC     KVM_REG_PPC_DABR                64
+  PPC     KVM_REG_PPC_DSCR                64
+  PPC     KVM_REG_PPC_PURR                64
+  PPC     KVM_REG_PPC_SPURR               64
+  PPC     KVM_REG_PPC_DAR                 64
+  PPC     KVM_REG_PPC_DSISR               32
+  PPC     KVM_REG_PPC_AMR                 64
+  PPC     KVM_REG_PPC_UAMOR               64
+  PPC     KVM_REG_PPC_MMCR0               64
+  PPC     KVM_REG_PPC_MMCR1               64
+  PPC     KVM_REG_PPC_MMCRA               64
+  PPC     KVM_REG_PPC_MMCR2               64
+  PPC     KVM_REG_PPC_MMCRS               64
+  PPC     KVM_REG_PPC_SIAR                64
+  PPC     KVM_REG_PPC_SDAR                64
+  PPC     KVM_REG_PPC_SIER                64
+  PPC     KVM_REG_PPC_PMC1                32
+  PPC     KVM_REG_PPC_PMC2                32
+  PPC     KVM_REG_PPC_PMC3                32
+  PPC     KVM_REG_PPC_PMC4                32
+  PPC     KVM_REG_PPC_PMC5                32
+  PPC     KVM_REG_PPC_PMC6                32
+  PPC     KVM_REG_PPC_PMC7                32
+  PPC     KVM_REG_PPC_PMC8                32
+  PPC     KVM_REG_PPC_FPR0                64
+  ...
+  PPC     KVM_REG_PPC_FPR31               64
+  PPC     KVM_REG_PPC_VR0                 128
+  ...
+  PPC     KVM_REG_PPC_VR31                128
+  PPC     KVM_REG_PPC_VSR0                128
+  ...
+  PPC     KVM_REG_PPC_VSR31               128
+  PPC     KVM_REG_PPC_FPSCR               64
+  PPC     KVM_REG_PPC_VSCR                32
+  PPC     KVM_REG_PPC_VPA_ADDR            64
+  PPC     KVM_REG_PPC_VPA_SLB             128
+  PPC     KVM_REG_PPC_VPA_DTL             128
+  PPC     KVM_REG_PPC_EPCR                32
+  PPC     KVM_REG_PPC_EPR                 32
+  PPC     KVM_REG_PPC_TCR                 32
+  PPC     KVM_REG_PPC_TSR                 32
+  PPC     KVM_REG_PPC_OR_TSR              32
+  PPC     KVM_REG_PPC_CLEAR_TSR           32
+  PPC     KVM_REG_PPC_MAS0                32
+  PPC     KVM_REG_PPC_MAS1                32
+  PPC     KVM_REG_PPC_MAS2                64
+  PPC     KVM_REG_PPC_MAS7_3              64
+  PPC     KVM_REG_PPC_MAS4                32
+  PPC     KVM_REG_PPC_MAS6                32
+  PPC     KVM_REG_PPC_MMUCFG              32
+  PPC     KVM_REG_PPC_TLB0CFG             32
+  PPC     KVM_REG_PPC_TLB1CFG             32
+  PPC     KVM_REG_PPC_TLB2CFG             32
+  PPC     KVM_REG_PPC_TLB3CFG             32
+  PPC     KVM_REG_PPC_TLB0PS              32
+  PPC     KVM_REG_PPC_TLB1PS              32
+  PPC     KVM_REG_PPC_TLB2PS              32
+  PPC     KVM_REG_PPC_TLB3PS              32
+  PPC     KVM_REG_PPC_EPTCFG              32
+  PPC     KVM_REG_PPC_ICP_STATE           64
+  PPC     KVM_REG_PPC_VP_STATE            128
+  PPC     KVM_REG_PPC_TB_OFFSET           64
+  PPC     KVM_REG_PPC_SPMC1               32
+  PPC     KVM_REG_PPC_SPMC2               32
+  PPC     KVM_REG_PPC_IAMR                64
+  PPC     KVM_REG_PPC_TFHAR               64
+  PPC     KVM_REG_PPC_TFIAR               64
+  PPC     KVM_REG_PPC_TEXASR              64
+  PPC     KVM_REG_PPC_FSCR                64
+  PPC     KVM_REG_PPC_PSPB                32
+  PPC     KVM_REG_PPC_EBBHR               64
+  PPC     KVM_REG_PPC_EBBRR               64
+  PPC     KVM_REG_PPC_BESCR               64
+  PPC     KVM_REG_PPC_TAR                 64
+  PPC     KVM_REG_PPC_DPDES               64
+  PPC     KVM_REG_PPC_DAWR                64
+  PPC     KVM_REG_PPC_DAWRX               64
+  PPC     KVM_REG_PPC_CIABR               64
+  PPC     KVM_REG_PPC_IC                  64
+  PPC     KVM_REG_PPC_VTB                 64
+  PPC     KVM_REG_PPC_CSIGR               64
+  PPC     KVM_REG_PPC_TACR                64
+  PPC     KVM_REG_PPC_TCSCR               64
+  PPC     KVM_REG_PPC_PID                 64
+  PPC     KVM_REG_PPC_ACOP                64
+  PPC     KVM_REG_PPC_VRSAVE              32
+  PPC     KVM_REG_PPC_LPCR                32
+  PPC     KVM_REG_PPC_LPCR_64             64
+  PPC     KVM_REG_PPC_PPR                 64
+  PPC     KVM_REG_PPC_ARCH_COMPAT         32
+  PPC     KVM_REG_PPC_DABRX               32
+  PPC     KVM_REG_PPC_WORT                64
+  PPC     KVM_REG_PPC_SPRG9               64
+  PPC     KVM_REG_PPC_DBSR                32
+  PPC     KVM_REG_PPC_TIDR                64
+  PPC     KVM_REG_PPC_PSSCR               64
+  PPC     KVM_REG_PPC_DEC_EXPIRY          64
+  PPC     KVM_REG_PPC_PTCR                64
+  PPC     KVM_REG_PPC_TM_GPR0             64
+  ...
+  PPC     KVM_REG_PPC_TM_GPR31            64
+  PPC     KVM_REG_PPC_TM_VSR0             128
+  ...
+  PPC     KVM_REG_PPC_TM_VSR63            128
+  PPC     KVM_REG_PPC_TM_CR               64
+  PPC     KVM_REG_PPC_TM_LR               64
+  PPC     KVM_REG_PPC_TM_CTR              64
+  PPC     KVM_REG_PPC_TM_FPSCR            64
+  PPC     KVM_REG_PPC_TM_AMR              64
+  PPC     KVM_REG_PPC_TM_PPR              64
+  PPC     KVM_REG_PPC_TM_VRSAVE           64
+  PPC     KVM_REG_PPC_TM_VSCR             32
+  PPC     KVM_REG_PPC_TM_DSCR             64
+  PPC     KVM_REG_PPC_TM_TAR              64
+  PPC     KVM_REG_PPC_TM_XER              64
+
+  MIPS    KVM_REG_MIPS_R0                 64
+  ...
+  MIPS    KVM_REG_MIPS_R31                64
+  MIPS    KVM_REG_MIPS_HI                 64
+  MIPS    KVM_REG_MIPS_LO                 64
+  MIPS    KVM_REG_MIPS_PC                 64
+  MIPS    KVM_REG_MIPS_CP0_INDEX          32
+  MIPS    KVM_REG_MIPS_CP0_ENTRYLO0       64
+  MIPS    KVM_REG_MIPS_CP0_ENTRYLO1       64
+  MIPS    KVM_REG_MIPS_CP0_CONTEXT        64
+  MIPS    KVM_REG_MIPS_CP0_CONTEXTCONFIG  32
+  MIPS    KVM_REG_MIPS_CP0_USERLOCAL      64
+  MIPS    KVM_REG_MIPS_CP0_XCONTEXTCONFIG 64
+  MIPS    KVM_REG_MIPS_CP0_PAGEMASK       32
+  MIPS    KVM_REG_MIPS_CP0_PAGEGRAIN      32
+  MIPS    KVM_REG_MIPS_CP0_SEGCTL0        64
+  MIPS    KVM_REG_MIPS_CP0_SEGCTL1        64
+  MIPS    KVM_REG_MIPS_CP0_SEGCTL2        64
+  MIPS    KVM_REG_MIPS_CP0_PWBASE         64
+  MIPS    KVM_REG_MIPS_CP0_PWFIELD        64
+  MIPS    KVM_REG_MIPS_CP0_PWSIZE         64
+  MIPS    KVM_REG_MIPS_CP0_WIRED          32
+  MIPS    KVM_REG_MIPS_CP0_PWCTL          32
+  MIPS    KVM_REG_MIPS_CP0_HWRENA         32
+  MIPS    KVM_REG_MIPS_CP0_BADVADDR       64
+  MIPS    KVM_REG_MIPS_CP0_BADINSTR       32
+  MIPS    KVM_REG_MIPS_CP0_BADINSTRP      32
+  MIPS    KVM_REG_MIPS_CP0_COUNT          32
+  MIPS    KVM_REG_MIPS_CP0_ENTRYHI        64
+  MIPS    KVM_REG_MIPS_CP0_COMPARE        32
+  MIPS    KVM_REG_MIPS_CP0_STATUS         32
+  MIPS    KVM_REG_MIPS_CP0_INTCTL         32
+  MIPS    KVM_REG_MIPS_CP0_CAUSE          32
+  MIPS    KVM_REG_MIPS_CP0_EPC            64
+  MIPS    KVM_REG_MIPS_CP0_PRID           32
+  MIPS    KVM_REG_MIPS_CP0_EBASE          64
+  MIPS    KVM_REG_MIPS_CP0_CONFIG         32
+  MIPS    KVM_REG_MIPS_CP0_CONFIG1        32
+  MIPS    KVM_REG_MIPS_CP0_CONFIG2        32
+  MIPS    KVM_REG_MIPS_CP0_CONFIG3        32
+  MIPS    KVM_REG_MIPS_CP0_CONFIG4        32
+  MIPS    KVM_REG_MIPS_CP0_CONFIG5        32
+  MIPS    KVM_REG_MIPS_CP0_CONFIG7        32
+  MIPS    KVM_REG_MIPS_CP0_XCONTEXT       64
+  MIPS    KVM_REG_MIPS_CP0_ERROREPC       64
+  MIPS    KVM_REG_MIPS_CP0_KSCRATCH1      64
+  MIPS    KVM_REG_MIPS_CP0_KSCRATCH2      64
+  MIPS    KVM_REG_MIPS_CP0_KSCRATCH3      64
+  MIPS    KVM_REG_MIPS_CP0_KSCRATCH4      64
+  MIPS    KVM_REG_MIPS_CP0_KSCRATCH5      64
+  MIPS    KVM_REG_MIPS_CP0_KSCRATCH6      64
+  MIPS    KVM_REG_MIPS_CP0_MAAR(0..63)    64
+  MIPS    KVM_REG_MIPS_COUNT_CTL          64
+  MIPS    KVM_REG_MIPS_COUNT_RESUME       64
+  MIPS    KVM_REG_MIPS_COUNT_HZ           64
+  MIPS    KVM_REG_MIPS_FPR_32(0..31)      32
+  MIPS    KVM_REG_MIPS_FPR_64(0..31)      64
+  MIPS    KVM_REG_MIPS_VEC_128(0..31)     128
+  MIPS    KVM_REG_MIPS_FCR_IR             32
+  MIPS    KVM_REG_MIPS_FCR_CSR            32
+  MIPS    KVM_REG_MIPS_MSA_IR             32
+  MIPS    KVM_REG_MIPS_MSA_CSR            32
+  ======= =============================== ============
  
  ARM registers are mapped using the lower 32 bits.  The upper 16 of that
  is the register group type, or coprocessor number:
  
-ARM core registers have the following id bit patterns:
+ARM core registers have the following id bit patterns::
+
    0x4020 0000 0010 <index into the kvm_regs struct:16>
  
-ARM 32-bit CP15 registers have the following id bit patterns:
+ARM 32-bit CP15 registers have the following id bit patterns::
+
    0x4020 0000 000F <zero:1> <crn:4> <crm:4> <opc1:4> <opc2:3>
  
-ARM 64-bit CP15 registers have the following id bit patterns:
+ARM 64-bit CP15 registers have the following id bit patterns::
+
    0x4030 0000 000F <zero:1> <zero:4> <crm:4> <opc1:4> <zero:3>
  
-ARM CCSIDR registers are demultiplexed by CSSELR value:
+ARM CCSIDR registers are demultiplexed by CSSELR value::
+
    0x4020 0000 0011 00 <csselr:8>
  
-ARM 32-bit VFP control registers have the following id bit patterns:
+ARM 32-bit VFP control registers have the following id bit patterns::
+
    0x4020 0000 0012 1 <regno:12>
  
-ARM 64-bit FP registers have the following id bit patterns:
+ARM 64-bit FP registers have the following id bit patterns::
+
    0x4030 0000 0012 0 <regno:12>
  
-ARM firmware pseudo-registers have the following bit pattern:
+ARM firmware pseudo-registers have the following bit pattern::
+
    0x4030 0000 0014 <regno:16>
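
As an illustration of how the ARM id patterns above compose, here is a minimal
sketch that packs the 32-bit CP15 fields into a register id. The names
`EX_ARM_CP15_U32_BASE` and `ex_arm_cp15_reg32` are hypothetical and exist only
for this example; the bit positions are read directly off the pattern
`0x4020 0000 000F <zero:1> <crn:4> <crm:4> <opc1:4> <opc2:3>`. Real userspace
code would normally rely on the constants in the kernel uapi headers instead.

```c
#include <assert.h>
#include <stdint.h>

/* Fixed prefix of the 32-bit CP15 pattern, with the low 16 bits cleared.
 * Hypothetical name, used only in this sketch. */
#define EX_ARM_CP15_U32_BASE 0x40200000000F0000ULL

/* Pack crn/crm/opc1/opc2 into the low 16 bits; bit 15 stays zero. */
static uint64_t ex_arm_cp15_reg32(unsigned crn, unsigned crm,
                                  unsigned opc1, unsigned opc2)
{
	return EX_ARM_CP15_U32_BASE |
	       ((uint64_t)(crn  & 0xf) << 11) |   /* bits 14..11 */
	       ((uint64_t)(crm  & 0xf) << 7)  |   /* bits 10..7  */
	       ((uint64_t)(opc1 & 0xf) << 3)  |   /* bits 6..3   */
	       (uint64_t)(opc2 & 0x7);            /* bits 2..0   */
}
```

For example, `ex_arm_cp15_reg32(1, 0, 0, 0)` yields `0x40200000000F0800`.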
  
  
@@ -2156,15 +2368,18 @@ that is the register group type, or coprocessor number:
  arm64 core/FP-SIMD registers have the following id bit patterns. Note
  that the size of the access is variable, as the kvm_regs structure
  contains elements ranging from 32 to 128 bits. The index is a 32bit
-value in the kvm_regs structure seen as a 32bit array.
+value in the kvm_regs structure seen as a 32bit array::
+
    0x60x0 0000 0010 <index into the kvm_regs struct:16>
  
  Specifically:
+
+======================= ========= ===== =======================================
      Encoding            Register  Bits  kvm_regs member
-----------------------------------------------------------------
+======================= ========= ===== =======================================
    0x6030 0000 0010 0000 X0          64  regs.regs[0]
    0x6030 0000 0010 0002 X1          64  regs.regs[1]
-    ...
+  ...
    0x6030 0000 0010 003c X30         64  regs.regs[30]
    0x6030 0000 0010 003e SP          64  regs.sp
    0x6030 0000 0010 0040 PC          64  regs.pc
@@ -2176,27 +2391,31 @@ Specifically:
    0x6030 0000 0010 004c SPSR_UND    64  spsr[KVM_SPSR_UND]
    0x6030 0000 0010 004e SPSR_IRQ    64  spsr[KVM_SPSR_IRQ]
    0x6030 0000 0010 0050 SPSR_FIQ    64  spsr[KVM_SPSR_FIQ]
-  0x6040 0000 0010 0054 V0         128  fp_regs.vregs[0]    (*)
-  0x6040 0000 0010 0058 V1         128  fp_regs.vregs[1]    (*)
-    ...
-  0x6040 0000 0010 00d0 V31        128  fp_regs.vregs[31]   (*)
+  0x6040 0000 0010 0054 V0         128  fp_regs.vregs[0]    [1]_
+  0x6040 0000 0010 0058 V1         128  fp_regs.vregs[1]    [1]_
+  ...
+  0x6040 0000 0010 00d0 V31        128  fp_regs.vregs[31]   [1]_
    0x6020 0000 0010 00d4 FPSR        32  fp_regs.fpsr
    0x6020 0000 0010 00d5 FPCR        32  fp_regs.fpcr
+======================= ========= ===== =======================================
+
+.. [1] These encodings are not accepted for SVE-enabled vcpus.  See
+       KVM_ARM_VCPU_INIT.
  
-(*) These encodings are not accepted for SVE-enabled vcpus.  See
-    KVM_ARM_VCPU_INIT.
+       The equivalent register content can be accessed via bits [127:0] of
+       the corresponding SVE Zn registers instead for vcpus that have SVE
+       enabled (see below).
  
-    The equivalent register content can be accessed via bits [127:0] of
-    the corresponding SVE Zn registers instead for vcpus that have SVE
-    enabled (see below).
+arm64 CCSIDR registers are demultiplexed by CSSELR value::
  
-arm64 CCSIDR registers are demultiplexed by CSSELR value:
    0x6020 0000 0011 00 <csselr:8>
  
-arm64 system registers have the following id bit patterns:
+arm64 system registers have the following id bit patterns::
+
    0x6030 0000 0013 <op0:2> <op1:3> <crn:4> <crm:4> <op2:3>
  
-WARNING:
+.. warning::
+
       Two system register IDs do not follow the specified pattern.  These
       are KVM_REG_ARM_TIMER_CVAL and KVM_REG_ARM_TIMER_CNT, which map to
       system registers CNTV_CVAL_EL0 and CNTVCT_EL0 respectively.  These
@@ -2205,10 +2424,12 @@ WARNING:
       derived from the register encoding for CNTV_CVAL_EL0.  As this is
       API, it must remain this way.
  
-arm64 firmware pseudo-registers have the following bit pattern:
+arm64 firmware pseudo-registers have the following bit pattern::
+
    0x6030 0000 0014 <regno:16>
  
-arm64 SVE registers have the following bit patterns:
+arm64 SVE registers have the following bit patterns::
+
    0x6080 0000 0015 00 <n:5> <slice:5>   Zn bits[2048*slice + 2047 : 2048*slice]
    0x6050 0000 0015 04 <n:4> <slice:5>   Pn bits[256*slice + 255 : 256*slice]
    0x6050 0000 0015 060 <slice:5>        FFR bits[256*slice + 255 : 256*slice]
@@ -2216,7 +2437,7 @@ arm64 SVE registers have the following bit patterns:
  
  Access to register IDs where 2048 * slice >= 128 * max_vq will fail with
  ENOENT.  max_vq is the vcpu's maximum supported vector length in 128-bit
-quadwords: see (**) below.
+quadwords: see [2]_ below.
  
  These registers are only accessible on vcpus for which SVE is enabled.
  See KVM_ARM_VCPU_INIT for details.
@@ -2231,21 +2452,21 @@ lengths supported by the vcpu to be discovered and configured by
  userspace.  When transferred to or from user memory via KVM_GET_ONE_REG
  or KVM_SET_ONE_REG, the value of this register is of type
  __u64[KVM_ARM64_SVE_VLS_WORDS], and encodes the set of vector lengths as
-follows:
+follows::
  
-__u64 vector_lengths[KVM_ARM64_SVE_VLS_WORDS];
+  __u64 vector_lengths[KVM_ARM64_SVE_VLS_WORDS];
  
-if (vq >= SVE_VQ_MIN && vq <= SVE_VQ_MAX &&
-    ((vector_lengths[(vq - KVM_ARM64_SVE_VQ_MIN) / 64] >>
+  if (vq >= SVE_VQ_MIN && vq <= SVE_VQ_MAX &&
+      ((vector_lengths[(vq - KVM_ARM64_SVE_VQ_MIN) / 64] >>
                 ((vq - KVM_ARM64_SVE_VQ_MIN) % 64)) & 1))
         /* Vector length vq * 16 bytes supported */
-else
+  else
         /* Vector length vq * 16 bytes not supported */
  
-(**) The maximum value vq for which the above condition is true is
-max_vq.  This is the maximum vector length available to the guest on
-this vcpu, and determines which register slices are visible through
-this ioctl interface.
+.. [2] The maximum value vq for which the above condition is true is
+       max_vq.  This is the maximum vector length available to the guest on
+       this vcpu, and determines which register slices are visible through
+       this ioctl interface.
  
  (See Documentation/arm64/sve.rst for an explanation of the "vq"
  nomenclature.)
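
The bitmap test shown above, and the derivation of max_vq from it, can be
sketched as below. The `EX_*` constants are assumptions standing in for
SVE_VQ_MIN, SVE_VQ_MAX and KVM_ARM64_SVE_VLS_WORDS; real code should take the
values from the kernel uapi headers. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for this sketch only; use the uapi headers in real code. */
#define EX_SVE_VQ_MIN    1
#define EX_SVE_VQ_MAX    512
#define EX_SVE_VLS_WORDS ((EX_SVE_VQ_MAX - EX_SVE_VQ_MIN) / 64 + 1)

/* Same condition as in the text: is vector length vq * 16 bytes supported? */
static int ex_vq_supported(const uint64_t *vls, unsigned vq)
{
	return vq >= EX_SVE_VQ_MIN && vq <= EX_SVE_VQ_MAX &&
	       ((vls[(vq - EX_SVE_VQ_MIN) / 64] >>
	         ((vq - EX_SVE_VQ_MIN) % 64)) & 1);
}

/* Largest vq for which the condition holds, or 0 if no bit is set. */
static unsigned ex_max_vq(const uint64_t *vls)
{
	for (unsigned vq = EX_SVE_VQ_MAX; vq >= EX_SVE_VQ_MIN; vq--)
		if (ex_vq_supported(vls, vq))
			return vq;
	return 0;
}
```

Scanning from the top makes the first hit the maximum, which matches the
definition of max_vq given in the footnote.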
@@ -2270,11 +2491,13 @@ write this register will fail with EPERM.
  MIPS registers are mapped using the lower 32 bits.  The upper 16 of that is
  the register group type:
  
-MIPS core registers (see above) have the following id bit patterns:
+MIPS core registers (see above) have the following id bit patterns::
+
    0x7030 0000 0000 <reg:16>
  
  MIPS CP0 registers (see KVM_REG_MIPS_CP0_* above) have the following id bit
-patterns depending on whether they're 32-bit or 64-bit registers:
+patterns depending on whether they're 32-bit or 64-bit registers::
+
    0x7020 0000 0001 00 <reg:5> <sel:3>   (32-bit)
    0x7030 0000 0001 00 <reg:5> <sel:3>   (64-bit)
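
A minimal sketch of how the two CP0 patterns above turn a (register, select)
pair into an id. The macro and function names are hypothetical and used only
here; the prefixes and the `<reg:5> <sel:3>` layout are taken from the
patterns themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Fixed prefixes of the CP0 patterns, low 16 bits clear (sketch-only names). */
#define EX_MIPS_CP0_U32 0x7020000000010000ULL
#define EX_MIPS_CP0_U64 0x7030000000010000ULL

/* reg occupies bits 7..3, sel occupies bits 2..0; bits 15..8 stay zero. */
static uint64_t ex_mips_cp0_id(unsigned reg, unsigned sel, int is64)
{
	uint64_t base = is64 ? EX_MIPS_CP0_U64 : EX_MIPS_CP0_U32;

	return base | ((uint64_t)(reg & 0x1f) << 3) | (uint64_t)(sel & 0x7);
}
```

For instance, the 32-bit Count register (CP0 register 9, select 0) comes out
as `0x7020000000010048`.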
  
@@ -2285,10 +2508,12 @@ with the RI and XI bits (if they exist) in bits 63 and 62 respectively, and
  the PFNX field starting at bit 30.
  
  MIPS MAARs (see KVM_REG_MIPS_CP0_MAAR(*) above) have the following id bit
-patterns:
+patterns::
+
    0x7030 0000 0001 01 <reg:8>
  
-MIPS KVM control registers (see above) have the following id bit patterns:
+MIPS KVM control registers (see above) have the following id bit patterns::
+
    0x7030 0000 0002 <reg:16>
  
  MIPS FPU registers (see KVM_REG_MIPS_FPR_{32,64}() above) have the following
@@ -2297,31 +2522,40 @@ always accessed according to the current guest FPU mode (Status.FR and
  Config5.FRE), i.e. as the guest would see them, and they become unpredictable
  if the guest FPU mode is changed. MIPS SIMD Architecture (MSA) vector
  registers (see KVM_REG_MIPS_VEC_128() above) have similar patterns as they
-overlap the FPU registers:
+overlap the FPU registers::
+
    0x7020 0000 0003 00 <0:3> <reg:5> (32-bit FPU registers)
    0x7030 0000 0003 00 <0:3> <reg:5> (64-bit FPU registers)
    0x7040 0000 0003 00 <0:3> <reg:5> (128-bit MSA vector registers)
  
  MIPS FPU control registers (see KVM_REG_MIPS_FCR_{IR,CSR} above) have the
-following id bit patterns:
+following id bit patterns::
+
    0x7020 0000 0003 01 <0:3> <reg:5>
  
  MIPS MSA control registers (see KVM_REG_MIPS_MSA_{IR,CSR} above) have the
-following id bit patterns:
+following id bit patterns::
+
    0x7020 0000 0003 02 <0:3> <reg:5>
  
  
  4.69 KVM_GET_ONE_REG
+--------------------
+
+:Capability: KVM_CAP_ONE_REG
+:Architectures: all
+:Type: vcpu ioctl
+:Parameters: struct kvm_one_reg (in and out)
+:Returns: 0 on success, negative value on failure
  
-Capability: KVM_CAP_ONE_REG
-Architectures: all
-Type: vcpu ioctl
-Parameters: struct kvm_one_reg (in and out)
-Returns: 0 on success, negative value on failure
  Errors include:
-  ENOENT:   no such register
-  EINVAL:   invalid register ID, or no such register
-  EPERM:    (arm64) register access not allowed before vcpu finalization
+
+  ======== ============================================================
+  ENOENT   no such register
+  EINVAL   invalid register ID, or no such register
+  EPERM    (arm64) register access not allowed before vcpu finalization
+  ======== ============================================================
+
  (These error codes are indicative only: do not rely on a specific error
  code being returned in a specific situation.)
  
@@ -2335,12 +2569,13 @@ list in 4.68.
  
  
  4.70 KVM_KVMCLOCK_CTRL
+----------------------
  
-Capability: KVM_CAP_KVMCLOCK_CTRL
-Architectures: Any that implement pvclocks (currently x86 only)
-Type: vcpu ioctl
-Parameters: None
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_KVMCLOCK_CTRL
+:Architectures: Any that implement pvclocks (currently x86 only)
+:Type: vcpu ioctl
+:Parameters: None
+:Returns: 0 on success, -1 on error
  
  This signals to the host kernel that the specified guest is being paused by
  userspace.  The host will set a flag in the pvclock structure that is checked
@@ -2356,26 +2591,30 @@ after pausing the vcpu, but before it is resumed.
  
  
  4.71 KVM_SIGNAL_MSI
+-------------------
  
-Capability: KVM_CAP_SIGNAL_MSI
-Architectures: x86 arm arm64
-Type: vm ioctl
-Parameters: struct kvm_msi (in)
-Returns: >0 on delivery, 0 if guest blocked the MSI, and -1 on error
+:Capability: KVM_CAP_SIGNAL_MSI
+:Architectures: x86 arm arm64
+:Type: vm ioctl
+:Parameters: struct kvm_msi (in)
+:Returns: >0 on delivery, 0 if guest blocked the MSI, and -1 on error
  
  Directly inject a MSI message. Only valid with in-kernel irqchip that handles
  MSI messages.
  
-struct kvm_msi {
+::
+
+  struct kvm_msi {
         __u32 address_lo;
         __u32 address_hi;
         __u32 data;
         __u32 flags;
         __u32 devid;
         __u8  pad[12];
-};
+  };
  
-flags: KVM_MSI_VALID_DEVID: devid contains a valid value.  The per-VM
+flags:
+  KVM_MSI_VALID_DEVID: devid contains a valid value.  The per-VM
    KVM_CAP_MSI_DEVID capability advertises the requirement to provide
    the device ID.  If this capability is not available, userspace
    should never set the KVM_MSI_VALID_DEVID flag as the ioctl might fail.
@@ -2391,30 +2630,31 @@ address_hi must be zero.
  
  
  4.71 KVM_CREATE_PIT2
+--------------------
  
-Capability: KVM_CAP_PIT2
-Architectures: x86
-Type: vm ioctl
-Parameters: struct kvm_pit_config (in)
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_PIT2
+:Architectures: x86
+:Type: vm ioctl
+:Parameters: struct kvm_pit_config (in)
+:Returns: 0 on success, -1 on error
  
  Creates an in-kernel device model for the i8254 PIT. This call is only valid
  after enabling in-kernel irqchip support via KVM_CREATE_IRQCHIP. The following
-parameters have to be passed:
+parameters have to be passed::
  
-struct kvm_pit_config {
+  struct kvm_pit_config {
         __u32 flags;
         __u32 pad[15];
-};
+  };
  
-Valid flags are:
+Valid flags are::
  
-#define KVM_PIT_SPEAKER_DUMMY     1 /* emulate speaker port stub */
+  #define KVM_PIT_SPEAKER_DUMMY     1 /* emulate speaker port stub */
  
  PIT timer interrupts may use a per-VM kernel thread for injection. If it
-exists, this thread will have a name of the following pattern:
+exists, this thread will have a name of the following pattern::
  
-kvm-pit/<owner-process-pid>
+  kvm-pit/<owner-process-pid>
  
  When running a guest with elevated priorities, the scheduling parameters of
  this thread may have to be adjusted accordingly.
@@ -2423,37 +2663,39 @@ This IOCTL replaces the obsolete KVM_CREATE_PIT.
  
  
  4.72 KVM_GET_PIT2
+-----------------
  
-Capability: KVM_CAP_PIT_STATE2
-Architectures: x86
-Type: vm ioctl
-Parameters: struct kvm_pit_state2 (out)
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_PIT_STATE2
+:Architectures: x86
+:Type: vm ioctl
+:Parameters: struct kvm_pit_state2 (out)
+:Returns: 0 on success, -1 on error
  
  Retrieves the state of the in-kernel PIT model. Only valid after
-KVM_CREATE_PIT2. The state is returned in the following structure:
+KVM_CREATE_PIT2. The state is returned in the following structure::
  
-struct kvm_pit_state2 {
+  struct kvm_pit_state2 {
         struct kvm_pit_channel_state channels[3];
         __u32 flags;
         __u32 reserved[9];
-};
+  };
  
-Valid flags are:
+Valid flags are::
  
-/* disable PIT in HPET legacy mode */
-#define KVM_PIT_FLAGS_HPET_LEGACY  0x00000001
+  /* disable PIT in HPET legacy mode */
+  #define KVM_PIT_FLAGS_HPET_LEGACY  0x00000001
  
  This IOCTL replaces the obsolete KVM_GET_PIT.
  
  
  4.73 KVM_SET_PIT2
+-----------------
  
-Capability: KVM_CAP_PIT_STATE2
-Architectures: x86
-Type: vm ioctl
-Parameters: struct kvm_pit_state2 (in)
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_PIT_STATE2
+:Architectures: x86
+:Type: vm ioctl
+:Parameters: struct kvm_pit_state2 (in)
+:Returns: 0 on success, -1 on error
  
  Sets the state of the in-kernel PIT model. Only valid after KVM_CREATE_PIT2.
  See KVM_GET_PIT2 for details on struct kvm_pit_state2.
@@ -2462,12 +2704,13 @@ This IOCTL replaces the obsolete KVM_SET_PIT.
  
  
  4.74 KVM_PPC_GET_SMMU_INFO
+--------------------------
  
-Capability: KVM_CAP_PPC_GET_SMMU_INFO
-Architectures: powerpc
-Type: vm ioctl
-Parameters: None
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_PPC_GET_SMMU_INFO
+:Architectures: powerpc
+:Type: vm ioctl
+:Parameters: None
+:Returns: 0 on success, -1 on error
  
  This populates and returns a structure describing the features of
  the "Server" class MMU emulation supported by KVM.
@@ -2475,7 +2718,7 @@ This can in turn be used by userspace to generate the appropriate
  device-tree properties for the guest operating system.
  
  The structure contains some global information, followed by an
-array of supported segment page sizes:
+array of supported segment page sizes::
  
        struct kvm_ppc_smmu_info {
              __u64 flags;
@@ -2503,7 +2746,7 @@ The "slb_size" field indicates how many SLB entries are supported
  
  The "sps" array contains 8 entries indicating the supported base
  page sizes for a segment in increasing order. Each entry is defined
-as follow:
+as follows::
  
     struct kvm_ppc_one_seg_page_size {
         __u32 page_shift;       /* Base page shift of segment (or 0) */
@@ -2524,7 +2767,7 @@ size provides the list of supported actual page sizes (which can be
  only larger or equal to the base page size), along with the
  corresponding encoding in the hash PTE. Similarly, the array is
  8 entries sorted by increasing sizes and an entry with a "0" shift
-is an empty entry and a terminator:
+is an empty entry and a terminator::
  
     struct kvm_ppc_one_page_size {
         __u32 page_shift;       /* Page shift (or 0) */
@@ -2536,12 +2779,13 @@ PTE's RPN field (ie, it needs to be shifted left by 12 to OR it
  into the hash PTE second double word).
  
  4.75 KVM_IRQFD
+--------------
  
-Capability: KVM_CAP_IRQFD
-Architectures: x86 s390 arm arm64
-Type: vm ioctl
-Parameters: struct kvm_irqfd (in)
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_IRQFD
+:Architectures: x86 s390 arm arm64
+:Type: vm ioctl
+:Parameters: struct kvm_irqfd (in)
+:Returns: 0 on success, -1 on error
  
  Allows setting an eventfd to directly trigger a guest interrupt.
  kvm_irqfd.fd specifies the file descriptor to use as the eventfd and
@@ -2565,6 +2809,7 @@ irqfd.  The KVM_IRQFD_FLAG_RESAMPLE is only necessary on assignment
  and need not be specified with KVM_IRQFD_FLAG_DEASSIGN.
  
  On arm/arm64, gsi routing being supported, the following can happen:
+
  - in case no routing entry is associated to this gsi, injection fails
  - in case the gsi is associated to an irqchip routing entry,
    irqchip.pin + 32 corresponds to the injected SPI ID.
@@ -2573,12 +2818,13 @@ On arm/arm64, gsi routing being supported, the following can happen:
    to GICv3 ITS in-kernel emulation).
  
  4.76 KVM_PPC_ALLOCATE_HTAB
+--------------------------
  
-Capability: KVM_CAP_PPC_ALLOC_HTAB
-Architectures: powerpc
-Type: vm ioctl
-Parameters: Pointer to u32 containing hash table order (in/out)
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_PPC_ALLOC_HTAB
+:Architectures: powerpc
+:Type: vm ioctl
+:Parameters: Pointer to u32 containing hash table order (in/out)
+:Returns: 0 on success, -1 on error
  
  This requests the host kernel to allocate an MMU hash table for a
  guest using the PAPR paravirtualization interface.  This only does
@@ -2609,75 +2855,88 @@ real-mode area (VRMA) facility, the kernel will re-create the VMRA
  HPTEs on the next KVM_RUN of any vcpu.
  
  4.77 KVM_S390_INTERRUPT
+-----------------------
  
-Capability: basic
-Architectures: s390
-Type: vm ioctl, vcpu ioctl
-Parameters: struct kvm_s390_interrupt (in)
-Returns: 0 on success, -1 on error
+:Capability: basic
+:Architectures: s390
+:Type: vm ioctl, vcpu ioctl
+:Parameters: struct kvm_s390_interrupt (in)
+:Returns: 0 on success, -1 on error
  
  Allows to inject an interrupt to the guest. Interrupts can be floating
  (vm ioctl) or per cpu (vcpu ioctl), depending on the interrupt type.
  
-Interrupt parameters are passed via kvm_s390_interrupt:
+Interrupt parameters are passed via kvm_s390_interrupt::
  
-struct kvm_s390_interrupt {
+  struct kvm_s390_interrupt {
         __u32 type;
         __u32 parm;
         __u64 parm64;
-};
+  };
  
  type can be one of the following:
  
-KVM_S390_SIGP_STOP (vcpu) - sigp stop; optional flags in parm
-KVM_S390_PROGRAM_INT (vcpu) - program check; code in parm
-KVM_S390_SIGP_SET_PREFIX (vcpu) - sigp set prefix; prefix address in parm
-KVM_S390_RESTART (vcpu) - restart
-KVM_S390_INT_CLOCK_COMP (vcpu) - clock comparator interrupt
-KVM_S390_INT_CPU_TIMER (vcpu) - CPU timer interrupt
-KVM_S390_INT_VIRTIO (vm) - virtio external interrupt; external interrupt
-                          parameters in parm and parm64
-KVM_S390_INT_SERVICE (vm) - sclp external interrupt; sclp parameter in parm
-KVM_S390_INT_EMERGENCY (vcpu) - sigp emergency; source cpu in parm
-KVM_S390_INT_EXTERNAL_CALL (vcpu) - sigp external call; source cpu in parm
-KVM_S390_INT_IO(ai,cssid,ssid,schid) (vm) - compound value to indicate an
-    I/O interrupt (ai - adapter interrupt; cssid,ssid,schid - subchannel);
-    I/O interruption parameters in parm (subchannel) and parm64 (intparm,
-    interruption subclass)
-KVM_S390_MCHK (vm, vcpu) - machine check interrupt; cr 14 bits in parm,
-                           machine check interrupt code in parm64 (note that
-                           machine checks needing further payload are not
-                           supported by this ioctl)
+KVM_S390_SIGP_STOP (vcpu)
+    - sigp stop; optional flags in parm
+KVM_S390_PROGRAM_INT (vcpu)
+    - program check; code in parm
+KVM_S390_SIGP_SET_PREFIX (vcpu)
+    - sigp set prefix; prefix address in parm
+KVM_S390_RESTART (vcpu)
+    - restart
+KVM_S390_INT_CLOCK_COMP (vcpu)
+    - clock comparator interrupt
+KVM_S390_INT_CPU_TIMER (vcpu)
+    - CPU timer interrupt
+KVM_S390_INT_VIRTIO (vm)
+    - virtio external interrupt; external interrupt
+      parameters in parm and parm64
+KVM_S390_INT_SERVICE (vm)
+    - sclp external interrupt; sclp parameter in parm
+KVM_S390_INT_EMERGENCY (vcpu)
+    - sigp emergency; source cpu in parm
+KVM_S390_INT_EXTERNAL_CALL (vcpu)
+    - sigp external call; source cpu in parm
+KVM_S390_INT_IO(ai,cssid,ssid,schid) (vm)
+    - compound value to indicate an
+      I/O interrupt (ai - adapter interrupt; cssid,ssid,schid - subchannel);
+      I/O interruption parameters in parm (subchannel) and parm64 (intparm,
+      interruption subclass)
+KVM_S390_MCHK (vm, vcpu)
+    - machine check interrupt; cr 14 bits in parm, machine check interrupt
+      code in parm64 (note that machine checks needing further payload are not
+      supported by this ioctl)
  
  This is an asynchronous vcpu ioctl and can be invoked from any thread.
  
  4.78 KVM_PPC_GET_HTAB_FD
+------------------------
  
-Capability: KVM_CAP_PPC_HTAB_FD
-Architectures: powerpc
-Type: vm ioctl
-Parameters: Pointer to struct kvm_get_htab_fd (in)
-Returns: file descriptor number (>= 0) on success, -1 on error
+:Capability: KVM_CAP_PPC_HTAB_FD
+:Architectures: powerpc
+:Type: vm ioctl
+:Parameters: Pointer to struct kvm_get_htab_fd (in)
+:Returns: file descriptor number (>= 0) on success, -1 on error
  
  This returns a file descriptor that can be used either to read out the
  entries in the guest's hashed page table (HPT), or to write entries to
  initialize the HPT.  The returned fd can only be written to if the
  KVM_GET_HTAB_WRITE bit is set in the flags field of the argument, and
  can only be read if that bit is clear.  The argument struct looks like
-this:
+this::
  
-/* For KVM_PPC_GET_HTAB_FD */
-struct kvm_get_htab_fd {
+  /* For KVM_PPC_GET_HTAB_FD */
+  struct kvm_get_htab_fd {
         __u64   flags;
         __u64   start_index;
         __u64   reserved[2];
-};
+  };
  
-/* Values for kvm_get_htab_fd.flags */
-#define KVM_GET_HTAB_BOLTED_ONLY       ((__u64)0x1)
-#define KVM_GET_HTAB_WRITE             ((__u64)0x2)
+  /* Values for kvm_get_htab_fd.flags */
+  #define KVM_GET_HTAB_BOLTED_ONLY     ((__u64)0x1)
+  #define KVM_GET_HTAB_WRITE           ((__u64)0x2)
  
-The `start_index' field gives the index in the HPT of the entry at
+The 'start_index' field gives the index in the HPT of the entry at
  which to start reading.  It is ignored when writing.
  
  Reads on the fd will initially supply information about all
@@ -2692,29 +2951,34 @@ Data read or written is structured as a header (8 bytes) followed by a
  series of valid HPT entries (16 bytes) each.  The header indicates how
  many valid HPT entries there are and how many invalid entries follow
  the valid entries.  The invalid entries are not represented explicitly
-in the stream.  The header format is:
+in the stream.  The header format is::
  
-struct kvm_get_htab_header {
+  struct kvm_get_htab_header {
         __u32   index;
         __u16   n_valid;
         __u16   n_invalid;
-};
+  };
  
  Writes to the fd create HPT entries starting at the index given in the
-header; first `n_valid' valid entries with contents from the data
-written, then `n_invalid' invalid entries, invalidating any previously
+header; first 'n_valid' valid entries with contents from the data
+written, then 'n_invalid' invalid entries, invalidating any previously
  valid entries found.
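
Editor's note on the stream layout described above: a minimal sketch, assuming a local mirror of struct kvm_get_htab_header built from <stdint.h> types so it compiles without kernel headers. The names htab_header, HPTE_BYTES, htab_record_bytes and htab_next_index are hypothetical and not part of the KVM API.

```c
#include <stddef.h>
#include <stdint.h>

/* Local mirror of struct kvm_get_htab_header (assumption: same field
 * order and widths as shown in the documentation above). */
struct htab_header {
	uint32_t index;     /* index of the first HPT entry described */
	uint16_t n_valid;   /* valid entries that follow the header */
	uint16_t n_invalid; /* invalid entries, not present in the stream */
};

#define HPTE_BYTES 16u /* size of one valid HPT entry in the stream */

/* Bytes occupied in the stream by one header plus its valid entries. */
static size_t htab_record_bytes(const struct htab_header *hdr)
{
	return sizeof(*hdr) + (size_t)hdr->n_valid * HPTE_BYTES;
}

/* Index of the first HPT entry after the run covered by this header. */
static uint32_t htab_next_index(const struct htab_header *hdr)
{
	return hdr->index + hdr->n_valid + hdr->n_invalid;
}
```

A reader of the fd could use htab_record_bytes() to step from one header to the next. The invalid entries advance the index but take up no bytes.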
  
  4.79 KVM_CREATE_DEVICE
+----------------------
+
+:Capability: KVM_CAP_DEVICE_CTRL
+:Type: vm ioctl
+:Parameters: struct kvm_create_device (in/out)
+:Returns: 0 on success, -1 on error
  
-Capability: KVM_CAP_DEVICE_CTRL
-Type: vm ioctl
-Parameters: struct kvm_create_device (in/out)
-Returns: 0 on success, -1 on error
  Errors:
-  ENODEV: The device type is unknown or unsupported
-  EEXIST: Device already created, and this type of device may not
+
+  ======  =======================================================
+  ENODEV  The device type is unknown or unsupported
+  EEXIST  Device already created, and this type of device may not
            be instantiated multiple times
+  ======  =======================================================
  
    Other error conditions may be defined by individual device types or
    have their standard meanings.
@@ -2730,25 +2994,32 @@ Individual devices should not define flags.  Attributes should be used
  for specifying any behavior that is not implied by the device type
  number.
  
-struct kvm_create_device {
+::
+
+  struct kvm_create_device {
         __u32   type;   /* in: KVM_DEV_TYPE_xxx */
         __u32   fd;     /* out: device handle */
         __u32   flags;  /* in: KVM_CREATE_DEVICE_xxx */
-};
+  };
  
  4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR
+--------------------------------------------
+
+:Capability: KVM_CAP_DEVICE_CTRL, KVM_CAP_VM_ATTRIBUTES for vm device,
+             KVM_CAP_VCPU_ATTRIBUTES for vcpu device
+:Type: device ioctl, vm ioctl, vcpu ioctl
+:Parameters: struct kvm_device_attr
+:Returns: 0 on success, -1 on error
  
-Capability: KVM_CAP_DEVICE_CTRL, KVM_CAP_VM_ATTRIBUTES for vm device,
-  KVM_CAP_VCPU_ATTRIBUTES for vcpu device
-Type: device ioctl, vm ioctl, vcpu ioctl
-Parameters: struct kvm_device_attr
-Returns: 0 on success, -1 on error
  Errors:
-  ENXIO:  The group or attribute is unknown/unsupported for this device
+
+  =====   =============================================================
+  ENXIO   The group or attribute is unknown/unsupported for this device
            or hardware support is missing.
-  EPERM:  The attribute cannot (currently) be accessed this way
+  EPERM   The attribute cannot (currently) be accessed this way
            (e.g. read-only attribute, or attribute that only makes
            sense when the device is in a different state)
+  =====   =============================================================
  
    Other error conditions may be defined by individual device types.
  
@@ -2757,23 +3028,30 @@ semantics are device-specific.  See individual device documentation in
  the "devices" directory.  As with ONE_REG, the size of the data
  transferred is defined by the particular attribute.
  
-struct kvm_device_attr {
+::
+
+  struct kvm_device_attr {
         __u32   flags;          /* no flags currently defined */
         __u32   group;          /* device-defined */
         __u64   attr;           /* group-defined */
         __u64   addr;           /* userspace address of attr data */
-};
+  };
  
  4.81 KVM_HAS_DEVICE_ATTR
+------------------------
+
+:Capability: KVM_CAP_DEVICE_CTRL, KVM_CAP_VM_ATTRIBUTES for vm device,
+            KVM_CAP_VCPU_ATTRIBUTES for vcpu device
+:Type: device ioctl, vm ioctl, vcpu ioctl
+:Parameters: struct kvm_device_attr
+:Returns: 0 on success, -1 on error
  
-Capability: KVM_CAP_DEVICE_CTRL, KVM_CAP_VM_ATTRIBUTES for vm device,
-  KVM_CAP_VCPU_ATTRIBUTES for vcpu device
-Type: device ioctl, vm ioctl, vcpu ioctl
-Parameters: struct kvm_device_attr
-Returns: 0 on success, -1 on error
  Errors:
-  ENXIO:  The group or attribute is unknown/unsupported for this device
+
+  =====   =============================================================
+  ENXIO   The group or attribute is unknown/unsupported for this device
            or hardware support is missing.
+  =====   =============================================================
  
  Tests whether a device supports a particular attribute.  A successful
  return indicates the attribute is implemented.  It does not necessarily
@@ -2781,15 +3059,20 @@ indicate that the attribute can be read or written in the device's
  current state.  "addr" is ignored.
  
  4.82 KVM_ARM_VCPU_INIT
+----------------------
+
+:Capability: basic
+:Architectures: arm, arm64
+:Type: vcpu ioctl
+:Parameters: struct kvm_vcpu_init (in)
+:Returns: 0 on success; -1 on error
  
-Capability: basic
-Architectures: arm, arm64
-Type: vcpu ioctl
-Parameters: struct kvm_vcpu_init (in)
-Returns: 0 on success; -1 on error
  Errors:
-  EINVAL:    the target is unknown, or the combination of features is invalid.
-  ENOENT:    a features bit specified is unknown.
+
+  ======     =================================================================
+  EINVAL     the target is unknown, or the combination of features is invalid.
+  ENOENT     a features bit specified is unknown.
+  ======     =================================================================
  
  This tells KVM what type of CPU to present to the guest, and what
  optional features it should have.  This will cause a reset of the cpu
@@ -2805,6 +3088,7 @@ state. All calls to this function after the initial call must use the same
  target and same set of feature flags, otherwise EINVAL will be returned.
  
  Possible features:
+
         - KVM_ARM_VCPU_POWER_OFF: Starts the CPU in a power-off state.
           Depends on KVM_CAP_ARM_PSCI.  If not set, the CPU will be powered on
           and execute guest code when KVM_RUN is called.
@@ -2861,14 +3145,19 @@ Possible features:
                 no longer be written using KVM_SET_ONE_REG.
  
  4.83 KVM_ARM_PREFERRED_TARGET
+-----------------------------
+
+:Capability: basic
+:Architectures: arm, arm64
+:Type: vm ioctl
+:Parameters: struct kvm_vcpu_init (out)
+:Returns: 0 on success; -1 on error
  
-Capability: basic
-Architectures: arm, arm64
-Type: vm ioctl
-Parameters: struct struct kvm_vcpu_init (out)
-Returns: 0 on success; -1 on error
  Errors:
-  ENODEV:    no preferred target available for the host
+
+  ======     ==========================================
+  ENODEV     no preferred target available for the host
+  ======     ==========================================
  
  This queries KVM for preferred CPU target type which can be emulated
  by KVM on underlying host.
@@ -2885,43 +3174,57 @@ in VCPU matching underlying host.
  
  
  4.84 KVM_GET_REG_LIST
+---------------------
+
+:Capability: basic
+:Architectures: arm, arm64, mips
+:Type: vcpu ioctl
+:Parameters: struct kvm_reg_list (in/out)
+:Returns: 0 on success; -1 on error
  
-Capability: basic
-Architectures: arm, arm64, mips
-Type: vcpu ioctl
-Parameters: struct kvm_reg_list (in/out)
-Returns: 0 on success; -1 on error
  Errors:
-  E2BIG:     the reg index list is too big to fit in the array specified by
+
+  =====      ==============================================================
+  E2BIG      the reg index list is too big to fit in the array specified by
               the user (the number required will be written into n).
+  =====      ==============================================================
+
+::
  
-struct kvm_reg_list {
+  struct kvm_reg_list {
         __u64 n; /* number of registers in reg[] */
         __u64 reg[0];
-};
+  };
  
  This ioctl returns the guest registers that are supported for the
  KVM_GET_ONE_REG/KVM_SET_ONE_REG calls.
  
  
  4.85 KVM_ARM_SET_DEVICE_ADDR (deprecated)
+-----------------------------------------
+
+:Capability: KVM_CAP_ARM_SET_DEVICE_ADDR
+:Architectures: arm, arm64
+:Type: vm ioctl
+:Parameters: struct kvm_arm_device_address (in)
+:Returns: 0 on success, -1 on error
  
-Capability: KVM_CAP_ARM_SET_DEVICE_ADDR
-Architectures: arm, arm64
-Type: vm ioctl
-Parameters: struct kvm_arm_device_address (in)
-Returns: 0 on success, -1 on error
  Errors:
-  ENODEV: The device id is unknown
-  ENXIO:  Device not supported on current system
-  EEXIST: Address already set
-  E2BIG:  Address outside guest physical address space
-  EBUSY:  Address overlaps with other device range
  
-struct kvm_arm_device_addr {
+  ======  ============================================
+  ENODEV  The device id is unknown
+  ENXIO   Device not supported on current system
+  EEXIST  Address already set
+  E2BIG   Address outside guest physical address space
+  EBUSY   Address overlaps with other device range
+  ======  ============================================
+
+::
+
+  struct kvm_arm_device_addr {
         __u64 id;
         __u64 addr;
-};
+  };
  
  Specify a device address in the guest's physical address space where guests
  can access emulated or directly exposed devices, which the host kernel needs
@@ -2929,7 +3232,7 @@ to know about. The id field is an architecture specific identifier for a
  specific device.
  
  ARM/arm64 divides the id field into two parts, a device id and an
-address type id specific to the individual device.
+address type id specific to the individual device::
  
    bits:  | 63        ...       32 | 31    ...    16 | 15    ...    0 |
    field: |        0x00000000      |     device id   |  addr type id  |
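
Editor's note on the bit layout above: a minimal sketch of packing and unpacking the id field. The helper names are hypothetical; only the bit positions (device id in bits 31..16, address type id in bits 15..0, upper 32 bits zero) come from the table.

```c
#include <stdint.h>

/* Build the 64-bit id value from a device id and an address type id. */
static uint64_t arm_device_addr_id(uint16_t device_id, uint16_t addr_type_id)
{
	return ((uint64_t)device_id << 16) | addr_type_id;
}

/* Extract the device id (bits 31..16). */
static uint16_t arm_device_id(uint64_t id)
{
	return (uint16_t)((id >> 16) & 0xffff);
}

/* Extract the address type id (bits 15..0). */
static uint16_t arm_addr_type_id(uint64_t id)
{
	return (uint16_t)(id & 0xffff);
}
```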
@@ -2947,12 +3250,13 @@ should be used instead.
  
  
  4.86 KVM_PPC_RTAS_DEFINE_TOKEN
+------------------------------
  
-Capability: KVM_CAP_PPC_RTAS
-Architectures: ppc
-Type: vm ioctl
-Parameters: struct kvm_rtas_token_args
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_PPC_RTAS
+:Architectures: ppc
+:Type: vm ioctl
+:Parameters: struct kvm_rtas_token_args
+:Returns: 0 on success, -1 on error
  
  Defines a token value for a RTAS (Run Time Abstraction Services)
  service in order to allow it to be handled in the kernel.  The
@@ -2966,18 +3270,21 @@ calls by the guest for that service will be passed to userspace to be
  handled.
  
  4.87 KVM_SET_GUEST_DEBUG
+------------------------
  
-Capability: KVM_CAP_SET_GUEST_DEBUG
-Architectures: x86, s390, ppc, arm64
-Type: vcpu ioctl
-Parameters: struct kvm_guest_debug (in)
-Returns: 0 on success; -1 on error
+:Capability: KVM_CAP_SET_GUEST_DEBUG
+:Architectures: x86, s390, ppc, arm64
+:Type: vcpu ioctl
+:Parameters: struct kvm_guest_debug (in)
+:Returns: 0 on success; -1 on error
  
-struct kvm_guest_debug {
+::
+
+  struct kvm_guest_debug {
         __u32 control;
         __u32 pad;
         struct kvm_guest_debug_arch arch;
-};
+  };
  
  Set up the processor specific debug registers and configure vcpu for
  handling guest debug events. There are two parts to the structure, the
@@ -3019,26 +3326,31 @@ KVM_EXIT_DEBUG with the kvm_debug_exit_arch part of the kvm_run
  structure containing architecture specific debug information.
  
  4.88 KVM_GET_EMULATED_CPUID
+---------------------------
+
+:Capability: KVM_CAP_EXT_EMUL_CPUID
+:Architectures: x86
+:Type: system ioctl
+:Parameters: struct kvm_cpuid2 (in/out)
+:Returns: 0 on success, -1 on error
  
-Capability: KVM_CAP_EXT_EMUL_CPUID
-Architectures: x86
-Type: system ioctl
-Parameters: struct kvm_cpuid2 (in/out)
-Returns: 0 on success, -1 on error
+::
  
-struct kvm_cpuid2 {
+  struct kvm_cpuid2 {
         __u32 nent;
         __u32 flags;
         struct kvm_cpuid_entry2 entries[0];
-};
+  };
  
  The member 'flags' is used for passing flags from userspace.
  
-#define KVM_CPUID_FLAG_SIGNIFCANT_INDEX                BIT(0)
-#define KVM_CPUID_FLAG_STATEFUL_FUNC           BIT(1)
-#define KVM_CPUID_FLAG_STATE_READ_NEXT         BIT(2)
+::
  
-struct kvm_cpuid_entry2 {
+  #define KVM_CPUID_FLAG_SIGNIFCANT_INDEX              BIT(0)
+  #define KVM_CPUID_FLAG_STATEFUL_FUNC         BIT(1)
+  #define KVM_CPUID_FLAG_STATE_READ_NEXT               BIT(2)
+
+  struct kvm_cpuid_entry2 {
         __u32 function;
         __u32 index;
         __u32 flags;
@@ -3047,7 +3359,7 @@ struct kvm_cpuid_entry2 {
         __u32 ecx;
         __u32 edx;
         __u32 padding[3];
-};
+  };
  
  This ioctl returns x86 cpuid features which are emulated by
  kvm.Userspace can use the information returned by this ioctl to query
@@ -3072,10 +3384,14 @@ emulated efficiently and thus not included here.
  
  The fields in each entry are defined as follows:
  
-  function: the eax value used to obtain the entry
-  index: the ecx value used to obtain the entry (for entries that are
+  function:
+        the eax value used to obtain the entry
+  index:
+        the ecx value used to obtain the entry (for entries that are
           affected by ecx)
-  flags: an OR of zero or more of the following:
+  flags:
+    an OR of zero or more of the following:
+
          KVM_CPUID_FLAG_SIGNIFCANT_INDEX:
             if the index field is valid
          KVM_CPUID_FLAG_STATEFUL_FUNC:
@@ -3085,24 +3401,28 @@ The fields in each entry are defined as follows:
          KVM_CPUID_FLAG_STATE_READ_NEXT:
             for KVM_CPUID_FLAG_STATEFUL_FUNC entries, set if this entry is
             the first entry to be read by a cpu
-   eax, ebx, ecx, edx: the values returned by the cpuid instruction for
+
+   eax, ebx, ecx, edx:
+
+         the values returned by the cpuid instruction for
           this function/index combination
  
  4.89 KVM_S390_MEM_OP
+--------------------
  
-Capability: KVM_CAP_S390_MEM_OP
-Architectures: s390
-Type: vcpu ioctl
-Parameters: struct kvm_s390_mem_op (in)
-Returns: = 0 on success,
-         < 0 on generic error (e.g. -EFAULT or -ENOMEM),
-         > 0 if an exception occurred while walking the page tables
+:Capability: KVM_CAP_S390_MEM_OP
+:Architectures: s390
+:Type: vcpu ioctl
+:Parameters: struct kvm_s390_mem_op (in)
+:Returns: = 0 on success,
+          < 0 on generic error (e.g. -EFAULT or -ENOMEM),
+          > 0 if an exception occurred while walking the page tables
  
  Read or write data from/to the logical (virtual) memory of a VCPU.
  
-Parameters are specified via the following structure:
+Parameters are specified via the following structure::
  
-struct kvm_s390_mem_op {
+  struct kvm_s390_mem_op {
         __u64 gaddr;            /* the guest address */
         __u64 flags;            /* flags */
         __u32 size;             /* amount of bytes */
@@ -3110,7 +3430,7 @@ struct kvm_s390_mem_op {
         __u64 buf;              /* buffer in userspace */
         __u8 ar;                /* the access register number */
         __u8 reserved[31];      /* should be set to 0 */
-};
+  };
  
  The type of operation is specified in the "op" field. It is either
  KVM_S390_MEMOP_LOGICAL_READ for reading from logical memory space or
@@ -3137,24 +3457,25 @@ The "reserved" field is meant for future extensions. It is not used by
  KVM with the currently defined set of flags.
  
  4.90 KVM_S390_GET_SKEYS
+-----------------------
  
-Capability: KVM_CAP_S390_SKEYS
-Architectures: s390
-Type: vm ioctl
-Parameters: struct kvm_s390_skeys
-Returns: 0 on success, KVM_S390_GET_KEYS_NONE if guest is not using storage
-         keys, negative value on error
+:Capability: KVM_CAP_S390_SKEYS
+:Architectures: s390
+:Type: vm ioctl
+:Parameters: struct kvm_s390_skeys
+:Returns: 0 on success, KVM_S390_GET_KEYS_NONE if guest is not using storage
+          keys, negative value on error
  
  This ioctl is used to get guest storage key values on the s390
-architecture. The ioctl takes parameters via the kvm_s390_skeys struct.
+architecture. The ioctl takes parameters via the kvm_s390_skeys struct::
  
-struct kvm_s390_skeys {
+  struct kvm_s390_skeys {
         __u64 start_gfn;
         __u64 count;
         __u64 skeydata_addr;
         __u32 flags;
         __u32 reserved[9];
-};
+  };
  
  The start_gfn field is the number of the first guest frame whose storage keys
  you want to get.
@@ -3168,12 +3489,13 @@ The skeydata_addr field is the address to a buffer large enough to hold count
  bytes. This buffer will be filled with storage key data by the ioctl.
  
  4.91 KVM_S390_SET_SKEYS
+-----------------------
  
-Capability: KVM_CAP_S390_SKEYS
-Architectures: s390
-Type: vm ioctl
-Parameters: struct kvm_s390_skeys
-Returns: 0 on success, negative value on error
+:Capability: KVM_CAP_S390_SKEYS
+:Architectures: s390
+:Type: vm ioctl
+:Parameters: struct kvm_s390_skeys
+:Returns: 0 on success, negative value on error
  
  This ioctl is used to set guest storage key values on the s390
  architecture. The ioctl takes parameters via the kvm_s390_skeys struct.
@@ -3195,21 +3517,27 @@ Note: If any architecturally invalid key value is found in the given data then
  the ioctl will return -EINVAL.
  
  4.92 KVM_S390_IRQ
+-----------------
+
+:Capability: KVM_CAP_S390_INJECT_IRQ
+:Architectures: s390
+:Type: vcpu ioctl
+:Parameters: struct kvm_s390_irq (in)
+:Returns: 0 on success, -1 on error
  
-Capability: KVM_CAP_S390_INJECT_IRQ
-Architectures: s390
-Type: vcpu ioctl
-Parameters: struct kvm_s390_irq (in)
-Returns: 0 on success, -1 on error
  Errors:
-  EINVAL: interrupt type is invalid
-          type is KVM_S390_SIGP_STOP and flag parameter is invalid value
+
+
+  ======  =================================================================
+  EINVAL  interrupt type is invalid
+          type is KVM_S390_SIGP_STOP and flag parameter is invalid value,
            type is KVM_S390_INT_EXTERNAL_CALL and code is bigger
-            than the maximum of VCPUs
-  EBUSY:  type is KVM_S390_SIGP_SET_PREFIX and vcpu is not stopped
-          type is KVM_S390_SIGP_STOP and a stop irq is already pending
+          than the maximum of VCPUs
+  EBUSY   type is KVM_S390_SIGP_SET_PREFIX and vcpu is not stopped,
+          type is KVM_S390_SIGP_STOP and a stop irq is already pending,
            type is KVM_S390_INT_EXTERNAL_CALL and an external call interrupt
-            is already pending
+          is already pending
+  ======  =================================================================
  
  Allows to inject an interrupt to the guest.
  
@@ -3217,9 +3545,9 @@ Using struct kvm_s390_irq as a parameter allows
  to inject additional payload which is not
  possible via KVM_S390_INTERRUPT.
  
-Interrupt parameters are passed via kvm_s390_irq:
+Interrupt parameters are passed via kvm_s390_irq::
  
-struct kvm_s390_irq {
+  struct kvm_s390_irq {
         __u64 type;
         union {
                 struct kvm_s390_io_info io;
@@ -3232,44 +3560,45 @@ struct kvm_s390_irq {
                 struct kvm_s390_mchk_info mchk;
                 char reserved[64];
         } u;
-};
+  };
  
  type can be one of the following:
  
-KVM_S390_SIGP_STOP - sigp stop; parameter in .stop
-KVM_S390_PROGRAM_INT - program check; parameters in .pgm
-KVM_S390_SIGP_SET_PREFIX - sigp set prefix; parameters in .prefix
-KVM_S390_RESTART - restart; no parameters
-KVM_S390_INT_CLOCK_COMP - clock comparator interrupt; no parameters
-KVM_S390_INT_CPU_TIMER - CPU timer interrupt; no parameters
-KVM_S390_INT_EMERGENCY - sigp emergency; parameters in .emerg
-KVM_S390_INT_EXTERNAL_CALL - sigp external call; parameters in .extcall
-KVM_S390_MCHK - machine check interrupt; parameters in .mchk
+- KVM_S390_SIGP_STOP - sigp stop; parameter in .stop
+- KVM_S390_PROGRAM_INT - program check; parameters in .pgm
+- KVM_S390_SIGP_SET_PREFIX - sigp set prefix; parameters in .prefix
+- KVM_S390_RESTART - restart; no parameters
+- KVM_S390_INT_CLOCK_COMP - clock comparator interrupt; no parameters
+- KVM_S390_INT_CPU_TIMER - CPU timer interrupt; no parameters
+- KVM_S390_INT_EMERGENCY - sigp emergency; parameters in .emerg
+- KVM_S390_INT_EXTERNAL_CALL - sigp external call; parameters in .extcall
+- KVM_S390_MCHK - machine check interrupt; parameters in .mchk
  
  This is an asynchronous vcpu ioctl and can be invoked from any thread.
  
  4.94 KVM_S390_GET_IRQ_STATE
+---------------------------
  
-Capability: KVM_CAP_S390_IRQ_STATE
-Architectures: s390
-Type: vcpu ioctl
-Parameters: struct kvm_s390_irq_state (out)
-Returns: >= number of bytes copied into buffer,
-         -EINVAL if buffer size is 0,
-         -ENOBUFS if buffer size is too small to fit all pending interrupts,
-         -EFAULT if the buffer address was invalid
+:Capability: KVM_CAP_S390_IRQ_STATE
+:Architectures: s390
+:Type: vcpu ioctl
+:Parameters: struct kvm_s390_irq_state (out)
+:Returns: >= number of bytes copied into buffer,
+          -EINVAL if buffer size is 0,
+          -ENOBUFS if buffer size is too small to fit all pending interrupts,
+          -EFAULT if the buffer address was invalid
  
  This ioctl allows userspace to retrieve the complete state of all currently
  pending interrupts in a single buffer. Use cases include migration
  and introspection. The parameter structure contains the address of a
-userspace buffer and its length:
+userspace buffer and its length::
  
-struct kvm_s390_irq_state {
+  struct kvm_s390_irq_state {
         __u64 buf;
         __u32 flags;        /* will stay unused for compatibility reasons */
         __u32 len;
         __u32 reserved[4];  /* will stay unused for compatibility reasons */
-};
+  };
  
  Userspace passes in the above struct and for each pending interrupt a
  struct kvm_s390_irq is copied to the provided buffer.
@@ -3283,29 +3612,30 @@ If -ENOBUFS is returned the buffer provided was too small and userspace
  may retry with a bigger buffer.
  
  4.95 KVM_S390_SET_IRQ_STATE
-
-Capability: KVM_CAP_S390_IRQ_STATE
-Architectures: s390
-Type: vcpu ioctl
-Parameters: struct kvm_s390_irq_state (in)
-Returns: 0 on success,
-         -EFAULT if the buffer address was invalid,
-         -EINVAL for an invalid buffer length (see below),
-         -EBUSY if there were already interrupts pending,
-         errors occurring when actually injecting the
+---------------------------
+
+:Capability: KVM_CAP_S390_IRQ_STATE
+:Architectures: s390
+:Type: vcpu ioctl
+:Parameters: struct kvm_s390_irq_state (in)
+:Returns: 0 on success,
+          -EFAULT if the buffer address was invalid,
+          -EINVAL for an invalid buffer length (see below),
+          -EBUSY if there were already interrupts pending,
+          errors occurring when actually injecting the
            interrupt. See KVM_S390_IRQ.
  
  This ioctl allows userspace to set the complete state of all cpu-local
  interrupts currently pending for the vcpu. It is intended for restoring
  interrupt state after a migration. The input parameter is a userspace buffer
-containing a struct kvm_s390_irq_state:
+containing a struct kvm_s390_irq_state::
  
-struct kvm_s390_irq_state {
+  struct kvm_s390_irq_state {
         __u64 buf;
         __u32 flags;        /* will stay unused for compatibility reasons */
         __u32 len;
         __u32 reserved[4];  /* will stay unused for compatibility reasons */
-};
+  };
  
  The restrictions for flags and reserved apply as well.
  (see KVM_S390_GET_IRQ_STATE)
@@ -3320,20 +3650,22 @@ and it must not exceed (max_vcpus + 32) * sizeof(struct kvm_s390_irq),
  which is the maximum number of possibly pending cpu-local interrupts.
  
  4.96 KVM_SMI
+------------
  
-Capability: KVM_CAP_X86_SMM
-Architectures: x86
-Type: vcpu ioctl
-Parameters: none
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_X86_SMM
+:Architectures: x86
+:Type: vcpu ioctl
+:Parameters: none
+:Returns: 0 on success, -1 on error
  
  Queues an SMI on the thread's vcpu.
  
  4.97 KVM_CAP_PPC_MULTITCE
+-------------------------
  
-Capability: KVM_CAP_PPC_MULTITCE
-Architectures: ppc
-Type: vm
+:Capability: KVM_CAP_PPC_MULTITCE
+:Architectures: ppc
+:Type: vm
  
  This capability means the kernel is capable of handling hypercalls
  H_PUT_TCE_INDIRECT and H_STUFF_TCE without passing those into the user
@@ -3355,26 +3687,27 @@ an implementation for these despite the in kernel acceleration.
  This capability is always enabled.
  
  4.98 KVM_CREATE_SPAPR_TCE_64
+----------------------------
  
-Capability: KVM_CAP_SPAPR_TCE_64
-Architectures: powerpc
-Type: vm ioctl
-Parameters: struct kvm_create_spapr_tce_64 (in)
-Returns: file descriptor for manipulating the created TCE table
+:Capability: KVM_CAP_SPAPR_TCE_64
+:Architectures: powerpc
+:Type: vm ioctl
+:Parameters: struct kvm_create_spapr_tce_64 (in)
+:Returns: file descriptor for manipulating the created TCE table
  
  This is an extension for KVM_CAP_SPAPR_TCE which only supports 32bit
  windows, described in 4.62 KVM_CREATE_SPAPR_TCE
  
-This capability uses extended struct in ioctl interface:
+This capability uses extended struct in ioctl interface::
  
-/* for KVM_CAP_SPAPR_TCE_64 */
-struct kvm_create_spapr_tce_64 {
+  /* for KVM_CAP_SPAPR_TCE_64 */
+  struct kvm_create_spapr_tce_64 {
         __u64 liobn;
         __u32 page_shift;
         __u32 flags;
         __u64 offset;   /* in pages */
         __u64 size;     /* in pages */
-};
+  };
  
  The aim of extension is to support an additional bigger DMA window with
  a variable page size.
@@ -3387,12 +3720,13 @@ of IOMMU pages.
  The rest of functionality is identical to KVM_CREATE_SPAPR_TCE.
  
  4.99 KVM_REINJECT_CONTROL
+-------------------------
  
-Capability: KVM_CAP_REINJECT_CONTROL
-Architectures: x86
-Type: vm ioctl
-Parameters: struct kvm_reinject_control (in)
-Returns: 0 on success,
+:Capability: KVM_CAP_REINJECT_CONTROL
+:Architectures: x86
+:Type: vm ioctl
+:Parameters: struct kvm_reinject_control (in)
+:Returns: 0 on success,
           -EFAULT if struct kvm_reinject_control cannot be read,
           -ENXIO if KVM_CREATE_PIT or KVM_CREATE_PIT2 didn't succeed earlier.
  
@@ -3402,21 +3736,24 @@ vector(s) that i8254 injects.  Reinject mode dequeues a tick and injects its
  interrupt whenever there isn't a pending interrupt from i8254.
  !reinject mode injects an interrupt as soon as a tick arrives.
  
-struct kvm_reinject_control {
+::
+
+  struct kvm_reinject_control {
         __u8 pit_reinject;
         __u8 reserved[31];
-};
+  };
  
  pit_reinject = 0 (!reinject mode) is recommended, unless running an old
  operating system that uses the PIT for timing (e.g. Linux 2.4.x).
  
  4.100 KVM_PPC_CONFIGURE_V3_MMU
+------------------------------
  
-Capability: KVM_CAP_PPC_RADIX_MMU or KVM_CAP_PPC_HASH_MMU_V3
-Architectures: ppc
-Type: vm ioctl
-Parameters: struct kvm_ppc_mmuv3_cfg (in)
-Returns: 0 on success,
+:Capability: KVM_CAP_PPC_RADIX_MMU or KVM_CAP_PPC_HASH_MMU_V3
+:Architectures: ppc
+:Type: vm ioctl
+:Parameters: struct kvm_ppc_mmuv3_cfg (in)
+:Returns: 0 on success,
           -EFAULT if struct kvm_ppc_mmuv3_cfg cannot be read,
           -EINVAL if the configuration is invalid
  
@@ -3424,10 +3761,12 @@ This ioctl controls whether the guest will use radix or HPT (hashed
  page table) translation, and sets the pointer to the process table for
  the guest.
  
-struct kvm_ppc_mmuv3_cfg {
+::
+
+  struct kvm_ppc_mmuv3_cfg {
         __u64   flags;
         __u64   process_table;
-};
+  };
  
  There are two bits that can be set in flags; KVM_PPC_MMUV3_RADIX and
  KVM_PPC_MMUV3_GTSE.  KVM_PPC_MMUV3_RADIX, if set, configures the guest
@@ -3442,12 +3781,13 @@ as the second doubleword of the partition table entry, as defined in
  the Power ISA V3.00, Book III section 5.7.6.1.
  
  4.101 KVM_PPC_GET_RMMU_INFO
+---------------------------
  
-Capability: KVM_CAP_PPC_RADIX_MMU
-Architectures: ppc
-Type: vm ioctl
-Parameters: struct kvm_ppc_rmmu_info (out)
-Returns: 0 on success,
+:Capability: KVM_CAP_PPC_RADIX_MMU
+:Architectures: ppc
+:Type: vm ioctl
+:Parameters: struct kvm_ppc_rmmu_info (out)
+:Returns: 0 on success,
          -EFAULT if struct kvm_ppc_rmmu_info cannot be written,
          -EINVAL if no useful information can be returned
  
@@ -3456,14 +3796,16 @@ containing supported radix tree geometries, and (b) a list that maps
  page sizes to put in the "AP" (actual page size) field for the tlbie
  (TLB invalidate entry) instruction.
  
-struct kvm_ppc_rmmu_info {
+::
+
+  struct kvm_ppc_rmmu_info {
         struct kvm_ppc_radix_geom {
                 __u8    page_shift;
                 __u8    level_bits[4];
                 __u8    pad[3];
         }       geometries[8];
         __u32   ap_encodings[8];
-};
+  };
  
  The geometries[] field gives up to 8 supported geometries for the
  radix page table, in terms of the log base 2 of the smallest page
@@ -3476,19 +3818,54 @@ encodings, encoded with the AP value in the top 3 bits and the log
  base 2 of the page size in the bottom 6 bits.
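
Editor's note on the ap_encodings format: a minimal sketch of decoding one 32-bit entry, assuming only the layout stated above (AP value in the top 3 bits, log base 2 of the page size in the bottom 6 bits). The function names are hypothetical.

```c
#include <stdint.h>

/* AP value: top 3 bits of the encoding. */
static unsigned int ap_value(uint32_t enc)
{
	return enc >> 29;
}

/* Log base 2 of the page size: bottom 6 bits of the encoding. */
static unsigned int ap_page_shift(uint32_t enc)
{
	return enc & 0x3f;
}

/* Page size in bytes derived from the shift. */
static uint64_t ap_page_size(uint32_t enc)
{
	return (uint64_t)1 << ap_page_shift(enc);
}
```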
  
  4.102 KVM_PPC_RESIZE_HPT_PREPARE
+--------------------------------
  
-Capability: KVM_CAP_SPAPR_RESIZE_HPT
-Architectures: powerpc
-Type: vm ioctl
-Parameters: struct kvm_ppc_resize_hpt (in)
-Returns: 0 on successful completion,
+:Capability: KVM_CAP_SPAPR_RESIZE_HPT
+:Architectures: powerpc
+:Type: vm ioctl
+:Parameters: struct kvm_ppc_resize_hpt (in)
+:Returns: 0 on successful completion,
          >0 if a new HPT is being prepared, the value is an estimated
-             number of milliseconds until preparation is complete
+         number of milliseconds until preparation is complete,
           -EFAULT if struct kvm_reinject_control cannot be read,
-        -EINVAL if the supplied shift or flags are invalid
-        -ENOMEM if unable to allocate the new HPT
-        -ENOSPC if there was a hash collision when moving existing
-                  HPT entries to the new HPT
+        -EINVAL if the supplied shift or flags are invalid,
+        -ENOMEM if unable to allocate the new HPT,
+        -ENOSPC if there was a hash collision when moving existing
+         HPT entries to the new HPT,
          -EIO on other error conditions
  
  Used to implement the PAPR extension for runtime resizing of a guest's
@@ -3506,6 +3883,7 @@ requested in the parameters, discards the existing pending HPT and
  creates a new one as above.
  
  If called when there is a pending HPT of the size requested, will:
+
    * If preparation of the pending HPT is already complete, return 0
    * If preparation of the pending HPT has failed, return an error
      code, then discard the pending HPT.
@@ -3522,26 +3900,29 @@ Normally this will be called repeatedly with the same parameters until
  it returns <= 0.  The first call will initiate preparation, subsequent
  ones will monitor preparation until it completes or fails.
  
-struct kvm_ppc_resize_hpt {
+::
+
+  struct kvm_ppc_resize_hpt {
         __u64 flags;
         __u32 shift;
         __u32 pad;
-};
+  };
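
The return-value convention for the prepare call (0 when ready, a positive estimate in milliseconds while preparing, negative on failure) lends itself to a small classification step inside the caller's polling loop. This is a minimal sketch; `classify_prepare` and `enum prepare_step` are hypothetical names, and the actual ioctl call and sleep are left to the caller.

```c
/* Hypothetical classification of a KVM_PPC_RESIZE_HPT_PREPARE result. */
enum prepare_step {
	PREPARE_DONE,   /* 0: the pending HPT is fully prepared */
	PREPARE_WAIT,   /* >0: still preparing; value is an estimate in ms */
	PREPARE_FAILED, /* <0: preparation failed */
};

/*
 * Map the raw return value onto the next action.  When the result is
 * PREPARE_WAIT, *wait_ms receives the estimated remaining time.
 */
static enum prepare_step classify_prepare(int ret, int *wait_ms)
{
	if (ret > 0) {
		*wait_ms = ret;
		return PREPARE_WAIT;
	}
	return ret == 0 ? PREPARE_DONE : PREPARE_FAILED;
}
```

A caller would repeat the ioctl with the same parameters while the result is PREPARE_WAIT, sleeping for roughly `*wait_ms` between attempts.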
  
  4.103 KVM_PPC_RESIZE_HPT_COMMIT
+-------------------------------
  
-Capability: KVM_CAP_SPAPR_RESIZE_HPT
-Architectures: powerpc
-Type: vm ioctl
-Parameters: struct kvm_ppc_resize_hpt (in)
-Returns: 0 on successful completion,
+:Capability: KVM_CAP_SPAPR_RESIZE_HPT
+:Architectures: powerpc
+:Type: vm ioctl
+:Parameters: struct kvm_ppc_resize_hpt (in)
+:Returns: 0 on successful completion,
           -EFAULT if struct kvm_ppc_resize_hpt cannot be read,
-        -EINVAL if the supplied shift or flags are invalid
+        -EINVAL if the supplied shift or flags are invalid,
          -ENXIO if there is no pending HPT, or the pending HPT doesn't
-                 have the requested size
-        -EBUSY if the pending HPT is not fully prepared
+         have the requested size,
+        -EBUSY if the pending HPT is not fully prepared,
          -ENOSPC if there was a hash collision when moving existing
-                  HPT entries to the new HPT
+         HPT entries to the new HPT,
          -EIO on other error conditions
  
  Used to implement the PAPR extension for runtime resizing of a guest's
@@ -3564,31 +3945,35 @@ HPT and the previous HPT will be discarded.
  
  On failure, the guest will still be operating on its previous HPT.
  
-struct kvm_ppc_resize_hpt {
+::
+
+  struct kvm_ppc_resize_hpt {
         __u64 flags;
         __u32 shift;
         __u32 pad;
-};
+  };
  
  4.104 KVM_X86_GET_MCE_CAP_SUPPORTED
+-----------------------------------
  
-Capability: KVM_CAP_MCE
-Architectures: x86
-Type: system ioctl
-Parameters: u64 mce_cap (out)
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_MCE
+:Architectures: x86
+:Type: system ioctl
+:Parameters: u64 mce_cap (out)
+:Returns: 0 on success, -1 on error
  
  Returns supported MCE capabilities. The u64 mce_cap parameter
  has the same format as the MSR_IA32_MCG_CAP register. Supported
  capabilities will have the corresponding bits set.
  
  4.105 KVM_X86_SETUP_MCE
+-----------------------
  
-Capability: KVM_CAP_MCE
-Architectures: x86
-Type: vcpu ioctl
-Parameters: u64 mcg_cap (in)
-Returns: 0 on success,
+:Capability: KVM_CAP_MCE
+:Architectures: x86
+:Type: vcpu ioctl
+:Parameters: u64 mcg_cap (in)
+:Returns: 0 on success,
           -EFAULT if u64 mcg_cap cannot be read,
           -EINVAL if the requested number of banks is invalid,
           -EINVAL if requested MCE capability is not supported.
@@ -3601,20 +3986,21 @@ checking for KVM_CAP_MCE. The supported capabilities can be
  retrieved with KVM_X86_GET_MCE_CAP_SUPPORTED.
  
  4.106 KVM_X86_SET_MCE
+---------------------
  
-Capability: KVM_CAP_MCE
-Architectures: x86
-Type: vcpu ioctl
-Parameters: struct kvm_x86_mce (in)
-Returns: 0 on success,
+:Capability: KVM_CAP_MCE
+:Architectures: x86
+:Type: vcpu ioctl
+:Parameters: struct kvm_x86_mce (in)
+:Returns: 0 on success,
           -EFAULT if struct kvm_x86_mce cannot be read,
           -EINVAL if the bank number is invalid,
           -EINVAL if VAL bit is not set in status field.
  
  Inject a machine check error (MCE) into the guest. The input
-parameter is:
+parameter is::
  
-struct kvm_x86_mce {
+  struct kvm_x86_mce {
         __u64 status;
         __u64 addr;
         __u64 misc;
@@ -3622,7 +4008,7 @@ struct kvm_x86_mce {
         __u8 bank;
         __u8 pad1[7];
         __u64 pad2[3];
-};
+  };
  
  If the MCE being reported is an uncorrected error, KVM will
  inject it as an MCE exception into the guest. If the guest
@@ -3634,15 +4020,17 @@ store it in the corresponding bank (provided this bank is
  not holding a previously reported uncorrected error).
  
  4.107 KVM_S390_GET_CMMA_BITS
+----------------------------
  
-Capability: KVM_CAP_S390_CMMA_MIGRATION
-Architectures: s390
-Type: vm ioctl
-Parameters: struct kvm_s390_cmma_log (in, out)
-Returns: 0 on success, a negative value on error
+:Capability: KVM_CAP_S390_CMMA_MIGRATION
+:Architectures: s390
+:Type: vm ioctl
+:Parameters: struct kvm_s390_cmma_log (in, out)
+:Returns: 0 on success, a negative value on error
  
  This ioctl is used to get the values of the CMMA bits on the s390
  architecture. It is meant to be used in two scenarios:
+
  - During live migration to save the CMMA values. Live migration needs
    to be enabled via the KVM_REQ_START_MIGRATION VM property.
  - To non-destructively peek at the CMMA values, with the flag
@@ -3652,9 +4040,12 @@ The ioctl takes parameters via the kvm_s390_cmma_log struct. The desired
  values are written to a buffer whose location is indicated via the "values"
  member in the kvm_s390_cmma_log struct.  The values in the input struct are
  also updated as needed.
+
  Each CMMA value takes up one byte.
  
-struct kvm_s390_cmma_log {
+::
+
+  struct kvm_s390_cmma_log {
         __u64 start_gfn;
         __u32 count;
         __u32 flags;
@@ -3663,7 +4054,7 @@ struct kvm_s390_cmma_log {
                 __u64 mask;
         };
         __u64 values;
-};
+  };
  
  start_gfn is the number of the first guest frame whose CMMA values are
  to be retrieved,
@@ -3724,12 +4115,13 @@ KVM_S390_CMMA_PEEK is not set but migration mode was not enabled, with
  present for the addresses (e.g. when using hugepages).
  
  4.108 KVM_S390_SET_CMMA_BITS
+----------------------------
  
-Capability: KVM_CAP_S390_CMMA_MIGRATION
-Architectures: s390
-Type: vm ioctl
-Parameters: struct kvm_s390_cmma_log (in)
-Returns: 0 on success, a negative value on error
+:Capability: KVM_CAP_S390_CMMA_MIGRATION
+:Architectures: s390
+:Type: vm ioctl
+:Parameters: struct kvm_s390_cmma_log (in)
+:Returns: 0 on success, a negative value on error
  
  This ioctl is used to set the values of the CMMA bits on the s390
  architecture. It is meant to be used during live migration to restore
@@ -3737,16 +4129,18 @@ the CMMA values, but there are no restrictions on its use.
  The ioctl takes parameters via the kvm_s390_cmma_log struct.
  Each CMMA value takes up one byte.
  
-struct kvm_s390_cmma_log {
+::
+
+  struct kvm_s390_cmma_log {
         __u64 start_gfn;
         __u32 count;
         __u32 flags;
         union {
                 __u64 remaining;
                 __u64 mask;
-       };
+       };
         __u64 values;
-};
+  };
  
  start_gfn indicates the starting guest frame number,
  
@@ -3769,26 +4163,27 @@ or if no page table is present for the addresses (e.g. when using
  hugepages).
  
  4.109 KVM_PPC_GET_CPU_CHAR
+--------------------------
  
-Capability: KVM_CAP_PPC_GET_CPU_CHAR
-Architectures: powerpc
-Type: vm ioctl
-Parameters: struct kvm_ppc_cpu_char (out)
-Returns: 0 on successful completion
+:Capability: KVM_CAP_PPC_GET_CPU_CHAR
+:Architectures: powerpc
+:Type: vm ioctl
+:Parameters: struct kvm_ppc_cpu_char (out)
+:Returns: 0 on successful completion,
          -EFAULT if struct kvm_ppc_cpu_char cannot be written
  
  This ioctl gives userspace information about certain characteristics
  of the CPU relating to speculative execution of instructions and
  possible information leakage resulting from speculative execution (see
  CVE-2017-5715, CVE-2017-5753 and CVE-2017-5754).  The information is
-returned in struct kvm_ppc_cpu_char, which looks like this:
+returned in struct kvm_ppc_cpu_char, which looks like this::
  
-struct kvm_ppc_cpu_char {
+  struct kvm_ppc_cpu_char {
         __u64   character;              /* characteristics of the CPU */
         __u64   behaviour;              /* recommended software behaviour */
         __u64   character_mask;         /* valid bits in character */
         __u64   behaviour_mask;         /* valid bits in behaviour */
-};
+  };
  
  For extensibility, the character_mask and behaviour_mask fields
  indicate which bits of character and behaviour have been filled in by
@@ -3815,12 +4210,13 @@ These fields use the same bit definitions as the new
  H_GET_CPU_CHARACTERISTICS hypercall.
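
Because the mask fields say which bits were actually filled in, a consumer probably wants a three-way answer per bit: set, clear, or not reported. The sketch below assumes a local mirror of the structure; `cpu_char_query` and `enum char_state` are illustrative names, and the bit values used by a caller would come from the real header.

```c
#include <stdint.h>

/* Local mirror of struct kvm_ppc_cpu_char, for illustration only. */
struct cpu_char {
	uint64_t character;
	uint64_t behaviour;
	uint64_t character_mask;
	uint64_t behaviour_mask;
};

/* Hypothetical three-way result for one characteristic bit. */
enum char_state { CHAR_UNKNOWN = -1, CHAR_CLEAR = 0, CHAR_SET = 1 };

/* A bit is only meaningful if the corresponding mask bit is set. */
static enum char_state cpu_char_query(const struct cpu_char *c, uint64_t bit)
{
	if (!(c->character_mask & bit))
		return CHAR_UNKNOWN;
	return (c->character & bit) ? CHAR_SET : CHAR_CLEAR;
}
```

The same pattern would apply to behaviour and behaviour_mask.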
  
  4.110 KVM_MEMORY_ENCRYPT_OP
+---------------------------
  
-Capability: basic
-Architectures: x86
-Type: system
-Parameters: an opaque platform specific structure (in/out)
-Returns: 0 on success; -1 on error
+:Capability: basic
+:Architectures: x86
+:Type: system
+:Parameters: an opaque platform specific structure (in/out)
+:Returns: 0 on success; -1 on error
  
  If the platform supports creating encrypted VMs then this ioctl can be used
  for issuing platform-specific memory encryption commands to manage those
@@ -3831,12 +4227,13 @@ Currently, this ioctl is used for issuing Secure Encrypted Virtualization
  Documentation/virt/kvm/amd-memory-encryption.rst.
  
  4.111 KVM_MEMORY_ENCRYPT_REG_REGION
+-----------------------------------
  
-Capability: basic
-Architectures: x86
-Type: system
-Parameters: struct kvm_enc_region (in)
-Returns: 0 on success; -1 on error
+:Capability: basic
+:Architectures: x86
+:Type: system
+:Parameters: struct kvm_enc_region (in)
+:Returns: 0 on success; -1 on error
  
  This ioctl can be used to register a guest memory region which may
  contain encrypted data (e.g. guest RAM, SMRAM etc).
@@ -3854,60 +4251,71 @@ swap or migrate (move) ciphertext pages. Hence, for now we pin the guest
  memory region registered with the ioctl.
  
  4.112 KVM_MEMORY_ENCRYPT_UNREG_REGION
+-------------------------------------
  
-Capability: basic
-Architectures: x86
-Type: system
-Parameters: struct kvm_enc_region (in)
-Returns: 0 on success; -1 on error
+:Capability: basic
+:Architectures: x86
+:Type: system
+:Parameters: struct kvm_enc_region (in)
+:Returns: 0 on success; -1 on error
  
  This ioctl can be used to unregister the guest memory region registered
  with KVM_MEMORY_ENCRYPT_REG_REGION ioctl above.
  
  4.113 KVM_HYPERV_EVENTFD
+------------------------
  
-Capability: KVM_CAP_HYPERV_EVENTFD
-Architectures: x86
-Type: vm ioctl
-Parameters: struct kvm_hyperv_eventfd (in)
+:Capability: KVM_CAP_HYPERV_EVENTFD
+:Architectures: x86
+:Type: vm ioctl
+:Parameters: struct kvm_hyperv_eventfd (in)
  
  This ioctl (un)registers an eventfd to receive notifications from the guest on
  the specified Hyper-V connection id through the SIGNAL_EVENT hypercall, without
  causing a user exit.  SIGNAL_EVENT hypercall with non-zero event flag number
  (bits 24-31) still triggers a KVM_EXIT_HYPERV_HCALL user exit.
  
-struct kvm_hyperv_eventfd {
+::
+
+  struct kvm_hyperv_eventfd {
         __u32 conn_id;
         __s32 fd;
         __u32 flags;
         __u32 padding[3];
-};
+  };
  
-The conn_id field should fit within 24 bits:
+The conn_id field should fit within 24 bits::
  
-#define KVM_HYPERV_CONN_ID_MASK                0x00ffffff
+  #define KVM_HYPERV_CONN_ID_MASK              0x00ffffff
  
-The acceptable values for the flags field are:
+The acceptable values for the flags field are::
  
-#define KVM_HYPERV_EVENTFD_DEASSIGN    (1 << 0)
+  #define KVM_HYPERV_EVENTFD_DEASSIGN  (1 << 0)
  
-Returns: 0 on success,
-       -EINVAL if conn_id or flags is outside the allowed range
-       -ENOENT on deassign if the conn_id isn't registered
-       -EEXIST on assign if the conn_id is already registered
+:Returns: 0 on success,
+         -EINVAL if conn_id or flags is outside the allowed range,
+         -ENOENT on deassign if the conn_id isn't registered,
+         -EEXIST on assign if the conn_id is already registered
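
The two -EINVAL conditions above can be checked in userspace before issuing the ioctl. This is a sketch under assumptions: `struct hv_eventfd_args` is a local mirror of struct kvm_hyperv_eventfd and `hv_eventfd_check` is a hypothetical helper; it does not reproduce the kernel's own validation code.

```c
#include <errno.h>
#include <stdint.h>

#define KVM_HYPERV_CONN_ID_MASK     0x00ffffff
#define KVM_HYPERV_EVENTFD_DEASSIGN (1 << 0)

/* Local mirror of struct kvm_hyperv_eventfd, for illustration only. */
struct hv_eventfd_args {
	uint32_t conn_id;
	int32_t fd;
	uint32_t flags;
	uint32_t padding[3];
};

/* Return 0 if conn_id and flags are within the documented ranges. */
static int hv_eventfd_check(const struct hv_eventfd_args *a)
{
	if (a->conn_id & ~(uint32_t)KVM_HYPERV_CONN_ID_MASK)
		return -EINVAL; /* conn_id does not fit within 24 bits */
	if (a->flags & ~(uint32_t)KVM_HYPERV_EVENTFD_DEASSIGN)
		return -EINVAL; /* unknown flag bit set */
	return 0;
}
```

The -ENOENT and -EEXIST cases depend on kernel state and cannot be predicted this way.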
  
  4.114 KVM_GET_NESTED_STATE
+--------------------------
+
+:Capability: KVM_CAP_NESTED_STATE
+:Architectures: x86
+:Type: vcpu ioctl
+:Parameters: struct kvm_nested_state (in/out)
+:Returns: 0 on success, -1 on error
  
-Capability: KVM_CAP_NESTED_STATE
-Architectures: x86
-Type: vcpu ioctl
-Parameters: struct kvm_nested_state (in/out)
-Returns: 0 on success, -1 on error
  Errors:
-  E2BIG:     the total state size exceeds the value of 'size' specified by
+
+  =====      =============================================================
+  E2BIG      the total state size exceeds the value of 'size' specified by
               the user; the size required will be written into size.
+  =====      =============================================================
+
+::
  
-struct kvm_nested_state {
+  struct kvm_nested_state {
         __u16 flags;
         __u16 format;
         __u32 size;
@@ -3924,33 +4332,33 @@ struct kvm_nested_state {
                 struct kvm_vmx_nested_state_data vmx[0];
                 struct kvm_svm_nested_state_data svm[0];
         } data;
-};
+  };
  
-#define KVM_STATE_NESTED_GUEST_MODE    0x00000001
-#define KVM_STATE_NESTED_RUN_PENDING   0x00000002
-#define KVM_STATE_NESTED_EVMCS         0x00000004
+  #define KVM_STATE_NESTED_GUEST_MODE          0x00000001
+  #define KVM_STATE_NESTED_RUN_PENDING         0x00000002
+  #define KVM_STATE_NESTED_EVMCS               0x00000004
  
-#define KVM_STATE_NESTED_FORMAT_VMX            0
-#define KVM_STATE_NESTED_FORMAT_SVM            1
+  #define KVM_STATE_NESTED_FORMAT_VMX          0
+  #define KVM_STATE_NESTED_FORMAT_SVM          1
  
-#define KVM_STATE_NESTED_VMX_VMCS_SIZE         0x1000
+  #define KVM_STATE_NESTED_VMX_VMCS_SIZE       0x1000
  
-#define KVM_STATE_NESTED_VMX_SMM_GUEST_MODE    0x00000001
-#define KVM_STATE_NESTED_VMX_SMM_VMXON         0x00000002
+  #define KVM_STATE_NESTED_VMX_SMM_GUEST_MODE  0x00000001
+  #define KVM_STATE_NESTED_VMX_SMM_VMXON       0x00000002
  
-struct kvm_vmx_nested_state_hdr {
+  struct kvm_vmx_nested_state_hdr {
         __u64 vmxon_pa;
         __u64 vmcs12_pa;
  
         struct {
                 __u16 flags;
         } smm;
-};
+  };
  
-struct kvm_vmx_nested_state_data {
+  struct kvm_vmx_nested_state_data {
         __u8 vmcs12[KVM_STATE_NESTED_VMX_VMCS_SIZE];
         __u8 shadow_vmcs12[KVM_STATE_NESTED_VMX_VMCS_SIZE];
-};
+  };
  
  This ioctl copies the vcpu's nested virtualization state from the kernel to
  userspace.
@@ -3959,24 +4367,26 @@ The maximum size of the state can be retrieved by passing KVM_CAP_NESTED_STATE
  to the KVM_CHECK_EXTENSION ioctl().
  
  4.115 KVM_SET_NESTED_STATE
+--------------------------
  
-Capability: KVM_CAP_NESTED_STATE
-Architectures: x86
-Type: vcpu ioctl
-Parameters: struct kvm_nested_state (in)
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_NESTED_STATE
+:Architectures: x86
+:Type: vcpu ioctl
+:Parameters: struct kvm_nested_state (in)
+:Returns: 0 on success, -1 on error
  
  This copies the vcpu's kvm_nested_state struct from userspace to the kernel.
  For the definition of struct kvm_nested_state, see KVM_GET_NESTED_STATE.
  
  4.116 KVM_(UN)REGISTER_COALESCED_MMIO
+-------------------------------------
  
-Capability: KVM_CAP_COALESCED_MMIO (for coalesced mmio)
-           KVM_CAP_COALESCED_PIO (for coalesced pio)
-Architectures: all
-Type: vm ioctl
-Parameters: struct kvm_coalesced_mmio_zone
-Returns: 0 on success, < 0 on error
+:Capability: KVM_CAP_COALESCED_MMIO (for coalesced mmio)
+            KVM_CAP_COALESCED_PIO (for coalesced pio)
+:Architectures: all
+:Type: vm ioctl
+:Parameters: struct kvm_coalesced_mmio_zone
+:Returns: 0 on success, < 0 on error
  
  Coalesced I/O is a performance optimization that defers hardware
  register write emulation so that userspace exits are avoided.  It is
@@ -3998,15 +4408,18 @@ between coalesced mmio and pio except that coalesced pio records accesses
  to I/O ports.
  
  4.117 KVM_CLEAR_DIRTY_LOG (vm ioctl)
+------------------------------------
  
-Capability: KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2
-Architectures: x86, arm, arm64, mips
-Type: vm ioctl
-Parameters: struct kvm_dirty_log (in)
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2
+:Architectures: x86, arm, arm64, mips
+:Type: vm ioctl
+:Parameters: struct kvm_dirty_log (in)
+:Returns: 0 on success, -1 on error
  
-/* for KVM_CLEAR_DIRTY_LOG */
-struct kvm_clear_dirty_log {
+::
+
+  /* for KVM_CLEAR_DIRTY_LOG */
+  struct kvm_clear_dirty_log {
         __u32 slot;
         __u32 num_pages;
         __u64 first_page;
@@ -4014,7 +4427,7 @@ struct kvm_clear_dirty_log {
                 void __user *dirty_bitmap; /* one bit per page */
                 __u64 padding;
         };
-};
+  };
  
  The ioctl clears the dirty status of pages in a memory slot, according to
  the bitmap that is passed in struct kvm_clear_dirty_log's dirty_bitmap
@@ -4038,20 +4451,23 @@ However, it can always be used as long as KVM_CHECK_EXTENSION confirms
  that KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 is present.
  
  4.118 KVM_GET_SUPPORTED_HV_CPUID
+--------------------------------
+
+:Capability: KVM_CAP_HYPERV_CPUID
+:Architectures: x86
+:Type: vcpu ioctl
+:Parameters: struct kvm_cpuid2 (in/out)
+:Returns: 0 on success, -1 on error
  
-Capability: KVM_CAP_HYPERV_CPUID
-Architectures: x86
-Type: vcpu ioctl
-Parameters: struct kvm_cpuid2 (in/out)
-Returns: 0 on success, -1 on error
+::
  
-struct kvm_cpuid2 {
+  struct kvm_cpuid2 {
         __u32 nent;
         __u32 padding;
         struct kvm_cpuid_entry2 entries[0];
-};
+  };
  
-struct kvm_cpuid_entry2 {
+  struct kvm_cpuid_entry2 {
         __u32 function;
         __u32 index;
         __u32 flags;
@@ -4060,7 +4476,7 @@ struct kvm_cpuid_entry2 {
         __u32 ecx;
         __u32 edx;
         __u32 padding[3];
-};
+  };
  
  This ioctl returns x86 cpuid feature leaves related to Hyper-V emulation in
  KVM.  Userspace can use the information returned by this ioctl to construct
@@ -4073,13 +4489,13 @@ KVM_GET_SUPPORTED_CPUID ioctl because some of them intersect with KVM feature
  leaves (0x40000000, 0x40000001).
  
  Currently, the following CPUID leaves are returned:
- HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS
- HYPERV_CPUID_INTERFACE
- HYPERV_CPUID_VERSION
- HYPERV_CPUID_FEATURES
- HYPERV_CPUID_ENLIGHTMENT_INFO
- HYPERV_CPUID_IMPLEMENT_LIMITS
- HYPERV_CPUID_NESTED_FEATURES
+ - HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS
+ - HYPERV_CPUID_INTERFACE
+ - HYPERV_CPUID_VERSION
+ - HYPERV_CPUID_FEATURES
+ - HYPERV_CPUID_ENLIGHTMENT_INFO
+ - HYPERV_CPUID_IMPLEMENT_LIMITS
+ - HYPERV_CPUID_NESTED_FEATURES
  
  HYPERV_CPUID_NESTED_FEATURES leaf is only exposed when Enlightened VMCS was
  enabled on the corresponding vCPU (KVM_CAP_HYPERV_ENLIGHTENED_VMCS).
@@ -4095,17 +4511,25 @@ number of valid entries in the 'entries' array, which is then filled.
  userspace should not expect to get any particular value there.
  
  4.119 KVM_ARM_VCPU_FINALIZE
+---------------------------
+
+:Architectures: arm, arm64
+:Type: vcpu ioctl
+:Parameters: int feature (in)
+:Returns: 0 on success, -1 on error
  
-Architectures: arm, arm64
-Type: vcpu ioctl
-Parameters: int feature (in)
-Returns: 0 on success, -1 on error
  Errors:
-  EPERM:     feature not enabled, needs configuration, or already finalized
-  EINVAL:    feature unknown or not present
+
+  ======     ==============================================================
+  EPERM      feature not enabled, needs configuration, or already finalized
+  EINVAL     feature unknown or not present
+  ======     ==============================================================
  
  Recognised values for feature:
+
+  =====      ===========================================
    arm64      KVM_ARM_VCPU_SVE (requires KVM_CAP_ARM_SVE)
+  =====      ===========================================
  
  Finalizes the configuration of the specified vcpu feature.
  
@@ -4129,21 +4553,24 @@ See KVM_ARM_VCPU_INIT for details of vcpu features that require finalization
  using this ioctl.
  
  4.120 KVM_SET_PMU_EVENT_FILTER
+------------------------------
  
-Capability: KVM_CAP_PMU_EVENT_FILTER
-Architectures: x86
-Type: vm ioctl
-Parameters: struct kvm_pmu_event_filter (in)
-Returns: 0 on success, -1 on error
+:Capability: KVM_CAP_PMU_EVENT_FILTER
+:Architectures: x86
+:Type: vm ioctl
+:Parameters: struct kvm_pmu_event_filter (in)
+:Returns: 0 on success, -1 on error
  
-struct kvm_pmu_event_filter {
+::
+
+  struct kvm_pmu_event_filter {
         __u32 action;
         __u32 nevents;
         __u32 fixed_counter_bitmap;
         __u32 flags;
         __u32 pad[4];
         __u64 events[0];
-};
+  };
  
  This ioctl restricts the set of PMU events that the guest can program.
  The argument holds a list of events which will be allowed or denied.
@@ -4154,20 +4581,26 @@ counters are controlled by the fixed_counter_bitmap.
  
  No flags are defined yet, the field must be zero.
  
-Valid values for 'action':
-#define KVM_PMU_EVENT_ALLOW 0
-#define KVM_PMU_EVENT_DENY 1
+Valid values for 'action'::
+
+  #define KVM_PMU_EVENT_ALLOW 0
+  #define KVM_PMU_EVENT_DENY 1
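
Since the structure ends in a variable-length events array, userspace has to size the allocation from nevents. The following sketch assumes a local mirror of the structure; `pmu_filter_new` is a hypothetical helper and any event codes passed to it are placeholders chosen by the caller.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define KVM_PMU_EVENT_ALLOW 0
#define KVM_PMU_EVENT_DENY  1

/* Local mirror of struct kvm_pmu_event_filter, for illustration only. */
struct pmu_event_filter {
	uint32_t action;
	uint32_t nevents;
	uint32_t fixed_counter_bitmap;
	uint32_t flags;
	uint32_t pad[4];
	uint64_t events[];
};

/*
 * Allocate a filter with room for nevents entries.  calloc() leaves
 * flags and pad zeroed, matching the "must be zero" requirement.
 * The caller frees the result with free().
 */
static struct pmu_event_filter *pmu_filter_new(uint32_t action,
					       const uint64_t *events,
					       uint32_t nevents)
{
	size_t bytes = (size_t)nevents * sizeof(uint64_t);
	struct pmu_event_filter *f;

	f = calloc(1, sizeof(*f) + bytes);
	if (!f)
		return NULL;
	f->action = action;
	f->nevents = nevents;
	if (nevents)
		memcpy(f->events, events, bytes);
	return f;
}
```

The returned pointer would then be passed as the ioctl argument.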
  
  4.121 KVM_PPC_SVM_OFF
+---------------------
+
+:Capability: basic
+:Architectures: powerpc
+:Type: vm ioctl
+:Parameters: none
+:Returns: 0 on successful completion,
  
-Capability: basic
-Architectures: powerpc
-Type: vm ioctl
-Parameters: none
-Returns: 0 on successful completion,
  Errors:
-  EINVAL:    if ultravisor failed to terminate the secure guest
-  ENOMEM:    if hypervisor failed to allocate new radix page tables for guest
+
+  ======     ================================================================
+  EINVAL     if ultravisor failed to terminate the secure guest
+  ENOMEM     if hypervisor failed to allocate new radix page tables for guest
+  ======     ================================================================
  
  This ioctl is used to turn off the secure mode of the guest or transition
  the guest from secure mode to normal mode. This is invoked when the guest
@@ -4178,35 +4611,38 @@ unpins the VPA pages and releases all the device pages that are used to
  track the secure pages by hypervisor.
  
  4.122 KVM_S390_NORMAL_RESET
+---------------------------
  
-Capability: KVM_CAP_S390_VCPU_RESETS
-Architectures: s390
-Type: vcpu ioctl
-Parameters: none
-Returns: 0
+:Capability: KVM_CAP_S390_VCPU_RESETS
+:Architectures: s390
+:Type: vcpu ioctl
+:Parameters: none
+:Returns: 0
  
  This ioctl resets VCPU registers and control structures according to
  the cpu reset definition in the POP (Principles Of Operation).
  
  4.123 KVM_S390_INITIAL_RESET
+----------------------------
  
-Capability: none
-Architectures: s390
-Type: vcpu ioctl
-Parameters: none
-Returns: 0
+:Capability: none
+:Architectures: s390
+:Type: vcpu ioctl
+:Parameters: none
+:Returns: 0
  
  This ioctl resets VCPU registers and control structures according to
  the initial cpu reset definition in the POP. However, the cpu is not
  put into ESA mode. This reset is a superset of the normal reset.
  
  4.124 KVM_S390_CLEAR_RESET
+--------------------------
  
-Capability: KVM_CAP_S390_VCPU_RESETS
-Architectures: s390
-Type: vcpu ioctl
-Parameters: none
-Returns: 0
+:Capability: KVM_CAP_S390_VCPU_RESETS
+:Architectures: s390
+:Type: vcpu ioctl
+:Parameters: none
+:Returns: 0
  
  This ioctl resets VCPU registers and control structures according to
  the clear cpu reset definition in the POP. However, the cpu is not put
@@ -4214,7 +4650,7 @@ into ESA mode. This reset is a superset of the initial reset.
  
  
  5. The kvm_run structure
-------------------------
+========================
  
  Application code obtains a pointer to the kvm_run structure by
  mmap()ing a vcpu fd.  From that point, application code can control
@@ -4222,13 +4658,17 @@ execution by changing fields in kvm_run prior to calling the KVM_RUN
  ioctl, and obtain information about the reason KVM_RUN returned by
  looking up structure members.
  
-struct kvm_run {
+::
+
+  struct kvm_run {
         /* in */
         __u8 request_interrupt_window;
  
  Request that KVM_RUN return when it becomes possible to inject external
  interrupts into the guest.  Useful in conjunction with KVM_INTERRUPT.
  
+::
+
         __u8 immediate_exit;
  
  This field is polled once when KVM_RUN starts; if non-zero, KVM_RUN
@@ -4240,6 +4680,8 @@ a signal handler that sets run->immediate_exit to a non-zero value.
  
  This field is ignored if KVM_CAP_IMMEDIATE_EXIT is not available.
  
+::
+
         __u8 padding1[6];
  
         /* out */
@@ -4249,16 +4691,22 @@ When KVM_RUN has returned successfully (return value 0), this informs
  application code why KVM_RUN has returned.  Allowable values for this
  field are detailed below.
  
+::
+
         __u8 ready_for_interrupt_injection;
  
  If request_interrupt_window has been specified, this field indicates
  an interrupt can be injected now with KVM_INTERRUPT.
  
+::
+
         __u8 if_flag;
  
  The value of the current interrupt flag.  Only valid if in-kernel
  local APIC is not used.
  
+::
+
         __u16 flags;
  
  More architecture-specific flags detailing state of the VCPU that may
@@ -4266,17 +4714,23 @@ affect the device's behavior.  The only currently defined flag is
  KVM_RUN_X86_SMM, which is valid on x86 machines and is set if the
  VCPU is in system management mode.
  
+::
+
         /* in (pre_kvm_run), out (post_kvm_run) */
         __u64 cr8;
  
  The value of the cr8 register.  Only valid if in-kernel local APIC is
  not used.  Both input and output.
  
+::
+
         __u64 apic_base;
  
  The value of the APIC BASE msr.  Only valid if in-kernel local
  APIC is not used.  Both input and output.
  
+::
+
         union {
                 /* KVM_EXIT_UNKNOWN */
                 struct {
@@ -4287,6 +4741,8 @@ If exit_reason is KVM_EXIT_UNKNOWN, the vcpu has exited due to unknown
  reasons.  Further architecture-specific information is available in
  hardware_exit_reason.
  
+::
+
                 /* KVM_EXIT_FAIL_ENTRY */
                 struct {
                         __u64 hardware_entry_failure_reason;
@@ -4296,6 +4752,8 @@ If exit_reason is KVM_EXIT_FAIL_ENTRY, the vcpu could not be run due
  to unknown reasons.  Further architecture-specific information is
  available in hardware_entry_failure_reason.
  
+::
+
                 /* KVM_EXIT_EXCEPTION */
                 struct {
                         __u32 exception;
@@ -4304,10 +4762,12 @@ available in hardware_entry_failure_reason.
  
  Unused.
  
+::
+
                 /* KVM_EXIT_IO */
                 struct {
-#define KVM_EXIT_IO_IN  0
-#define KVM_EXIT_IO_OUT 1
+  #define KVM_EXIT_IO_IN  0
+  #define KVM_EXIT_IO_OUT 1
                         __u8 direction;
                         __u8 size; /* bytes */
                         __u16 port;
@@ -4321,6 +4781,8 @@ data_offset describes where the data is located (KVM_EXIT_IO_OUT) or
  where kvm expects application code to place the data for the next
  KVM_RUN invocation (KVM_EXIT_IO_IN).  Data format is a packed array.
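
Locating an element of the packed data array is a matter of pointer arithmetic from the start of the mmap()ed kvm_run area. This is a sketch only: `io_element` is a hypothetical helper, and the caller is assumed to know how many elements the exit carries.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return a pointer to element 'index' of the packed I/O data array.
 * 'run' is the start of the mmap()ed kvm_run area, 'data_offset' and
 * 'size' come from the KVM_EXIT_IO member of the structure.
 */
static uint8_t *io_element(void *run, uint64_t data_offset,
			   uint8_t size, uint32_t index)
{
	return (uint8_t *)run + data_offset + (size_t)index * size;
}
```

For KVM_EXIT_IO_OUT the caller reads `size` bytes from the returned pointer; for KVM_EXIT_IO_IN it writes them before the next KVM_RUN.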
  
+::
+
                 /* KVM_EXIT_DEBUG */
                 struct {
                         struct kvm_debug_exit_arch arch;
@@ -4329,6 +4791,8 @@ KVM_RUN invocation (KVM_EXIT_IO_IN).  Data format is a packed array.
  If the exit_reason is KVM_EXIT_DEBUG, then a vcpu is processing a debug event
  for which architecture specific information is returned.
  
+::
+
                 /* KVM_EXIT_MMIO */
                 struct {
                         __u64 phys_addr;
@@ -4346,14 +4810,19 @@ The 'data' member contains, in its first 'len' bytes, the value as it would
  appear if the VCPU performed a load or store of the appropriate width directly
  to the byte array.
  
-NOTE: For KVM_EXIT_IO, KVM_EXIT_MMIO, KVM_EXIT_OSI, KVM_EXIT_PAPR and
+.. note::
+
+      For KVM_EXIT_IO, KVM_EXIT_MMIO, KVM_EXIT_OSI, KVM_EXIT_PAPR and
        KVM_EXIT_EPR the corresponding
+
  operations are complete (and guest state is consistent) only after userspace
  has re-entered the kernel with KVM_RUN.  The kernel side will first finish
  incomplete operations and then check for pending signals.  Userspace
  can re-enter the guest with an unmasked signal pending to complete
  pending operations.
  
+::
+
                 /* KVM_EXIT_HYPERCALL */
                 struct {
                         __u64 nr;
@@ -4365,7 +4834,10 @@ pending operations.
  
  Unused.  This was once used for 'hypercall to userspace'.  To implement
  such functionality, use KVM_EXIT_IO (x86) or KVM_EXIT_MMIO (all except s390).
-Note KVM_EXIT_IO is significantly faster than KVM_EXIT_MMIO.
+
+.. note:: KVM_EXIT_IO is significantly faster than KVM_EXIT_MMIO.
+
+::
  
                 /* KVM_EXIT_TPR_ACCESS */
                 struct {
@@ -4376,6 +4848,8 @@ Note KVM_EXIT_IO is significantly faster than KVM_EXIT_MMIO.
  
  To be documented (KVM_TPR_ACCESS_REPORTING).
  
+::
+
                 /* KVM_EXIT_S390_SIEIC */
                 struct {
                         __u8 icptcode;
@@ -4387,16 +4861,20 @@ To be documented (KVM_TPR_ACCESS_REPORTING).
  
  s390 specific.
  
+::
+
                 /* KVM_EXIT_S390_RESET */
-#define KVM_S390_RESET_POR       1
-#define KVM_S390_RESET_CLEAR     2
-#define KVM_S390_RESET_SUBSYSTEM 4
-#define KVM_S390_RESET_CPU_INIT  8
-#define KVM_S390_RESET_IPL       16
+  #define KVM_S390_RESET_POR       1
+  #define KVM_S390_RESET_CLEAR     2
+  #define KVM_S390_RESET_SUBSYSTEM 4
+  #define KVM_S390_RESET_CPU_INIT  8
+  #define KVM_S390_RESET_IPL       16
                 __u64 s390_reset_flags;
  
  s390 specific.
  
+::
+
                 /* KVM_EXIT_S390_UCONTROL */
                 struct {
                         __u64 trans_exc_code;
@@ -4411,6 +4889,8 @@ in the cpu's lowcore are presented here as defined by the z Architecture
  Principles of Operation Book in the Chapter for Dynamic Address Translation
  (DAT)
  
+::
+
                 /* KVM_EXIT_DCR */
                 struct {
                         __u32 dcrn;
@@ -4420,6 +4900,8 @@ Principles of Operation Book in the Chapter for Dynamic Address Translation
  
  Deprecated - was used for 440 KVM.
  
+::
+
                 /* KVM_EXIT_OSI */
                 struct {
                         __u64 gprs[32];
@@ -4433,6 +4915,8 @@ Userspace can now handle the hypercall and when it's done modify the gprs as
  necessary. Upon guest entry all guest GPRs will then be replaced by the values
  in this struct.
  
+::
+
                 /* KVM_EXIT_PAPR_HCALL */
                 struct {
                         __u64 nr;
@@ -4450,6 +4934,8 @@ The possible hypercalls are defined in the Power Architecture Platform
  Requirements (PAPR) document available from www.power.org (free
  developer registration required to access it).
  
+::
+
                 /* KVM_EXIT_S390_TSCH */
                 struct {
                         __u16 subchannel_id;
@@ -4466,6 +4952,8 @@ interrupt for the target subchannel has been dequeued and subchannel_id,
  subchannel_nr, io_int_parm and io_int_word contain the parameters for that
  interrupt. ipb is needed for instruction parameter decoding.
  
+::
+
                 /* KVM_EXIT_EPR */
                 struct {
                         __u32 epr;
@@ -4485,11 +4973,13 @@ It gets triggered whenever both KVM_CAP_PPC_EPR are enabled and an
  external interrupt has just been delivered into the guest. User space
  should put the acknowledged interrupt vector into the 'epr' field.
  
+::
+
                 /* KVM_EXIT_SYSTEM_EVENT */
                 struct {
-#define KVM_SYSTEM_EVENT_SHUTDOWN       1
-#define KVM_SYSTEM_EVENT_RESET          2
-#define KVM_SYSTEM_EVENT_CRASH          3
+  #define KVM_SYSTEM_EVENT_SHUTDOWN       1
+  #define KVM_SYSTEM_EVENT_RESET          2
+  #define KVM_SYSTEM_EVENT_CRASH          3
                         __u32 type;
                         __u64 flags;
                 } system_event;
@@ -4502,18 +4992,21 @@ the system-level event type. The 'flags' field describes architecture
  specific flags for the system-level event.
  
  Valid values for 'type' are:
-  KVM_SYSTEM_EVENT_SHUTDOWN -- the guest has requested a shutdown of the
+
+ - KVM_SYSTEM_EVENT_SHUTDOWN -- the guest has requested a shutdown of the
     VM. Userspace is not obliged to honour this, and if it does honour
     this does not need to destroy the VM synchronously (ie it may call
     KVM_RUN again before shutdown finally occurs).
-  KVM_SYSTEM_EVENT_RESET -- the guest has requested a reset of the VM.
+ - KVM_SYSTEM_EVENT_RESET -- the guest has requested a reset of the VM.
     As with SHUTDOWN, userspace can choose to ignore the request, or
     to schedule the reset to occur in the future and may call KVM_RUN again.
-  KVM_SYSTEM_EVENT_CRASH -- the guest crash occurred and the guest
+ - KVM_SYSTEM_EVENT_CRASH -- the guest crash occurred and the guest
     has requested a crash condition maintenance. Userspace can choose
     to ignore the request, or to gather VM memory core dump and/or
     reset/shutdown of the VM.
  
+::
+
                 /* KVM_EXIT_IOAPIC_EOI */
                 struct {
                         __u8 vector;
@@ -4526,9 +5019,11 @@ the userspace IOAPIC should process the EOI and retrigger the interrupt if
  it is still asserted.  Vector is the LAPIC interrupt vector for which the
  EOI was received.
  
+::
+
                 struct kvm_hyperv_exit {
-#define KVM_EXIT_HYPERV_SYNIC          1
-#define KVM_EXIT_HYPERV_HCALL          2
+  #define KVM_EXIT_HYPERV_SYNIC          1
+  #define KVM_EXIT_HYPERV_HCALL          2
                         __u32 type;
                         union {
                                 struct {
@@ -4546,14 +5041,20 @@ EOI was received.
                 };
                 /* KVM_EXIT_HYPERV */
                  struct kvm_hyperv_exit hyperv;
+
  Indicates that the VCPU exits into userspace to process some tasks
  related to Hyper-V emulation.
+
  Valid values for 'type' are:
-       KVM_EXIT_HYPERV_SYNIC -- synchronously notify user-space about
+
+       - KVM_EXIT_HYPERV_SYNIC -- synchronously notify user-space about
+
  Hyper-V SynIC state change. Notification is used to remap SynIC
  event/message pages and to enable/disable SynIC messages/events processing
  in userspace.
  
+::
+
                 /* KVM_EXIT_ARM_NISV */
                 struct {
                         __u64 esr_iss;
@@ -4587,6 +5088,8 @@ Note that KVM does not skip the faulting instruction as it does for
  KVM_EXIT_MMIO, but userspace has to emulate any change to the processing state
  if it decides to decode and emulate the instruction.
  
+::
+
                 /* Fix the size of the union. */
                 char padding[256];
         };
@@ -4611,18 +5114,20 @@ avoid some system call overhead if userspace has to handle the exit.
  Userspace can query the validity of the structure by checking
  kvm_valid_regs for specific bits. These bits are architecture specific
  and usually define the validity of a group of registers. (e.g. one bit
- for general purpose registers)
+for general purpose registers)
  
  Please note that the kernel is allowed to use the kvm_run structure as the
  primary storage for certain register types. Therefore, the kernel may use the
  values in kvm_run even if the corresponding bit in kvm_dirty_regs is not set.
  
-};
+::
+
+  };
  
  
  
  6. Capabilities that can be enabled on vCPUs
---------------------------------------------
+============================================
  
  There are certain capabilities that change the behavior of the virtual CPU or
  the virtual machine when enabled. To enable them, please see section 4.37.
@@ -4631,23 +5136,28 @@ the virtual machine is when enabling them.
  
  The following information is provided along with the description:
  
-  Architectures: which instruction set architectures provide this ioctl.
+  Architectures:
+      which instruction set architectures provide this ioctl.
        x86 includes both i386 and x86_64.
  
-  Target: whether this is a per-vcpu or per-vm capability.
+  Target:
+      whether this is a per-vcpu or per-vm capability.
  
-  Parameters: what parameters are accepted by the capability.
+  Parameters:
+      what parameters are accepted by the capability.
  
-  Returns: the return value.  General error numbers (EBADF, ENOMEM, EINVAL)
+  Returns:
+      the return value.  General error numbers (EBADF, ENOMEM, EINVAL)
        are not detailed, but errors with specific meanings are.
  
  
  6.1 KVM_CAP_PPC_OSI
+-------------------
  
-Architectures: ppc
-Target: vcpu
-Parameters: none
-Returns: 0 on success; -1 on error
+:Architectures: ppc
+:Target: vcpu
+:Parameters: none
+:Returns: 0 on success; -1 on error
  
  This capability enables interception of OSI hypercalls that otherwise would
  be treated as normal system calls to be injected into the guest. OSI hypercalls
@@ -4658,11 +5168,12 @@ When this capability is enabled, KVM_EXIT_OSI can occur.
  
  
  6.2 KVM_CAP_PPC_PAPR
+--------------------
  
-Architectures: ppc
-Target: vcpu
-Parameters: none
-Returns: 0 on success; -1 on error
+:Architectures: ppc
+:Target: vcpu
+:Parameters: none
+:Returns: 0 on success; -1 on error
  
  This capability enables interception of PAPR hypercalls. PAPR hypercalls are
  done using the hypercall instruction "sc 1".
@@ -4678,18 +5189,21 @@ When this capability is enabled, KVM_EXIT_PAPR_HCALL can occur.
  
  
  6.3 KVM_CAP_SW_TLB
+------------------
+
+:Architectures: ppc
+:Target: vcpu
+:Parameters: args[0] is the address of a struct kvm_config_tlb
+:Returns: 0 on success; -1 on error
  
-Architectures: ppc
-Target: vcpu
-Parameters: args[0] is the address of a struct kvm_config_tlb
-Returns: 0 on success; -1 on error
+::
  
-struct kvm_config_tlb {
+  struct kvm_config_tlb {
         __u64 params;
         __u64 array;
         __u32 mmu_type;
         __u32 array_len;
-};
+  };
  
  Configures the virtual CPU's TLB array, establishing a shared memory area
  between userspace and KVM.  The "params" and "array" fields are userspace
@@ -4708,6 +5222,7 @@ to tell KVM which entries have been changed, prior to calling KVM_RUN again
  on this vcpu.
  
  For mmu types KVM_MMU_FSL_BOOKE_NOHV and KVM_MMU_FSL_BOOKE_HV:
+
   - The "params" field is of type "struct kvm_book3e_206_tlb_params".
   - The "array" field points to an array of type "struct
     kvm_book3e_206_tlb_entry".
@@ -4721,11 +5236,12 @@ For mmu types KVM_MMU_FSL_BOOKE_NOHV and KVM_MMU_FSL_BOOKE_HV:
     hardware ignores this value for TLB0.
  
  6.4 KVM_CAP_S390_CSS_SUPPORT
+----------------------------
  
-Architectures: s390
-Target: vcpu
-Parameters: none
-Returns: 0 on success; -1 on error
+:Architectures: s390
+:Target: vcpu
+:Parameters: none
+:Returns: 0 on success; -1 on error
  
  This capability enables support for handling of channel I/O instructions.
  
@@ -4739,11 +5255,12 @@ Note that even though this capability is enabled per-vcpu, the complete
  virtual machine is affected.
  
  6.5 KVM_CAP_PPC_EPR
+-------------------
  
-Architectures: ppc
-Target: vcpu
-Parameters: args[0] defines whether the proxy facility is active
-Returns: 0 on success; -1 on error
+:Architectures: ppc
+:Target: vcpu
+:Parameters: args[0] defines whether the proxy facility is active
+:Returns: 0 on success; -1 on error
  
  This capability enables or disables the delivery of interrupts through the
  external proxy facility.
@@ -4757,62 +5274,70 @@ When disabled (args[0] == 0), behavior is as if this facility is unsupported.
  When this capability is enabled, KVM_EXIT_EPR can occur.
  
  6.6 KVM_CAP_IRQ_MPIC
+--------------------
  
-Architectures: ppc
-Parameters: args[0] is the MPIC device fd
-            args[1] is the MPIC CPU number for this vcpu
+:Architectures: ppc
+:Parameters: args[0] is the MPIC device fd;
+             args[1] is the MPIC CPU number for this vcpu
  
  This capability connects the vcpu to an in-kernel MPIC device.
  
  6.7 KVM_CAP_IRQ_XICS
+--------------------
  
-Architectures: ppc
-Target: vcpu
-Parameters: args[0] is the XICS device fd
-            args[1] is the XICS CPU number (server ID) for this vcpu
+:Architectures: ppc
+:Target: vcpu
+:Parameters: args[0] is the XICS device fd;
+             args[1] is the XICS CPU number (server ID) for this vcpu
  
  This capability connects the vcpu to an in-kernel XICS device.
  
  6.8 KVM_CAP_S390_IRQCHIP
+------------------------
  
-Architectures: s390
-Target: vm
-Parameters: none
+:Architectures: s390
+:Target: vm
+:Parameters: none
  
  This capability enables the in-kernel irqchip for s390. Please refer to
  "4.24 KVM_CREATE_IRQCHIP" for details.
  
  6.9 KVM_CAP_MIPS_FPU
+--------------------
  
-Architectures: mips
-Target: vcpu
-Parameters: args[0] is reserved for future use (should be 0).
+:Architectures: mips
+:Target: vcpu
+:Parameters: args[0] is reserved for future use (should be 0).
  
  This capability allows the use of the host Floating Point Unit by the guest. It
  allows the Config1.FP bit to be set to enable the FPU in the guest. Once this is
-done the KVM_REG_MIPS_FPR_* and KVM_REG_MIPS_FCR_* registers can be accessed
-(depending on the current guest FPU register mode), and the Status.FR,
+done the ``KVM_REG_MIPS_FPR_*`` and ``KVM_REG_MIPS_FCR_*`` registers can be
+accessed (depending on the current guest FPU register mode), and the Status.FR,
  Config5.FRE bits are accessible via the KVM API and also from the guest,
  depending on them being supported by the FPU.
  
  6.10 KVM_CAP_MIPS_MSA
+---------------------
  
-Architectures: mips
-Target: vcpu
-Parameters: args[0] is reserved for future use (should be 0).
+:Architectures: mips
+:Target: vcpu
+:Parameters: args[0] is reserved for future use (should be 0).
  
  This capability allows the use of the MIPS SIMD Architecture (MSA) by the guest.
  It allows the Config3.MSAP bit to be set to enable the use of MSA by the guest.
-Once this is done the KVM_REG_MIPS_VEC_* and KVM_REG_MIPS_MSA_* registers can be
-accessed, and the Config5.MSAEn bit is accessible via the KVM API and also from
-the guest.
+Once this is done the ``KVM_REG_MIPS_VEC_*`` and ``KVM_REG_MIPS_MSA_*``
+registers can be accessed, and the Config5.MSAEn bit is accessible via the
+KVM API and also from the guest.
  
  6.74 KVM_CAP_SYNC_REGS
-Architectures: s390, x86
-Target: s390: always enabled, x86: vcpu
-Parameters: none
-Returns: x86: KVM_CHECK_EXTENSION returns a bit-array indicating which register
-sets are supported (bitfields defined in arch/x86/include/uapi/asm/kvm.h).
+----------------------
+
+:Architectures: s390, x86
+:Target: s390: always enabled, x86: vcpu
+:Parameters: none
+:Returns: x86: KVM_CHECK_EXTENSION returns a bit-array indicating which register
+          sets are supported
+          (bitfields defined in arch/x86/include/uapi/asm/kvm.h).
  
  As described above in the kvm_sync_regs struct info in section 5 (kvm_run):
  KVM_CAP_SYNC_REGS "allow[s] userspace to access certain guest registers
@@ -4825,6 +5350,7 @@ userspace.
  For s390 specifics, please refer to the source code.
  
  For x86:
+
  - the register sets to be copied out to kvm_run are selectable
    by userspace (rather than all sets being copied out for every exit).
  - vcpu_events are available in addition to regs and sregs.
@@ -4841,23 +5367,26 @@ into the vCPU even if they've been modified.
  
  Unused bitfields in the bitarrays must be set to zero.
  
-struct kvm_sync_regs {
+::
+
+  struct kvm_sync_regs {
          struct kvm_regs regs;
          struct kvm_sregs sregs;
          struct kvm_vcpu_events events;
-};
+  };
  
  6.75 KVM_CAP_PPC_IRQ_XIVE
+-------------------------
  
-Architectures: ppc
-Target: vcpu
-Parameters: args[0] is the XIVE device fd
-            args[1] is the XIVE CPU number (server ID) for this vcpu
+:Architectures: ppc
+:Target: vcpu
+:Parameters: args[0] is the XIVE device fd;
+             args[1] is the XIVE CPU number (server ID) for this vcpu
  
  This capability connects the vcpu to an in-kernel XIVE device.
  
  7. Capabilities that can be enabled on VMs
-------------------------------------------
+==========================================
  
  There are certain capabilities that change the behavior of the virtual
  machine when enabled. To enable them, please see section 4.37. Below
@@ -4866,20 +5395,24 @@ is when enabling them.
  
  The following information is provided along with the description:
  
-  Architectures: which instruction set architectures provide this ioctl.
+  Architectures:
+      which instruction set architectures provide this ioctl.
        x86 includes both i386 and x86_64.
  
-  Parameters: what parameters are accepted by the capability.
+  Parameters:
+      what parameters are accepted by the capability.
  
-  Returns: the return value.  General error numbers (EBADF, ENOMEM, EINVAL)
+  Returns:
+      the return value.  General error numbers (EBADF, ENOMEM, EINVAL)
        are not detailed, but errors with specific meanings are.
  
  
  7.1 KVM_CAP_PPC_ENABLE_HCALL
+----------------------------
  
-Architectures: ppc
-Parameters: args[0] is the sPAPR hcall number
-           args[1] is 0 to disable, 1 to enable in-kernel handling
+:Architectures: ppc
+:Parameters: args[0] is the sPAPR hcall number;
+            args[1] is 0 to disable, 1 to enable in-kernel handling
  
  This capability controls whether individual sPAPR hypercalls (hcalls)
  get handled by the kernel or not.  Enabling or disabling in-kernel
@@ -4897,13 +5430,15 @@ implementation, the KVM_ENABLE_CAP ioctl will fail with an EINVAL
  error.
  
  7.2 KVM_CAP_S390_USER_SIGP
+--------------------------
  
-Architectures: s390
-Parameters: none
+:Architectures: s390
+:Parameters: none
  
  This capability controls which SIGP orders will be handled completely in user
  space. With this capability enabled, all fast orders will be handled completely
  in the kernel:
+
  - SENSE
  - SENSE RUNNING
  - EXTERNAL CALL
@@ -4917,48 +5452,52 @@ in the hardware prior to interception). If this capability is not enabled, the
  old way of handling SIGP orders is used (partially in kernel and user space).
  
  7.3 KVM_CAP_S390_VECTOR_REGISTERS
+---------------------------------
  
-Architectures: s390
-Parameters: none
-Returns: 0 on success, negative value on error
+:Architectures: s390
+:Parameters: none
+:Returns: 0 on success, negative value on error
  
  Allows use of the vector registers introduced with z13 processor, and
  provides for the synchronization between host and user space.  Will
  return -EINVAL if the machine does not support vectors.
  
  7.4 KVM_CAP_S390_USER_STSI
+--------------------------
  
-Architectures: s390
-Parameters: none
+:Architectures: s390
+:Parameters: none
  
  This capability allows post-handlers for the STSI instruction. After
  initial handling in the kernel, KVM exits to user space with
  KVM_EXIT_S390_STSI to allow user space to insert further data.
  
  Before exiting to userspace, kvm handlers should fill in s390_stsi field of
-vcpu->run:
-struct {
+vcpu->run::
+
+  struct {
         __u64 addr;
         __u8 ar;
         __u8 reserved;
         __u8 fc;
         __u8 sel1;
         __u16 sel2;
-} s390_stsi;
+  } s390_stsi;
  
-@addr - guest address of STSI SYSIB
-@fc   - function code
-@sel1 - selector 1
-@sel2 - selector 2
-@ar   - access register number
+  @addr - guest address of STSI SYSIB
+  @fc   - function code
+  @sel1 - selector 1
+  @sel2 - selector 2
+  @ar   - access register number
  
  KVM handlers should exit to userspace with rc = -EREMOTE.
  
  7.5 KVM_CAP_SPLIT_IRQCHIP
+-------------------------
  
-Architectures: x86
-Parameters: args[0] - number of routes reserved for userspace IOAPICs
-Returns: 0 on success, -1 on error
+:Architectures: x86
+:Parameters: args[0] - number of routes reserved for userspace IOAPICs
+:Returns: 0 on success, -1 on error
  
  Create a local apic for each processor in the kernel. This can be used
  instead of KVM_CREATE_IRQCHIP if the userspace VMM wishes to emulate the
@@ -4975,24 +5514,26 @@ Fails if VCPU has already been created, or if the irqchip is already in the
  kernel (i.e. KVM_CREATE_IRQCHIP has already been called).
  
  7.6 KVM_CAP_S390_RI
+-------------------
  
-Architectures: s390
-Parameters: none
+:Architectures: s390
+:Parameters: none
  
  Allows use of runtime-instrumentation introduced with zEC12 processor.
  Will return -EINVAL if the machine does not support runtime-instrumentation.
  Will return -EBUSY if a VCPU has already been created.
  
  7.7 KVM_CAP_X2APIC_API
+----------------------
  
-Architectures: x86
-Parameters: args[0] - features that should be enabled
-Returns: 0 on success, -EINVAL when args[0] contains invalid features
+:Architectures: x86
+:Parameters: args[0] - features that should be enabled
+:Returns: 0 on success, -EINVAL when args[0] contains invalid features
  
-Valid feature flags in args[0] are
+Valid feature flags in args[0] are::
  
-#define KVM_X2APIC_API_USE_32BIT_IDS            (1ULL << 0)
-#define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK  (1ULL << 1)
+  #define KVM_X2APIC_API_USE_32BIT_IDS            (1ULL << 0)
+  #define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK  (1ULL << 1)
  
  Enabling KVM_X2APIC_API_USE_32BIT_IDS changes the behavior of
  KVM_SET_GSI_ROUTING, KVM_SIGNAL_MSI, KVM_SET_LAPIC, and KVM_GET_LAPIC,
@@ -5006,9 +5547,10 @@ without interrupt remapping.  This is undesirable in logical mode,
  where 0xff represents CPUs 0-7 in cluster 0.
  
  7.8 KVM_CAP_S390_USER_INSTR0
+----------------------------
  
-Architectures: s390
-Parameters: none
+:Architectures: s390
+:Parameters: none
  
  With this capability enabled, all illegal instructions 0x0000 (2 bytes) will
  be intercepted and forwarded to user space. User space can use this
@@ -5020,26 +5562,29 @@ This capability can be enabled dynamically even if VCPUs were already
  created and are running.
  
  7.9 KVM_CAP_S390_GS
+-------------------
  
-Architectures: s390
-Parameters: none
-Returns: 0 on success; -EINVAL if the machine does not support
-        guarded storage; -EBUSY if a VCPU has already been created.
+:Architectures: s390
+:Parameters: none
+:Returns: 0 on success; -EINVAL if the machine does not support
+          guarded storage; -EBUSY if a VCPU has already been created.
  
  Allows use of guarded storage for the KVM guest.
  
  7.10 KVM_CAP_S390_AIS
+---------------------
  
-Architectures: s390
-Parameters: none
+:Architectures: s390
+:Parameters: none
  
  Allow use of adapter-interruption suppression.
-Returns: 0 on success; -EBUSY if a VCPU has already been created.
+:Returns: 0 on success; -EBUSY if a VCPU has already been created.
  
  7.11 KVM_CAP_PPC_SMT
+--------------------
  
-Architectures: ppc
-Parameters: vsmt_mode, flags
+:Architectures: ppc
+:Parameters: vsmt_mode, flags
  
  Enabling this capability on a VM provides userspace with a way to set
  the desired virtual SMT mode (i.e. the number of virtual CPUs per
@@ -5054,9 +5599,10 @@ The KVM_CAP_PPC_SMT_POSSIBLE capability indicates which virtual SMT
  modes are available.
  
  7.12 KVM_CAP_PPC_FWNMI
+----------------------
  
-Architectures: ppc
-Parameters: none
+:Architectures: ppc
+:Parameters: none
  
  With this capability a machine check exception in the guest address
  space will cause KVM to exit the guest with NMI exit reason. This
@@ -5065,17 +5611,18 @@ machine check handling routine. Without this capability KVM will
  branch to guests' 0x200 interrupt vector.
  
  7.13 KVM_CAP_X86_DISABLE_EXITS
+------------------------------
  
-Architectures: x86
-Parameters: args[0] defines which exits are disabled
-Returns: 0 on success, -EINVAL when args[0] contains invalid exits
+:Architectures: x86
+:Parameters: args[0] defines which exits are disabled
+:Returns: 0 on success, -EINVAL when args[0] contains invalid exits
  
-Valid bits in args[0] are
+Valid bits in args[0] are::
  
-#define KVM_X86_DISABLE_EXITS_MWAIT            (1 << 0)
-#define KVM_X86_DISABLE_EXITS_HLT              (1 << 1)
-#define KVM_X86_DISABLE_EXITS_PAUSE            (1 << 2)
-#define KVM_X86_DISABLE_EXITS_CSTATE           (1 << 3)
+  #define KVM_X86_DISABLE_EXITS_MWAIT            (1 << 0)
+  #define KVM_X86_DISABLE_EXITS_HLT              (1 << 1)
+  #define KVM_X86_DISABLE_EXITS_PAUSE            (1 << 2)
+  #define KVM_X86_DISABLE_EXITS_CSTATE           (1 << 3)
  
  Enabling this capability on a VM provides userspace with a way to no
  longer intercept some instructions for improved latency in some
@@ -5087,12 +5634,13 @@ all such vmexits.
  Do not enable KVM_FEATURE_PV_UNHALT if you disable HLT exits.
  
  7.14 KVM_CAP_S390_HPAGE_1M
+--------------------------
  
-Architectures: s390
-Parameters: none
-Returns: 0 on success, -EINVAL if hpage module parameter was not set
-        or cmma is enabled, or the VM has the KVM_VM_S390_UCONTROL
-        flag set
+:Architectures: s390
+:Parameters: none
+:Returns: 0 on success, -EINVAL if hpage module parameter was not set
+         or cmma is enabled, or the VM has the KVM_VM_S390_UCONTROL
+         flag set
  
  With this capability the KVM support for memory backing with 1m pages
  through hugetlbfs can be enabled for a VM. After the capability is
@@ -5104,20 +5652,22 @@ While it is generally possible to create a huge page backed VM without
  this capability, the VM will not be able to run.
  
  7.15 KVM_CAP_MSR_PLATFORM_INFO
+------------------------------
  
-Architectures: x86
-Parameters: args[0] whether feature should be enabled or not
+:Architectures: x86
+:Parameters: args[0] whether feature should be enabled or not
  
  With this capability, a guest may read the MSR_PLATFORM_INFO MSR. Otherwise,
  a #GP would be raised when the guest tries to access. Currently, this
  capability does not enable write permissions of this MSR for the guest.
  
  7.16 KVM_CAP_PPC_NESTED_HV
+--------------------------
  
-Architectures: ppc
-Parameters: none
-Returns: 0 on success, -EINVAL when the implementation doesn't support
-        nested-HV virtualization.
+:Architectures: ppc
+:Parameters: none
+:Returns: 0 on success, -EINVAL when the implementation doesn't support
+         nested-HV virtualization.
  
  HV-KVM on POWER9 and later systems allows for "nested-HV"
  virtualization, which provides a way for a guest VM to run guests that
@@ -5127,9 +5677,10 @@ the necessary functionality and on the facility being enabled with a
  kvm-hv module parameter.
  
  7.17 KVM_CAP_EXCEPTION_PAYLOAD
+------------------------------
  
-Architectures: x86
-Parameters: args[0] whether feature should be enabled or not
+:Architectures: x86
+:Parameters: args[0] whether feature should be enabled or not
  
  With this capability enabled, CR2 will not be modified prior to the
  emulated VM-exit when L1 intercepts a #PF exception that occurs in
@@ -5140,21 +5691,21 @@ L2. As a result, when KVM_GET_VCPU_EVENTS reports a pending #PF (or
  faulting address (or the new DR6 bits*) will be reported in the
  exception_payload field. Similarly, when userspace injects a #PF (or
  #DB) into L2 using KVM_SET_VCPU_EVENTS, it is expected to set
-exception.has_payload and to put the faulting address (or the new DR6
-bits*) in the exception_payload field.
+exception.has_payload and to put the faulting address - or the new DR6
+bits\ [#]_ - in the exception_payload field.
  
  This capability also enables exception.pending in struct
  kvm_vcpu_events, which allows userspace to distinguish between pending
  and injected exceptions.
  
  
-* For the new DR6 bits, note that bit 16 is set iff the #DB exception
-  will clear DR6.RTM.
+.. [#] For the new DR6 bits, note that bit 16 is set iff the #DB exception
+       will clear DR6.RTM.
  
  7.18 KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2
  
-Architectures: x86, arm, arm64, mips
-Parameters: args[0] whether feature should be enabled or not
+:Architectures: x86, arm, arm64, mips
+:Parameters: args[0] whether feature should be enabled or not
  
  With this capability enabled, KVM_GET_DIRTY_LOG will not automatically
  clear and write-protect all pages that are returned as dirty.
@@ -5181,14 +5732,15 @@ KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 signals that those bugs are fixed.
  Userspace should not try to use KVM_CAP_MANUAL_DIRTY_LOG_PROTECT.
  
  8. Other capabilities.
-----------------------
+======================
  
  This section lists capabilities that give information about other
  features of the KVM implementation.
  
  8.1 KVM_CAP_PPC_HWRNG
+---------------------
  
-Architectures: ppc
+:Architectures: ppc
  
  This capability, if KVM_CHECK_EXTENSION indicates that it is
  available, means that the kernel has an implementation of the
@@ -5197,8 +5749,10 @@ If present, the kernel H_RANDOM handler can be enabled for guest use
  with the KVM_CAP_PPC_ENABLE_HCALL capability.
  
  8.2 KVM_CAP_HYPERV_SYNIC
+------------------------
+
+:Architectures: x86
  
-Architectures: x86
  This capability, if KVM_CHECK_EXTENSION indicates that it is
  available, means that the kernel has an implementation of the
  Hyper-V Synthetic interrupt controller(SynIC). Hyper-V SynIC is
@@ -5210,8 +5764,9 @@ will disable the use of APIC hardware virtualization even if supported
  by the CPU, as it's incompatible with SynIC auto-EOI behavior.
  
  8.3 KVM_CAP_PPC_RADIX_MMU
+-------------------------
  
-Architectures: ppc
+:Architectures: ppc
  
  This capability, if KVM_CHECK_EXTENSION indicates that it is
  available, means that the kernel can support guests using the
@@ -5219,8 +5774,9 @@ radix MMU defined in Power ISA V3.00 (as implemented in the POWER9
  processor).
  
  8.4 KVM_CAP_PPC_HASH_MMU_V3
+---------------------------
  
-Architectures: ppc
+:Architectures: ppc
  
  This capability, if KVM_CHECK_EXTENSION indicates that it is
  available, means that the kernel can support guests using the
@@ -5228,8 +5784,9 @@ hashed page table MMU defined in Power ISA V3.00 (as implemented in
  the POWER9 processor), including in-memory segment tables.
  
  8.5 KVM_CAP_MIPS_VZ
+-------------------
  
-Architectures: mips
+:Architectures: mips
  
  This capability, if KVM_CHECK_EXTENSION on the main kvm handle indicates that
  it is available, means that full hardware assisted virtualization capabilities
@@ -5247,16 +5804,19 @@ values (see below). All other values are reserved. This is to allow for the
  possibility of other hardware assisted virtualization implementations which
  may be incompatible with the MIPS VZ ASE.
  
- 0: The trap & emulate implementation is in use to run guest code in user
+==  ==========================================================================
+ 0  The trap & emulate implementation is in use to run guest code in user
      mode. Guest virtual memory segments are rearranged to fit the guest in the
      user mode address space.
  
- 1: The MIPS VZ ASE is in use, providing full hardware assisted
+ 1  The MIPS VZ ASE is in use, providing full hardware assisted
      virtualization, including standard guest virtual memory segments.
+==  ==========================================================================
  
  8.6 KVM_CAP_MIPS_TE
+-------------------
  
-Architectures: mips
+:Architectures: mips
  
  This capability, if KVM_CHECK_EXTENSION on the main kvm handle indicates that
  it is available, means that the trap & emulate implementation is available to
@@ -5268,8 +5828,9 @@ If KVM_CHECK_EXTENSION on a kvm VM handle indicates that this capability is
  available, it means that the VM is using trap & emulate.
  
  8.7 KVM_CAP_MIPS_64BIT
+----------------------
  
-Architectures: mips
+:Architectures: mips
  
  This capability indicates the supported architecture type of the guest, i.e. the
  supported register and address width.
@@ -5279,22 +5840,26 @@ kvm VM handle correspond roughly to the CP0_Config.AT register field, and should
  be checked specifically against known values (see below). All other values are
  reserved.
  
- 0: MIPS32 or microMIPS32.
+==  ========================================================================
+ 0  MIPS32 or microMIPS32.
      Both registers and addresses are 32-bits wide.
      It will only be possible to run 32-bit guest code.
  
- 1: MIPS64 or microMIPS64 with access only to 32-bit compatibility segments.
+ 1  MIPS64 or microMIPS64 with access only to 32-bit compatibility segments.
      Registers are 64-bits wide, but addresses are 32-bits wide.
      64-bit guest code may run but cannot access MIPS64 memory segments.
      It will also be possible to run 32-bit guest code.
  
- 2: MIPS64 or microMIPS64 with access to all address segments.
+ 2  MIPS64 or microMIPS64 with access to all address segments.
      Both registers and addresses are 64-bits wide.
      It will be possible to run 64-bit or 32-bit guest code.
+==  ========================================================================
  
  8.9 KVM_CAP_ARM_USER_IRQ
+------------------------
+
+:Architectures: arm, arm64
  
-Architectures: arm, arm64
  This capability, if KVM_CHECK_EXTENSION indicates that it is available, means
  that if userspace creates a VM without an in-kernel interrupt controller, it
  will be notified of changes to the output level of in-kernel emulated devices,
@@ -5321,7 +5886,7 @@ If KVM_CAP_ARM_USER_IRQ is supported, the KVM_CHECK_EXTENSION ioctl returns a
  number larger than 0 indicating which version of this capability is implemented
  and thereby which bits in run->s.regs.device_irq_level can signal values.
  
-Currently the following bits are defined for the device_irq_level bitmap:
+Currently the following bits are defined for the device_irq_level bitmap::
  
    KVM_CAP_ARM_USER_IRQ >= 1:
  
@@ -5334,8 +5899,9 @@ indicated by returning a higher number from KVM_CHECK_EXTENSION and will be
  listed above.
  
  8.10 KVM_CAP_PPC_SMT_POSSIBLE
+-----------------------------
  
-Architectures: ppc
+:Architectures: ppc
  
  Querying this capability returns a bitmap indicating the possible
  virtual SMT modes that can be set using KVM_CAP_PPC_SMT.  If bit N
@@ -5343,8 +5909,9 @@ virtual SMT modes that can be set using KVM_CAP_PPC_SMT.  If bit N
  available.
  
  8.11 KVM_CAP_HYPERV_SYNIC2
+--------------------------
  
-Architectures: x86
+:Architectures: x86
  
  This capability enables a newer version of Hyper-V Synthetic interrupt
  controller (SynIC).  The only difference with KVM_CAP_HYPERV_SYNIC is that KVM
@@ -5352,8 +5919,9 @@ doesn't clear SynIC message and event flags pages when they are enabled by
  writing to the respective MSRs.
  
  8.12 KVM_CAP_HYPERV_VP_INDEX
+----------------------------
  
-Architectures: x86
+:Architectures: x86
  
  This capability indicates that userspace can load HV_X64_MSR_VP_INDEX msr.  Its
  value is used to denote the target vcpu for a SynIC interrupt.  For
@@ -5361,47 +5929,53 @@ compatibilty, KVM initializes this msr to KVM's internal vcpu index.  When this
  capability is absent, userspace can still query this msr's value.
  
  8.13 KVM_CAP_S390_AIS_MIGRATION
+-------------------------------
  
-Architectures: s390
-Parameters: none
+:Architectures: s390
+:Parameters: none
  
  This capability indicates if the flic device will be able to get/set the
  AIS states for migration via the KVM_DEV_FLIC_AISM_ALL attribute and allows
  to discover this without having to create a flic device.
  
  8.14 KVM_CAP_S390_PSW
+---------------------
  
-Architectures: s390
+:Architectures: s390
  
  This capability indicates that the PSW is exposed via the kvm_run structure.
  
  8.15 KVM_CAP_S390_GMAP
+----------------------
  
-Architectures: s390
+:Architectures: s390
  
  This capability indicates that the user space memory used as guest mapping can
  be anywhere in the user memory address space, as long as the memory slots are
  aligned and sized to a segment (1MB) boundary.
  
  8.16 KVM_CAP_S390_COW
+---------------------
  
-Architectures: s390
+:Architectures: s390
  
  This capability indicates that the user space memory used as guest mapping can
  use copy-on-write semantics as well as dirty pages tracking via read-only page
  tables.
  
  8.17 KVM_CAP_S390_BPB
+---------------------
  
-Architectures: s390
+:Architectures: s390
  
  This capability indicates that kvm will implement the interfaces to handle
  reset, migration and nested KVM for branch prediction blocking. The stfle
  facility 82 should not be provided to the guest without this capability.
  
  8.18 KVM_CAP_HYPERV_TLBFLUSH
+----------------------------
  
-Architectures: x86
+:Architectures: x86
  
  This capability indicates that KVM supports paravirtualized Hyper-V TLB Flush
  hypercalls:
@@ -5409,8 +5983,9 @@ HvFlushVirtualAddressSpace, HvFlushVirtualAddressSpaceEx,
  HvFlushVirtualAddressList, HvFlushVirtualAddressListEx.
  
  8.19 KVM_CAP_ARM_INJECT_SERROR_ESR
+----------------------------------
  
-Architectures: arm, arm64
+:Architectures: arm, arm64
  
  This capability indicates that userspace can specify (via the
  KVM_SET_VCPU_EVENTS ioctl) the syndrome value reported to the guest when it
@@ -5421,16 +5996,20 @@ CPU when the exception is taken. If this virtual SError is taken to EL1 using
  AArch64, this value will be reported in the ISS field of ESR_ELx.
  
  See KVM_CAP_VCPU_EVENTS for more details.
+
  8.20 KVM_CAP_HYPERV_SEND_IPI
+----------------------------
  
-Architectures: x86
+:Architectures: x86
  
  This capability indicates that KVM supports paravirtualized Hyper-V IPI send
  hypercalls:
  HvCallSendSyntheticClusterIpi, HvCallSendSyntheticClusterIpiEx.
+
  8.21 KVM_CAP_HYPERV_DIRECT_TLBFLUSH
+-----------------------------------
  
-Architecture: x86
+:Architectures: x86
  
  This capability indicates that KVM running on top of Hyper-V hypervisor
  enables Direct TLB flush for its guests meaning that TLB flush
diff --git a/Documentation/virt/kvm/arm/hyp-abi.txt b/Documentation/virt/kvm/arm/hyp-abi.rst
similarity index 79%
rename from Documentation/virt/kvm/arm/hyp-abi.txt
rename to Documentation/virt/kvm/arm/hyp-abi.rst
index a20a0bee268d3c3b834f71dc8956d57d27718adf..d1fc27d848e95d75ec5bada94d93f924fdf4c70c 100644
--- a/Documentation/virt/kvm/arm/hyp-abi.txt
+++ b/Documentation/virt/kvm/arm/hyp-abi.rst
@@ -1,4 +1,8 @@
-* Internal ABI between the kernel and HYP
+.. SPDX-License-Identifier: GPL-2.0
+
+=======================================
+Internal ABI between the kernel and HYP
+=======================================
  
  This file documents the interaction between the Linux kernel and the
  hypervisor layer when running Linux as a hypervisor (for example
@@ -19,25 +23,31 @@ and only act on individual CPUs.
  Unless specified otherwise, any built-in hypervisor must implement
  these functions (see arch/arm{,64}/include/asm/virt.h):
  
-* r0/x0 = HVC_SET_VECTORS
-  r1/x1 = vectors
+* ::
+
+    r0/x0 = HVC_SET_VECTORS
+    r1/x1 = vectors
  
    Set HVBAR/VBAR_EL2 to 'vectors' to enable a hypervisor. 'vectors'
    must be a physical address, and respect the alignment requirements
    of the architecture. Only implemented by the initial stubs, not by
    Linux hypervisors.
  
-* r0/x0 = HVC_RESET_VECTORS
+* ::
+
+    r0/x0 = HVC_RESET_VECTORS
  
    Turn HYP/EL2 MMU off, and reset HVBAR/VBAR_EL2 to the initials
    stubs' exception vector value. This effectively disables an existing
    hypervisor.
  
-* r0/x0 = HVC_SOFT_RESTART
-  r1/x1 = restart address
-  x2 = x0's value when entering the next payload (arm64)
-  x3 = x1's value when entering the next payload (arm64)
-  x4 = x2's value when entering the next payload (arm64)
+* ::
+
+    r0/x0 = HVC_SOFT_RESTART
+    r1/x1 = restart address
+    x2 = x0's value when entering the next payload (arm64)
+    x3 = x1's value when entering the next payload (arm64)
+    x4 = x2's value when entering the next payload (arm64)
  
    Mask all exceptions, disable the MMU, move the arguments into place
    (arm64 only), and jump to the restart address while at HYP/EL2. This
diff --git a/Documentation/virt/kvm/arm/index.rst b/Documentation/virt/kvm/arm/index.rst
new file mode 100644
index 0000000..3e2b2ab
--- /dev/null
+++ b/Documentation/virt/kvm/arm/index.rst
@@ -0,0 +1,12 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===
+ARM
+===
+
+.. toctree::
+   :maxdepth: 2
+
+   hyp-abi
+   psci
+   pvtime
diff --git a/Documentation/virt/kvm/arm/psci.txt b/Documentation/virt/kvm/arm/psci.rst
similarity index 60%
rename from Documentation/virt/kvm/arm/psci.txt
rename to Documentation/virt/kvm/arm/psci.rst
index 559586fc9d379921c98a87841792055c15573082..d52c2e83b5b8d16c15c1431605f54c91d2d82a10 100644
--- a/Documentation/virt/kvm/arm/psci.txt
+++ b/Documentation/virt/kvm/arm/psci.rst
@@ -1,3 +1,9 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=========================================
+Power State Coordination Interface (PSCI)
+=========================================
+
  KVM implements the PSCI (Power State Coordination Interface)
  specification in order to provide services such as CPU on/off, reset
  and power-off to the guest.
@@ -30,32 +36,42 @@ The following register is defined:
    - Affects the whole VM (even if the register view is per-vcpu)
  
  * KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
-  Holds the state of the firmware support to mitigate CVE-2017-5715, as
-  offered by KVM to the guest via a HVC call. The workaround is described
-  under SMCCC_ARCH_WORKAROUND_1 in [1].
+    Holds the state of the firmware support to mitigate CVE-2017-5715, as
+    offered by KVM to the guest via a HVC call. The workaround is described
+    under SMCCC_ARCH_WORKAROUND_1 in [1]_.
+
    Accepted values are:
-    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL: KVM does not offer
+
+    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_AVAIL:
+      KVM does not offer
        firmware support for the workaround. The mitigation status for the
        guest is unknown.
-    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL: The workaround HVC call is
+    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_AVAIL:
+      The workaround HVC call is
        available to the guest and required for the mitigation.
-    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_REQUIRED: The workaround HVC call
+    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1_NOT_REQUIRED:
+      The workaround HVC call
        is available to the guest, but it is not needed on this VCPU.
  
  * KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
-  Holds the state of the firmware support to mitigate CVE-2018-3639, as
-  offered by KVM to the guest via a HVC call. The workaround is described
-  under SMCCC_ARCH_WORKAROUND_2 in [1].
+    Holds the state of the firmware support to mitigate CVE-2018-3639, as
+    offered by KVM to the guest via a HVC call. The workaround is described
+    under SMCCC_ARCH_WORKAROUND_2 in [1]_.
+
    Accepted values are:
-    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL: A workaround is not
+
+    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL:
+      A workaround is not
        available. KVM does not offer firmware support for the workaround.
-    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN: The workaround state is
+    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_UNKNOWN:
+      The workaround state is
        unknown. KVM does not offer firmware support for the workaround.
-    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL: The workaround is available,
+    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_AVAIL:
+      The workaround is available,
        and can be disabled by a vCPU. If
        KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_ENABLED is set, it is active for
        this vCPU.
-    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED: The workaround is
-      always active on this vCPU or it is not needed.
+    KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_REQUIRED:
+      The workaround is always active on this vCPU or it is not needed.
  
-[1] https://developer.arm.com/-/media/developer/pdf/ARM_DEN_0070A_Firmware_interfaces_for_mitigating_CVE-2017-5715.pdf
+.. [1] https://developer.arm.com/-/media/developer/pdf/ARM_DEN_0070A_Firmware_interfaces_for_mitigating_CVE-2017-5715.pdf
diff --git a/Documentation/virt/kvm/devices/arm-vgic-its.txt b/Documentation/virt/kvm/devices/arm-vgic-its.rst
similarity index 71%
rename from Documentation/virt/kvm/devices/arm-vgic-its.txt
rename to Documentation/virt/kvm/devices/arm-vgic-its.rst
index eeaa95b893a89b7a94f66ad7750b88de69090ccc..6c304fd2b1b488a786a213944fb9846d8c191e52 100644
--- a/Documentation/virt/kvm/devices/arm-vgic-its.txt
+++ b/Documentation/virt/kvm/devices/arm-vgic-its.rst
@@ -1,3 +1,6 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===============================================
  ARM Virtual Interrupt Translation Service (ITS)
  ===============================================
  
@@ -12,22 +15,32 @@ There can be multiple ITS controllers per guest, each of them has to have
  a separate, non-overlapping MMIO region.
  
  
-Groups:
-  KVM_DEV_ARM_VGIC_GRP_ADDR
+Groups
+======
+
+KVM_DEV_ARM_VGIC_GRP_ADDR
+-------------------------
+
    Attributes:
      KVM_VGIC_ITS_ADDR_TYPE (rw, 64-bit)
        Base address in the guest physical address space of the GICv3 ITS
        control register frame.
        This address needs to be 64K aligned and the region covers 128K.
+
    Errors:
-    -E2BIG:  Address outside of addressable IPA range
-    -EINVAL: Incorrectly aligned address
-    -EEXIST: Address already configured
-    -EFAULT: Invalid user pointer for attr->addr.
-    -ENODEV: Incorrect attribute or the ITS is not supported.
  
+    =======  =================================================
+    -E2BIG   Address outside of addressable IPA range
+    -EINVAL  Incorrectly aligned address
+    -EEXIST  Address already configured
+    -EFAULT  Invalid user pointer for attr->addr.
+    -ENODEV  Incorrect attribute or the ITS is not supported.
+    =======  =================================================
+
+
+KVM_DEV_ARM_VGIC_GRP_CTRL
+-------------------------
  
-  KVM_DEV_ARM_VGIC_GRP_CTRL
    Attributes:
      KVM_DEV_ARM_VGIC_CTRL_INIT
        request the initialization of the ITS, no additional parameter in
@@ -58,16 +71,21 @@ Groups:
        "ITS Restore Sequence".
  
    Errors:
-    -ENXIO:  ITS not properly configured as required prior to setting
+
+    =======  ==========================================================
+     -ENXIO  ITS not properly configured as required prior to setting
               this attribute
-    -ENOMEM: Memory shortage when allocating ITS internal data
-    -EINVAL: Inconsistent restored data
-    -EFAULT: Invalid guest ram access
-    -EBUSY:  One or more VCPUS are running
-    -EACCES: The virtual ITS is backed by a physical GICv4 ITS, and the
+    -ENOMEM  Memory shortage when allocating ITS internal data
+    -EINVAL  Inconsistent restored data
+    -EFAULT  Invalid guest ram access
+    -EBUSY   One or more VCPUS are running
+    -EACCES  The virtual ITS is backed by a physical GICv4 ITS, and the
              state is not available
+    =======  ==========================================================
+
+KVM_DEV_ARM_VGIC_GRP_ITS_REGS
+-----------------------------
  
-  KVM_DEV_ARM_VGIC_GRP_ITS_REGS
    Attributes:
        The attr field of kvm_device_attr encodes the offset of the
        ITS register, relative to the ITS control frame base address
@@ -78,6 +96,7 @@ Groups:
        be accessed with full length.
  
        Writes to read-only registers are ignored by the kernel except for:
+
        - GITS_CREADR. It must be restored otherwise commands in the queue
          will be re-executed after restoring CWRITER. GITS_CREADR must be
          restored before restoring the GITS_CTLR which is likely to enable the
@@ -91,30 +110,36 @@ Groups:
  
        For other registers, getting or setting a register has the same
        effect as reading/writing the register on real hardware.
+
    Errors:
-    -ENXIO: Offset does not correspond to any supported register
-    -EFAULT: Invalid user pointer for attr->addr
-    -EINVAL: Offset is not 64-bit aligned
-    -EBUSY: one or more VCPUS are running
  
- ITS Restore Sequence:
- -------------------------
+    =======  ====================================================
+    -ENXIO   Offset does not correspond to any supported register
+    -EFAULT  Invalid user pointer for attr->addr
+    -EINVAL  Offset is not 64-bit aligned
+    -EBUSY   one or more VCPUS are running
+    =======  ====================================================
+
+ITS Restore Sequence:
+---------------------
  
  The following ordering must be followed when restoring the GIC and the ITS:
+
  a) restore all guest memory and create vcpus
  b) restore all redistributors
  c) provide the ITS base address
     (KVM_DEV_ARM_VGIC_GRP_ADDR)
  d) restore the ITS in the following order:
-   1. Restore GITS_CBASER
-   2. Restore all other GITS_ registers, except GITS_CTLR!
-   3. Load the ITS table data (KVM_DEV_ARM_ITS_RESTORE_TABLES)
-   4. Restore GITS_CTLR
+
+     1. Restore GITS_CBASER
+     2. Restore all other ``GITS_`` registers, except GITS_CTLR!
+     3. Load the ITS table data (KVM_DEV_ARM_ITS_RESTORE_TABLES)
+     4. Restore GITS_CTLR
  
  Then vcpus can be started.
  
- ITS Table ABI REV0:
- -------------------
+ITS Table ABI REV0:
+-------------------
  
   Revision 0 of the ABI only supports the features of a virtual GICv3, and does
   not support a virtual GICv4 with support for direct injection of virtual
@@ -125,12 +150,13 @@ Then vcpus can be started.
   entries in the collection are listed in no particular order.
   All entries are 8 bytes.
  
- Device Table Entry (DTE):
+ Device Table Entry (DTE)::
  
- bits:     | 63| 62 ... 49 | 48 ... 5 | 4 ... 0 |
- values:   | V |   next    | ITT_addr |  Size   |
+   bits:     | 63| 62 ... 49 | 48 ... 5 | 4 ... 0 |
+   values:   | V |   next    | ITT_addr |  Size   |
+
+ where:
  
- where;
   - V indicates whether the entry is valid. If not, other fields
     are not meaningful.
   - next: equals to 0 if this entry is the last one; otherwise it
@@ -140,32 +166,34 @@ Then vcpus can be started.
   - Size specifies the supported number of bits for the EventID,
     minus one
  
- Collection Table Entry (CTE):
+ Collection Table Entry (CTE)::
  
- bits:     | 63| 62 ..  52  | 51 ... 16 | 15  ...   0 |
- values:   | V |    RES0    |  RDBase   |    ICID     |
+   bits:     | 63| 62 ..  52  | 51 ... 16 | 15  ...   0 |
+   values:   | V |    RES0    |  RDBase   |    ICID     |
  
   where:
+
   - V indicates whether the entry is valid. If not, other fields are
     not meaningful.
   - RES0: reserved field with Should-Be-Zero-or-Preserved behavior.
   - RDBase is the PE number (GICR_TYPER.Processor_Number semantic),
   - ICID is the collection ID
  
- Interrupt Translation Entry (ITE):
+ Interrupt Translation Entry (ITE)::
  
- bits:     | 63 ... 48 | 47 ... 16 | 15 ... 0 |
- values:   |    next   |   pINTID  |  ICID    |
+   bits:     | 63 ... 48 | 47 ... 16 | 15 ... 0 |
+   values:   |    next   |   pINTID  |  ICID    |
  
   where:
+
   - next: equals to 0 if this entry is the last one; otherwise it corresponds
     to the EventID offset to the next ITE capped by 2^16 -1.
   - pINTID is the physical LPI ID; if zero, it means the entry is not valid
     and other fields are not meaningful.
   - ICID is the collection ID
  
- ITS Reset State:
- ----------------
+ITS Reset State:
+----------------
  
  RESET returns the ITS to the same state that it was when first created and
  initialized. When the RESET command returns, the following things are
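Review note on the arm-vgic-its.rst hunks above: the ABI REV0 table entries are 8-byte values whose bit layouts are given by the DTE, CTE and ITE diagrams. A minimal sketch of how those layouts split into fields follows. The helper and class names are hypothetical, chosen only for illustration; the sketch returns raw field values and does not attempt to interpret them (for example, it does not convert the ITT_addr field back into an address).

```python
from typing import NamedTuple


def bits(value: int, hi: int, lo: int) -> int:
    """Return bits [hi:lo] of a 64-bit value, shifted down to bit 0."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


class DeviceTableEntry(NamedTuple):
    valid: bool        # bit 63
    next_offset: int   # bits 62..49
    itt_addr: int      # bits 48..5 (raw field value)
    size: int          # bits 4..0 (EventID bits, minus one)


class CollectionTableEntry(NamedTuple):
    valid: bool        # bit 63
    rdbase: int        # bits 51..16
    icid: int          # bits 15..0


class InterruptTranslationEntry(NamedTuple):
    next_offset: int   # bits 63..48
    pintid: int        # bits 47..16 (zero means the entry is not valid)
    icid: int          # bits 15..0


def decode_dte(raw: int) -> DeviceTableEntry:
    return DeviceTableEntry(
        bool(bits(raw, 63, 63)), bits(raw, 62, 49), bits(raw, 48, 5), bits(raw, 4, 0)
    )


def decode_cte(raw: int) -> CollectionTableEntry:
    # Bits 62..52 are RES0 and are deliberately not exposed.
    return CollectionTableEntry(
        bool(bits(raw, 63, 63)), bits(raw, 51, 16), bits(raw, 15, 0)
    )


def decode_ite(raw: int) -> InterruptTranslationEntry:
    return InterruptTranslationEntry(
        bits(raw, 63, 48), bits(raw, 47, 16), bits(raw, 15, 0)
    )
```

Keeping a single `bits(value, hi, lo)` helper lets each decoder read the same way as the diagram it mirrors, which makes mismatches easy to spot in review.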
diff --git a/Documentation/virt/kvm/devices/arm-vgic-v3.txt b/Documentation/virt/kvm/devices/arm-vgic-v3.rst
similarity index 77%
rename from Documentation/virt/kvm/devices/arm-vgic-v3.txt
rename to Documentation/virt/kvm/devices/arm-vgic-v3.rst
index ff290b43c8e513175990d76337653299f87d7a93..5dd3bff519783c36381877cca0c27286ec2b9c01 100644
--- a/Documentation/virt/kvm/devices/arm-vgic-v3.txt
+++ b/Documentation/virt/kvm/devices/arm-vgic-v3.rst
@@ -1,9 +1,12 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==============================================================
  ARM Virtual Generic Interrupt Controller v3 and later (VGICv3)
  ==============================================================
  
  
  Device types supported:
-  KVM_DEV_TYPE_ARM_VGIC_V3     ARM Generic Interrupt Controller v3.0
+  - KVM_DEV_TYPE_ARM_VGIC_V3     ARM Generic Interrupt Controller v3.0
  
  Only one VGIC instance may be instantiated through this API.  The created VGIC
  will act as the VM interrupt controller, requiring emulated user-space devices
@@ -15,7 +18,8 @@ Creating a guest GICv3 device requires a host GICv3 as well.
  
  Groups:
    KVM_DEV_ARM_VGIC_GRP_ADDR
-  Attributes:
+   Attributes:
+
      KVM_VGIC_V3_ADDR_TYPE_DIST (rw, 64-bit)
        Base address in the guest physical address space of the GICv3 distributor
        register mappings. Only valid for KVM_DEV_TYPE_ARM_VGIC_V3.
@@ -29,21 +33,25 @@ Groups:
        This address needs to be 64K aligned.
  
      KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION (rw, 64-bit)
-      The attribute data pointed to by kvm_device_attr.addr is a __u64 value:
-      bits:     | 63   ....  52  |  51   ....   16 | 15 - 12  |11 - 0
-      values:   |     count      |       base      |  flags   | index
+      The attribute data pointed to by kvm_device_attr.addr is a __u64 value::
+
+        bits:     | 63   ....  52  |  51   ....   16 | 15 - 12  |11 - 0
+        values:   |     count      |       base      |  flags   | index
+
        - index encodes the unique redistributor region index
        - flags: reserved for future use, currently 0
        - base field encodes bits [51:16] of the guest physical base address
          of the first redistributor in the region.
        - count encodes the number of redistributors in the region. Must be
          greater than 0.
+
        There are two 64K pages for each redistributor in the region and
        redistributors are laid out contiguously within the region. Regions
        are filled with redistributors in the index order. The sum of all
        region count fields must be greater than or equal to the number of
        VCPUs. Redistributor regions must be registered in the incremental
        index order, starting from index 0.
+
        The characteristics of a specific redistributor region can be read
        by presetting the index field in the attr data.
        Only valid for KVM_DEV_TYPE_ARM_VGIC_V3.
@@ -52,23 +60,27 @@ Groups:
    KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION attributes.
  
    Errors:
-    -E2BIG:  Address outside of addressable IPA range
-    -EINVAL: Incorrectly aligned address, bad redistributor region
+
+    =======  =============================================================
+    -E2BIG   Address outside of addressable IPA range
+    -EINVAL  Incorrectly aligned address, bad redistributor region
               count/index, mixed redistributor region attribute usage
-    -EEXIST: Address already configured
-    -ENOENT: Attempt to read the characteristics of a non existing
+    -EEXIST  Address already configured
+    -ENOENT  Attempt to read the characteristics of a non existing
               redistributor region
-    -ENXIO:  The group or attribute is unknown/unsupported for this device
+    -ENXIO   The group or attribute is unknown/unsupported for this device
               or hardware support is missing.
-    -EFAULT: Invalid user pointer for attr->addr.
+    -EFAULT  Invalid user pointer for attr->addr.
+    =======  =============================================================
+
  
+  KVM_DEV_ARM_VGIC_GRP_DIST_REGS, KVM_DEV_ARM_VGIC_GRP_REDIST_REGS
+   Attributes:
  
-  KVM_DEV_ARM_VGIC_GRP_DIST_REGS
-  KVM_DEV_ARM_VGIC_GRP_REDIST_REGS
-  Attributes:
-    The attr field of kvm_device_attr encodes two values:
-    bits:     | 63   ....  32  |  31   ....    0 |
-    values:   |      mpidr     |      offset     |
+    The attr field of kvm_device_attr encodes two values::
+
+      bits:     | 63   ....  32  |  31   ....    0 |
+      values:   |      mpidr     |      offset     |
  
      All distributor regs are (rw, 32-bit) and kvm_device_attr.addr points to a
      __u32 value.  64-bit registers must be accessed by separately accessing the
@@ -93,7 +105,8 @@ Groups:
      redistributor is accessed.  The mpidr is ignored for the distributor.
  
      The mpidr encoding is based on the affinity information in the
-    architecture defined MPIDR, and the field is encoded as follows:
+    architecture defined MPIDR, and the field is encoded as follows::
+
        | 63 .... 56 | 55 .... 48 | 47 .... 40 | 39 .... 32 |
        |    Aff3    |    Aff2    |    Aff1    |    Aff0    |
  
@@ -148,24 +161,30 @@ Groups:
      ignored.
  
    Errors:
-    -ENXIO: Getting or setting this register is not yet supported
-    -EBUSY: One or more VCPUs are running
+
+    ======  =====================================================
+    -ENXIO  Getting or setting this register is not yet supported
+    -EBUSY  One or more VCPUs are running
+    ======  =====================================================
  
  
    KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS
-  Attributes:
-    The attr field of kvm_device_attr encodes two values:
-    bits:     | 63      ....       32 | 31  ....  16 | 15  ....  0 |
-    values:   |         mpidr         |      RES     |    instr    |
+   Attributes:
+
+    The attr field of kvm_device_attr encodes two values::
+
+      bits:     | 63      ....       32 | 31  ....  16 | 15  ....  0 |
+      values:   |         mpidr         |      RES     |    instr    |
  
      The mpidr field encodes the CPU ID based on the affinity information in the
-    architecture defined MPIDR, and the field is encoded as follows:
+    architecture defined MPIDR, and the field is encoded as follows::
+
        | 63 .... 56 | 55 .... 48 | 47 .... 40 | 39 .... 32 |
        |    Aff3    |    Aff2    |    Aff1    |    Aff0    |
  
      The instr field encodes the system register to access based on the fields
      defined in the A64 instruction set encoding for system register access
-    (RES means the bits are reserved for future use and should be zero):
+    (RES means the bits are reserved for future use and should be zero)::
  
        | 15 ... 14 | 13 ... 11 | 10 ... 7 | 6 ... 3 | 2 ... 0 |
        |   Op 0    |    Op1    |    CRn   |   CRm   |   Op2   |
@@ -178,26 +197,35 @@ Groups:
  
      CPU interface registers access is not implemented for AArch32 mode.
      Error -ENXIO is returned when accessed in AArch32 mode.
+
    Errors:
-    -ENXIO: Getting or setting this register is not yet supported
-    -EBUSY: VCPU is running
-    -EINVAL: Invalid mpidr or register value supplied
+
+    =======  =====================================================
+    -ENXIO   Getting or setting this register is not yet supported
+    -EBUSY   VCPU is running
+    -EINVAL  Invalid mpidr or register value supplied
+    =======  =====================================================
  
  
    KVM_DEV_ARM_VGIC_GRP_NR_IRQS
-  Attributes:
+   Attributes:
+
      A value describing the number of interrupts (SGI, PPI and SPI) for
      this GIC instance, ranging from 64 to 1024, in increments of 32.
  
      kvm_device_attr.addr points to a __u32 value.
  
    Errors:
-    -EINVAL: Value set is out of the expected range
-    -EBUSY: Value has already be set.
+
+    =======  ======================================
+    -EINVAL  Value set is out of the expected range
+    -EBUSY   Value has already been set.
+    =======  ======================================
  
  
    KVM_DEV_ARM_VGIC_GRP_CTRL
-  Attributes:
+   Attributes:
+
      KVM_DEV_ARM_VGIC_CTRL_INIT
        request the initialization of the VGIC, no additional parameter in
        kvm_device_attr.addr.
@@ -205,20 +233,26 @@ Groups:
        save all LPI pending bits into guest RAM pending tables.
  
        The first kB of the pending table is not altered by this operation.
+
    Errors:
-    -ENXIO: VGIC not properly configured as required prior to calling
-     this attribute
-    -ENODEV: no online VCPU
-    -ENOMEM: memory shortage when allocating vgic internal data
-    -EFAULT: Invalid guest ram access
-    -EBUSY:  One or more VCPUS are running
+
+    =======  ========================================================
+    -ENXIO   VGIC not properly configured as required prior to calling
+             this attribute
+    -ENODEV  no online VCPU
+    -ENOMEM  memory shortage when allocating vgic internal data
+    -EFAULT  Invalid guest ram access
+    -EBUSY   One or more VCPUS are running
+    =======  ========================================================
  
  
    KVM_DEV_ARM_VGIC_GRP_LEVEL_INFO
-  Attributes:
-    The attr field of kvm_device_attr encodes the following values:
-    bits:     | 63      ....       32 | 31   ....    10 | 9  ....  0 |
-    values:   |         mpidr         |      info       |   vINTID   |
+   Attributes:
+
+    The attr field of kvm_device_attr encodes the following values::
+
+      bits:     | 63      ....       32 | 31   ....    10 | 9  ....  0 |
+      values:   |         mpidr         |      info       |   vINTID   |
  
      The vINTID specifies which set of IRQs is reported on.
  
@@ -228,6 +262,7 @@ Groups:
        VGIC_LEVEL_INFO_LINE_LEVEL:
         Get/Set the input level of the IRQ line for a set of 32 contiguously
         numbered interrupts.
+
         vINTID must be a multiple of 32.
  
         kvm_device_attr.addr points to a __u32 value which will contain a
@@ -243,9 +278,14 @@ Groups:
      reported with the same value regardless of the mpidr specified.
  
      The mpidr field encodes the CPU ID based on the affinity information in the
-    architecture defined MPIDR, and the field is encoded as follows:
+    architecture defined MPIDR, and the field is encoded as follows::
+
        | 63 .... 56 | 55 .... 48 | 47 .... 40 | 39 .... 32 |
        |    Aff3    |    Aff2    |    Aff1    |    Aff0    |
+
    Errors:
-    -EINVAL: vINTID is not multiple of 32 or
-     info field is not VGIC_LEVEL_INFO_LINE_LEVEL
+
+    =======  =============================================
+    -EINVAL  vINTID is not a multiple of 32 or info field
+             is not VGIC_LEVEL_INFO_LINE_LEVEL
+    =======  =============================================
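Review note on the arm-vgic-v3.rst hunks above: several attributes pack multiple fields into one 64-bit value. The sketch below shows how the register-access `attr` (mpidr in bits 63..32, offset in bits 31..0) and the `KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION` value (count, base, flags, index) could be assembled. The function names are hypothetical, and the range checks are assumptions of this sketch derived from the field widths in the diagrams; the kernel's own validation may differ.

```python
def pack_mpidr(aff3: int, aff2: int, aff1: int, aff0: int) -> int:
    """Pack the four affinity levels into the 32-bit mpidr field.

    Once placed in bits 63..32 of attr, Aff3 occupies bits 63..56 and
    Aff0 occupies bits 39..32, as in the diagram.
    """
    for level in (aff3, aff2, aff1, aff0):
        if not 0 <= level <= 0xFF:
            raise ValueError("each affinity level is 8 bits wide")
    return (aff3 << 24) | (aff2 << 16) | (aff1 << 8) | aff0


def reg_attr(mpidr_field: int, offset: int) -> int:
    """Build attr for the DIST_REGS / REDIST_REGS groups."""
    if not 0 <= mpidr_field <= 0xFFFFFFFF:
        raise ValueError("mpidr field is 32 bits wide")
    if not 0 <= offset <= 0xFFFFFFFF:
        raise ValueError("offset is 32 bits wide")
    return (mpidr_field << 32) | offset


def redist_region(count: int, base: int, flags: int = 0, index: int = 0) -> int:
    """Build the __u64 value for a redistributor region.

    base is the guest physical address; only bits [51:16] are encodable,
    so this sketch rejects addresses with other bits set.
    """
    if not 1 <= count <= 0xFFF:
        raise ValueError("count must be greater than 0 and fit in 12 bits")
    if base & ~(((1 << 36) - 1) << 16):
        raise ValueError("base must only use bits [51:16]")
    if not 0 <= flags <= 0xF:
        raise ValueError("flags field is 4 bits wide")
    if not 0 <= index <= 0xFFF:
        raise ValueError("index field is 12 bits wide")
    return (count << 52) | base | (flags << 12) | index
```

For example, `reg_attr(pack_mpidr(0, 0, 1, 2), 0x100)` selects offset 0x100 for the redistributor of the CPU with Aff1 = 1 and Aff0 = 2.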
diff --git a/Documentation/virt/kvm/devices/arm-vgic.txt b/Documentation/virt/kvm/devices/arm-vgic.rst
similarity index 66%
rename from Documentation/virt/kvm/devices/arm-vgic.txt
rename to Documentation/virt/kvm/devices/arm-vgic.rst
index 97b6518148f87befcc2dab5fb250459607154525..40bdeea1d86e75b9b912f893ff8bf95b892cf839 100644
--- a/Documentation/virt/kvm/devices/arm-vgic.txt
+++ b/Documentation/virt/kvm/devices/arm-vgic.rst
@@ -1,8 +1,12 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==================================================
  ARM Virtual Generic Interrupt Controller v2 (VGIC)
  ==================================================
  
  Device types supported:
-  KVM_DEV_TYPE_ARM_VGIC_V2     ARM Generic Interrupt Controller v2.0
+
+  - KVM_DEV_TYPE_ARM_VGIC_V2     ARM Generic Interrupt Controller v2.0
  
  Only one VGIC instance may be instantiated through either this API or the
  legacy KVM_CREATE_IRQCHIP API.  The created VGIC will act as the VM interrupt
@@ -17,7 +21,8 @@ create both a GICv3 and GICv2 device on the same VM.
  
  Groups:
    KVM_DEV_ARM_VGIC_GRP_ADDR
-  Attributes:
+   Attributes:
+
      KVM_VGIC_V2_ADDR_TYPE_DIST (rw, 64-bit)
        Base address in the guest physical address space of the GIC distributor
        register mappings. Only valid for KVM_DEV_TYPE_ARM_VGIC_V2.
@@ -27,19 +32,25 @@ Groups:
        Base address in the guest physical address space of the GIC virtual cpu
        interface register mappings. Only valid for KVM_DEV_TYPE_ARM_VGIC_V2.
        This address needs to be 4K aligned and the region covers 4 KByte.
+
    Errors:
-    -E2BIG:  Address outside of addressable IPA range
-    -EINVAL: Incorrectly aligned address
-    -EEXIST: Address already configured
-    -ENXIO:  The group or attribute is unknown/unsupported for this device
+
+    =======  =============================================================
+    -E2BIG   Address outside of addressable IPA range
+    -EINVAL  Incorrectly aligned address
+    -EEXIST  Address already configured
+    -ENXIO   The group or attribute is unknown/unsupported for this device
               or hardware support is missing.
-    -EFAULT: Invalid user pointer for attr->addr.
+    -EFAULT  Invalid user pointer for attr->addr.
+    =======  =============================================================
  
    KVM_DEV_ARM_VGIC_GRP_DIST_REGS
-  Attributes:
-    The attr field of kvm_device_attr encodes two values:
-    bits:     | 63   ....  40 | 39 ..  32  |  31   ....    0 |
-    values:   |    reserved   | vcpu_index |      offset     |
+   Attributes:
+
+    The attr field of kvm_device_attr encodes two values::
+
+      bits:     | 63   ....  40 | 39 ..  32  |  31   ....    0 |
+      values:   |    reserved   | vcpu_index |      offset     |
  
      All distributor regs are (rw, 32-bit)
  
@@ -58,16 +69,22 @@ Groups:
      KVM_DEV_ARM_VGIC_GRP_DIST_REGS and KVM_DEV_ARM_VGIC_GRP_CPU_REGS) to ensure
      the expected behavior. Unless GICD_IIDR has been set from userspace, writes
      to the interrupt group registers (GICD_IGROUPR) are ignored.
+
    Errors:
-    -ENXIO: Getting or setting this register is not yet supported
-    -EBUSY: One or more VCPUs are running
-    -EINVAL: Invalid vcpu_index supplied
+
+    =======  =====================================================
+    -ENXIO   Getting or setting this register is not yet supported
+    -EBUSY   One or more VCPUs are running
+    -EINVAL  Invalid vcpu_index supplied
+    =======  =====================================================
  
    KVM_DEV_ARM_VGIC_GRP_CPU_REGS
-  Attributes:
-    The attr field of kvm_device_attr encodes two values:
-    bits:     | 63   ....  40 | 39 ..  32  |  31   ....    0 |
-    values:   |    reserved   | vcpu_index |      offset     |
+   Attributes:
+
+    The attr field of kvm_device_attr encodes two values::
+
+      bits:     | 63   ....  40 | 39 ..  32  |  31   ....    0 |
+      values:   |    reserved   | vcpu_index |      offset     |
  
      All CPU interface regs are (rw, 32-bit)
  
@@ -101,27 +118,39 @@ Groups:
      value left by 3 places to obtain the actual priority mask level.
  
    Errors:
-    -ENXIO: Getting or setting this register is not yet supported
-    -EBUSY: One or more VCPUs are running
-    -EINVAL: Invalid vcpu_index supplied
+
+    =======  =====================================================
+    -ENXIO   Getting or setting this register is not yet supported
+    -EBUSY   One or more VCPUs are running
+    -EINVAL  Invalid vcpu_index supplied
+    =======  =====================================================
  
    KVM_DEV_ARM_VGIC_GRP_NR_IRQS
-  Attributes:
+   Attributes:
+
      A value describing the number of interrupts (SGI, PPI and SPI) for
      this GIC instance, ranging from 64 to 1024, in increments of 32.
  
    Errors:
-    -EINVAL: Value set is out of the expected range
-    -EBUSY: Value has already be set, or GIC has already been initialized
-            with default values.
+
+    =======  ===============================================================
+    -EINVAL  Value set is out of the expected range
+    -EBUSY   Value has already been set, or GIC has already been initialized
+             with default values.
+    =======  ===============================================================
  
    KVM_DEV_ARM_VGIC_GRP_CTRL
-  Attributes:
+   Attributes:
+
      KVM_DEV_ARM_VGIC_CTRL_INIT
        request the initialization of the VGIC or ITS, no additional parameter
        in kvm_device_attr.addr.
+
    Errors:
-    -ENXIO: VGIC not properly configured as required prior to calling
-     this attribute
-    -ENODEV: no online VCPU
-    -ENOMEM: memory shortage when allocating vgic internal data
+
+    =======  =========================================================
+    -ENXIO   VGIC not properly configured as required prior to calling
+             this attribute
+    -ENODEV  no online VCPU
+    -ENOMEM  memory shortage when allocating vgic internal data
+    =======  =========================================================
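Reviewer sketch (not part of the patch): the attr layout quoted in the arm-vgic hunks above puts vcpu_index in bits 39..32 and the register offset in bits 31..0, with bits 63..40 reserved. A minimal illustration of packing and unpacking that value follows. The macro and function names are hypothetical and chosen for this example only; they are not taken from the kernel headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; bit positions follow the layout quoted in the doc. */
#define VGIC_ATTR_VCPU_SHIFT   32
#define VGIC_ATTR_VCPU_MASK    0xffULL        /* bits 39..32 */
#define VGIC_ATTR_OFFSET_MASK  0xffffffffULL  /* bits 31..0  */

/* Pack vcpu_index and register offset; reserved bits 63..40 stay zero. */
static inline uint64_t vgic_pack_attr(uint8_t vcpu_index, uint32_t offset)
{
	return ((uint64_t)vcpu_index << VGIC_ATTR_VCPU_SHIFT) | (uint64_t)offset;
}

/* Extract the vcpu_index field from a packed attr value. */
static inline uint8_t vgic_attr_vcpu_index(uint64_t attr)
{
	return (uint8_t)((attr >> VGIC_ATTR_VCPU_SHIFT) & VGIC_ATTR_VCPU_MASK);
}

/* Extract the register offset field from a packed attr value. */
static inline uint32_t vgic_attr_offset(uint64_t attr)
{
	return (uint32_t)(attr & VGIC_ATTR_OFFSET_MASK);
}

/* Usage sketch: round-trip a value through the helpers. */
static inline void vgic_attr_example(void)
{
	uint64_t attr = vgic_pack_attr(3, 0x100);

	assert(vgic_attr_vcpu_index(attr) == 3);
	assert(vgic_attr_offset(attr) == 0x100);
}
```

The packed value would go into the attr field of kvm_device_attr for the KVM_DEV_ARM_VGIC_GRP_DIST_REGS or KVM_DEV_ARM_VGIC_GRP_CPU_REGS group; the ioctl call itself is omitted here.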
diff --git a/Documentation/virt/kvm/devices/index.rst b/Documentation/virt/kvm/devices/index.rst
new file mode 100644
index 0000000..192cda7
--- /dev/null
+++ b/Documentation/virt/kvm/devices/index.rst
@@ -0,0 +1,19 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=======
+Devices
+=======
+
+.. toctree::
+   :maxdepth: 2
+
+   arm-vgic-its
+   arm-vgic
+   arm-vgic-v3
+   mpic
+   s390_flic
+   vcpu
+   vfio
+   vm
+   xics
+   xive
diff --git a/Documentation/virt/kvm/devices/mpic.txt b/Documentation/virt/kvm/devices/mpic.rst
similarity index 91%
rename from Documentation/virt/kvm/devices/mpic.txt
rename to Documentation/virt/kvm/devices/mpic.rst
index 8257397adc3cc18a0733f58ee7c5b8a89bb0dfbe..55cefe030d41471e20c967adf076d71d13127720 100644
--- a/Documentation/virt/kvm/devices/mpic.txt
+++ b/Documentation/virt/kvm/devices/mpic.rst
@@ -1,9 +1,13 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=========================
  MPIC interrupt controller
  =========================
  
  Device types supported:
-  KVM_DEV_TYPE_FSL_MPIC_20     Freescale MPIC v2.0
-  KVM_DEV_TYPE_FSL_MPIC_42     Freescale MPIC v4.2
+
+  - KVM_DEV_TYPE_FSL_MPIC_20     Freescale MPIC v2.0
+  - KVM_DEV_TYPE_FSL_MPIC_42     Freescale MPIC v4.2
  
  Only one MPIC instance, of any type, may be instantiated.  The created
  MPIC will act as the system interrupt controller, connecting to each
@@ -11,7 +15,8 @@ vcpu's interrupt inputs.
  
  Groups:
    KVM_DEV_MPIC_GRP_MISC
-  Attributes:
+   Attributes:
+
      KVM_DEV_MPIC_BASE_ADDR (rw, 64-bit)
        Base address of the 256 KiB MPIC register space.  Must be
        naturally aligned.  A value of zero disables the mapping.
diff --git a/Documentation/virt/kvm/devices/s390_flic.txt b/Documentation/virt/kvm/devices/s390_flic.rst
similarity index 87%
rename from Documentation/virt/kvm/devices/s390_flic.txt
rename to Documentation/virt/kvm/devices/s390_flic.rst
index a4e20a09017468b2e5a97e7782ad74aab1431de5..954190da7d0413a9ff3b9a6a3f8c5c29760c743e 100644
--- a/Documentation/virt/kvm/devices/s390_flic.txt
+++ b/Documentation/virt/kvm/devices/s390_flic.rst
@@ -1,3 +1,6 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+====================================
  FLIC (floating interrupt controller)
  ====================================
  
@@ -31,8 +34,10 @@ Groups:
      Copies all floating interrupts into a buffer provided by userspace.
      When the buffer is too small it returns -ENOMEM, which is the indication
      for userspace to try again with a bigger buffer.
+
      -ENOBUFS is returned when the allocation of a kernelspace buffer has
      failed.
+
      -EFAULT is returned when copying data to userspace failed.
      All interrupts remain pending, i.e. are not deleted from the list of
      currently pending interrupts.
@@ -60,38 +65,41 @@ Groups:
  
    KVM_DEV_FLIC_ADAPTER_REGISTER
      Register an I/O adapter interrupt source. Takes a kvm_s390_io_adapter
-    describing the adapter to register:
+    describing the adapter to register::
  
-struct kvm_s390_io_adapter {
-       __u32 id;
-       __u8 isc;
-       __u8 maskable;
-       __u8 swap;
-       __u8 flags;
-};
+       struct kvm_s390_io_adapter {
+               __u32 id;
+               __u8 isc;
+               __u8 maskable;
+               __u8 swap;
+               __u8 flags;
+       };
  
     id contains the unique id for the adapter, isc the I/O interruption subclass
     to use, maskable whether this adapter may be masked (interrupts turned off),
     swap whether the indicators need to be byte swapped, and flags contains
     further characteristics of the adapter.
+
     Currently defined values for 'flags' are:
+
     - KVM_S390_ADAPTER_SUPPRESSIBLE: adapter is subject to AIS
       (adapter-interrupt-suppression) facility. This flag only has an effect if
       the AIS capability is enabled.
+
     Unknown flag values are ignored.
  
  
    KVM_DEV_FLIC_ADAPTER_MODIFY
      Modifies attributes of an existing I/O adapter interrupt source. Takes
-    a kvm_s390_io_adapter_req specifying the adapter and the operation:
+    a kvm_s390_io_adapter_req specifying the adapter and the operation::
  
-struct kvm_s390_io_adapter_req {
-       __u32 id;
-       __u8 type;
-       __u8 mask;
-       __u16 pad0;
-       __u64 addr;
-};
+       struct kvm_s390_io_adapter_req {
+               __u32 id;
+               __u8 type;
+               __u8 mask;
+               __u16 pad0;
+               __u64 addr;
+       };
  
      id specifies the adapter and type the operation. The supported operations
      are:
@@ -103,8 +111,9 @@ struct kvm_s390_io_adapter_req {
        perform a gmap translation for the guest address provided in addr,
        pin a userspace page for the translated address and add it to the
        list of mappings
-      Note: A new mapping will be created unconditionally; therefore,
-            the calling code should avoid making duplicate mappings.
+
+      .. note:: A new mapping will be created unconditionally; therefore,
+               the calling code should avoid making duplicate mappings.
  
      KVM_S390_IO_ADAPTER_UNMAP
        release a userspace page for the translated address specified in addr
@@ -112,16 +121,17 @@ struct kvm_s390_io_adapter_req {
  
    KVM_DEV_FLIC_AISM
      modify the adapter-interruption-suppression mode for a given isc if the
-    AIS capability is enabled. Takes a kvm_s390_ais_req describing:
+    AIS capability is enabled. Takes a kvm_s390_ais_req describing::
  
-struct kvm_s390_ais_req {
-       __u8 isc;
-       __u16 mode;
-};
+       struct kvm_s390_ais_req {
+               __u8 isc;
+               __u16 mode;
+       };
  
      isc contains the target I/O interruption subclass, mode the target
      adapter-interruption-suppression mode. The following modes are
      currently supported:
+
      - KVM_S390_AIS_MODE_ALL: ALL-Interruptions Mode, i.e. airq injection
        is always allowed;
      - KVM_S390_AIS_MODE_SINGLE: SINGLE-Interruption Mode, i.e. airq
@@ -139,12 +149,12 @@ struct kvm_s390_ais_req {
  
    KVM_DEV_FLIC_AISM_ALL
      Gets or sets the adapter-interruption-suppression mode for all ISCs. Takes
-    a kvm_s390_ais_all describing:
+    a kvm_s390_ais_all describing::
  
-struct kvm_s390_ais_all {
-       __u8 simm; /* Single-Interruption-Mode mask */
-       __u8 nimm; /* No-Interruption-Mode mask *
-};
+       struct kvm_s390_ais_all {
+              __u8 simm; /* Single-Interruption-Mode mask */
+              __u8 nimm; /* No-Interruption-Mode mask */
+       };
  
      simm contains Single-Interruption-Mode mask for all ISCs, nimm contains
      No-Interruption-Mode mask for all ISCs. Each bit in simm and nimm corresponds
@@ -159,5 +169,5 @@ ENXIO, as specified in the API documentation). It is not possible to conclude
  that a FLIC operation is unavailable based on the error code resulting from a
  usage attempt.
  
-Note: The KVM_DEV_FLIC_CLEAR_IO_IRQ ioctl will return EINVAL in case a zero
-schid is specified.
+.. note:: The KVM_DEV_FLIC_CLEAR_IO_IRQ ioctl will return EINVAL in case a
+         zero schid is specified.
diff --git a/Documentation/virt/kvm/devices/vcpu.rst b/Documentation/virt/kvm/devices/vcpu.rst
new file mode 100644
index 0000000..9963e68
--- /dev/null
+++ b/Documentation/virt/kvm/devices/vcpu.rst
@@ -0,0 +1,114 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+======================
+Generic vcpu interface
+======================
+
+The virtual cpu "device" also accepts the ioctls KVM_SET_DEVICE_ATTR,
+KVM_GET_DEVICE_ATTR, and KVM_HAS_DEVICE_ATTR. The interface uses the same struct
+kvm_device_attr as other devices, but targets VCPU-wide settings and controls.
+
+The groups and attributes per virtual cpu, if any, are architecture specific.
+
+1. GROUP: KVM_ARM_VCPU_PMU_V3_CTRL
+==================================
+
+:Architectures: ARM64
+
+1.1. ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_IRQ
+---------------------------------------
+
+:Parameters: in kvm_device_attr.addr the address for PMU overflow interrupt is a
+            pointer to an int
+
+Returns:
+
+        =======  ========================================================
+        -EBUSY   The PMU overflow interrupt is already set
+        -ENXIO   The overflow interrupt not set when attempting to get it
+        -ENODEV  PMUv3 not supported
+        -EINVAL  Invalid PMU overflow interrupt number supplied or
+                 trying to set the IRQ number without using an in-kernel
+                 irqchip.
+        =======  ========================================================
+
+A value describing the PMUv3 (Performance Monitor Unit v3) overflow interrupt
+number for this vcpu. This interrupt could be a PPI or SPI, but the interrupt
+type must be same for each vcpu. As a PPI, the interrupt number is the same for
+all vcpus, while as an SPI it must be a separate number per vcpu.
+
+1.2 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_INIT
+---------------------------------------
+
+:Parameters: no additional parameter in kvm_device_attr.addr
+
+Returns:
+
+        =======  ======================================================
+        -ENODEV  PMUv3 not supported or GIC not initialized
+        -ENXIO   PMUv3 not properly configured or in-kernel irqchip not
+                 configured as required prior to calling this attribute
+        -EBUSY   PMUv3 already initialized
+        =======  ======================================================
+
+Request the initialization of the PMUv3.  If using the PMUv3 with an in-kernel
+virtual GIC implementation, this must be done after initializing the in-kernel
+irqchip.
+
+
+2. GROUP: KVM_ARM_VCPU_TIMER_CTRL
+=================================
+
+:Architectures: ARM, ARM64
+
+2.1. ATTRIBUTES: KVM_ARM_VCPU_TIMER_IRQ_VTIMER, KVM_ARM_VCPU_TIMER_IRQ_PTIMER
+-----------------------------------------------------------------------------
+
+:Parameters: in kvm_device_attr.addr the address for the timer interrupt is a
+            pointer to an int
+
+Returns:
+
+        =======  =================================
+        -EINVAL  Invalid timer interrupt number
+        -EBUSY   One or more VCPUs has already run
+        =======  =================================
+
+A value describing the architected timer interrupt number when connected to an
+in-kernel virtual GIC.  These must be a PPI (16 <= intid < 32).  Setting the
+attribute overrides the default values (see below).
+
+=============================  ==========================================
+KVM_ARM_VCPU_TIMER_IRQ_VTIMER  The EL1 virtual timer intid (default: 27)
+KVM_ARM_VCPU_TIMER_IRQ_PTIMER  The EL1 physical timer intid (default: 30)
+=============================  ==========================================
+
+Setting the same PPI for different timers will prevent the VCPUs from running.
+Setting the interrupt number on a VCPU configures all VCPUs created at that
+time to use the number provided for a given timer, overwriting any previously
+configured values on other VCPUs.  Userspace should configure the interrupt
+numbers on at least one VCPU after creating all VCPUs and before running any
+VCPUs.
+
+3. GROUP: KVM_ARM_VCPU_PVTIME_CTRL
+==================================
+
+:Architectures: ARM64
+
+3.1 ATTRIBUTE: KVM_ARM_VCPU_PVTIME_IPA
+--------------------------------------
+
+:Parameters: 64-bit base address
+
+Returns:
+
+        =======  ======================================
+        -ENXIO   Stolen time not implemented
+        -EEXIST  Base address already set for this VCPU
+        -EINVAL  Base address not 64 byte aligned
+        =======  ======================================
+
+Specifies the base address of the stolen time structure for this VCPU. The
+base address must be 64 byte aligned and exist within a valid guest memory
+region. See Documentation/virt/kvm/arm/pvtime.txt for more information
+including the layout of the stolen time structure.
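Reviewer sketch (not part of the patch): vcpu.rst above states that the timer interrupts must be PPIs (16 <= intid < 32), that assigning the same PPI to both timers prevents the VCPUs from running, and that the stolen-time base address must be 64 byte aligned. A userspace VMM could check these constraints before issuing KVM_SET_DEVICE_ATTR, roughly as below. The function names are hypothetical, and the checks only mirror the documented rules; they do not reproduce the kernel's own validation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A PPI is an interrupt ID in the range 16 <= intid < 32. */
static inline bool timer_intid_is_ppi(int intid)
{
	return intid >= 16 && intid < 32;
}

/*
 * Both timers need a PPI, and the two must differ: the doc notes that
 * sharing one PPI between timers prevents the VCPUs from running.
 */
static inline bool timer_irqs_valid(int vtimer_intid, int ptimer_intid)
{
	return timer_intid_is_ppi(vtimer_intid) &&
	       timer_intid_is_ppi(ptimer_intid) &&
	       vtimer_intid != ptimer_intid;
}

/* KVM_ARM_VCPU_PVTIME_IPA requires a 64 byte aligned base address. */
static inline bool pvtime_ipa_aligned(uint64_t ipa)
{
	return (ipa & 63) == 0;
}

/* Usage sketch: the documented defaults (27 and 30) pass the check. */
static inline void timer_irqs_example(void)
{
	assert(timer_irqs_valid(27, 30));
}
```

Checking alignment alone does not establish that the address lies within a valid guest memory region, which the doc also requires; that check depends on the VMM's memory map and is left out.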
diff --git a/Documentation/virt/kvm/devices/vcpu.txt b/Documentation/virt/kvm/devices/vcpu.txt
deleted file mode 100644
index 6f3bd64..0000000
--- a/Documentation/virt/kvm/devices/vcpu.txt
+++ /dev/null
@@ -1,76 +0,0 @@
-Generic vcpu interface
-====================================
-
-The virtual cpu "device" also accepts the ioctls KVM_SET_DEVICE_ATTR,
-KVM_GET_DEVICE_ATTR, and KVM_HAS_DEVICE_ATTR. The interface uses the same struct
-kvm_device_attr as other devices, but targets VCPU-wide settings and controls.
-
-The groups and attributes per virtual cpu, if any, are architecture specific.
-
-1. GROUP: KVM_ARM_VCPU_PMU_V3_CTRL
-Architectures: ARM64
-
-1.1. ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_IRQ
-Parameters: in kvm_device_attr.addr the address for PMU overflow interrupt is a
-            pointer to an int
-Returns: -EBUSY: The PMU overflow interrupt is already set
-         -ENXIO: The overflow interrupt not set when attempting to get it
-         -ENODEV: PMUv3 not supported
-         -EINVAL: Invalid PMU overflow interrupt number supplied or
-                  trying to set the IRQ number without using an in-kernel
-                  irqchip.
-
-A value describing the PMUv3 (Performance Monitor Unit v3) overflow interrupt
-number for this vcpu. This interrupt could be a PPI or SPI, but the interrupt
-type must be same for each vcpu. As a PPI, the interrupt number is the same for
-all vcpus, while as an SPI it must be a separate number per vcpu.
-
-1.2 ATTRIBUTE: KVM_ARM_VCPU_PMU_V3_INIT
-Parameters: no additional parameter in kvm_device_attr.addr
-Returns: -ENODEV: PMUv3 not supported or GIC not initialized
-         -ENXIO: PMUv3 not properly configured or in-kernel irqchip not
-                 configured as required prior to calling this attribute
-         -EBUSY: PMUv3 already initialized
-
-Request the initialization of the PMUv3.  If using the PMUv3 with an in-kernel
-virtual GIC implementation, this must be done after initializing the in-kernel
-irqchip.
-
-
-2. GROUP: KVM_ARM_VCPU_TIMER_CTRL
-Architectures: ARM,ARM64
-
-2.1. ATTRIBUTE: KVM_ARM_VCPU_TIMER_IRQ_VTIMER
-2.2. ATTRIBUTE: KVM_ARM_VCPU_TIMER_IRQ_PTIMER
-Parameters: in kvm_device_attr.addr the address for the timer interrupt is a
-            pointer to an int
-Returns: -EINVAL: Invalid timer interrupt number
-         -EBUSY:  One or more VCPUs has already run
-
-A value describing the architected timer interrupt number when connected to an
-in-kernel virtual GIC.  These must be a PPI (16 <= intid < 32).  Setting the
-attribute overrides the default values (see below).
-
-KVM_ARM_VCPU_TIMER_IRQ_VTIMER: The EL1 virtual timer intid (default: 27)
-KVM_ARM_VCPU_TIMER_IRQ_PTIMER: The EL1 physical timer intid (default: 30)
-
-Setting the same PPI for different timers will prevent the VCPUs from running.
-Setting the interrupt number on a VCPU configures all VCPUs created at that
-time to use the number provided for a given timer, overwriting any previously
-configured values on other VCPUs.  Userspace should configure the interrupt
-numbers on at least one VCPU after creating all VCPUs and before running any
-VCPUs.
-
-3. GROUP: KVM_ARM_VCPU_PVTIME_CTRL
-Architectures: ARM64
-
-3.1 ATTRIBUTE: KVM_ARM_VCPU_PVTIME_IPA
-Parameters: 64-bit base address
-Returns: -ENXIO:  Stolen time not implemented
-         -EEXIST: Base address already set for this VCPU
-         -EINVAL: Base address not 64 byte aligned
-
-Specifies the base address of the stolen time structure for this VCPU. The
-base address must be 64 byte aligned and exist within a valid guest memory
-region. See Documentation/virt/kvm/arm/pvtime.txt for more information
-including the layout of the stolen time structure.
diff --git a/Documentation/virt/kvm/devices/vfio.txt b/Documentation/virt/kvm/devices/vfio.rst
similarity index 72%
rename from Documentation/virt/kvm/devices/vfio.txt
rename to Documentation/virt/kvm/devices/vfio.rst
index 528c77c8022c66e846bb8a41cbacd999fcd069dc..2d20dc56106946c4b79c073789a44eaca91dfebf 100644
--- a/Documentation/virt/kvm/devices/vfio.txt
+++ b/Documentation/virt/kvm/devices/vfio.rst
@@ -1,8 +1,12 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===================
  VFIO virtual device
  ===================
  
  Device types supported:
-  KVM_DEV_TYPE_VFIO
+
+  - KVM_DEV_TYPE_VFIO
  
  Only one VFIO instance may be created per VM.  The created device
  tracks VFIO groups in use by the VM and features of those groups
@@ -23,14 +27,15 @@ KVM_DEV_VFIO_GROUP attributes:
         for the VFIO group.
    KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: attaches a guest visible TCE table
         allocated by sPAPR KVM.
-       kvm_device_attr.addr points to a struct:
+       kvm_device_attr.addr points to a struct::
+
+               struct kvm_vfio_spapr_tce {
+                       __s32   groupfd;
+                       __s32   tablefd;
+               };
  
-       struct kvm_vfio_spapr_tce {
-               __s32   groupfd;
-               __s32   tablefd;
-       };
+       where:
  
-       where
-       @groupfd is a file descriptor for a VFIO group;
-       @tablefd is a file descriptor for a TCE table allocated via
-               KVM_CREATE_SPAPR_TCE.
+       - @groupfd is a file descriptor for a VFIO group;
+       - @tablefd is a file descriptor for a TCE table allocated via
+         KVM_CREATE_SPAPR_TCE.
diff --git a/Documentation/virt/kvm/devices/vm.txt b/Documentation/virt/kvm/devices/vm.rst
similarity index 61%
rename from Documentation/virt/kvm/devices/vm.txt
rename to Documentation/virt/kvm/devices/vm.rst
index 4ffb82b0246838d3022fb094b3996fa8de84e9ec..0aa5b1cfd700c485bbbb4385ff63a94e231c3dcf 100644
--- a/Documentation/virt/kvm/devices/vm.txt
+++ b/Documentation/virt/kvm/devices/vm.rst
@@ -1,5 +1,8 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+====================
  Generic vm interface
-====================================
+====================
  
  The virtual machine "device" also accepts the ioctls KVM_SET_DEVICE_ATTR,
  KVM_GET_DEVICE_ATTR, and KVM_HAS_DEVICE_ATTR. The interface uses the same
@@ -10,30 +13,38 @@ The groups and attributes per virtual machine, if any, are architecture
  specific.
  
  1. GROUP: KVM_S390_VM_MEM_CTRL
-Architectures: s390
+==============================
+
+:Architectures: s390
  
  1.1. ATTRIBUTE: KVM_S390_VM_MEM_ENABLE_CMMA
-Parameters: none
-Returns: -EBUSY if a vcpu is already defined, otherwise 0
+-------------------------------------------
+
+:Parameters: none
+:Returns: -EBUSY if a vcpu is already defined, otherwise 0
  
  Enables Collaborative Memory Management Assist (CMMA) for the virtual machine.
  
  1.2. ATTRIBUTE: KVM_S390_VM_MEM_CLR_CMMA
-Parameters: none
-Returns: -EINVAL if CMMA was not enabled
-         0 otherwise
+----------------------------------------
+
+:Parameters: none
+:Returns: -EINVAL if CMMA was not enabled;
+         0 otherwise
  
  Clear the CMMA status for all guest pages, so any pages the guest marked
  as unused are again used any may not be reclaimed by the host.
  
  1.3. ATTRIBUTE KVM_S390_VM_MEM_LIMIT_SIZE
-Parameters: in attr->addr the address for the new limit of guest memory
-Returns: -EFAULT if the given address is not accessible
-         -EINVAL if the virtual machine is of type UCONTROL
-         -E2BIG if the given guest memory is to big for that machine
-         -EBUSY if a vcpu is already defined
-         -ENOMEM if not enough memory is available for a new shadow guest mapping
-          0 otherwise
+-----------------------------------------
+
+:Parameters: in attr->addr the address for the new limit of guest memory
+:Returns: -EFAULT if the given address is not accessible;
+         -EINVAL if the virtual machine is of type UCONTROL;
+         -E2BIG if the given guest memory is too big for that machine;
+         -EBUSY if a vcpu is already defined;
+         -ENOMEM if not enough memory is available for a new shadow guest mapping;
+         0 otherwise.
  
  Allows userspace to query the actual limit and set a new limit for
  the maximum guest memory size. The limit will be rounded up to
@@ -42,78 +53,92 @@ the number of page table levels. In the case that there is no limit we will set
  the limit to KVM_S390_NO_MEM_LIMIT (U64_MAX).
  
  2. GROUP: KVM_S390_VM_CPU_MODEL
-Architectures: s390
+===============================
+
+:Architectures: s390
  
  2.1. ATTRIBUTE: KVM_S390_VM_CPU_MACHINE (r/o)
+---------------------------------------------
  
-Allows user space to retrieve machine and kvm specific cpu related information:
+Allows user space to retrieve machine and kvm specific cpu related information::
  
-struct kvm_s390_vm_cpu_machine {
+  struct kvm_s390_vm_cpu_machine {
         __u64 cpuid;           # CPUID of host
         __u32 ibc;             # IBC level range offered by host
         __u8  pad[4];
         __u64 fac_mask[256];   # set of cpu facilities enabled by KVM
         __u64 fac_list[256];   # set of cpu facilities offered by host
-}
+  }
  
-Parameters: address of buffer to store the machine related cpu data
-            of type struct kvm_s390_vm_cpu_machine*
-Returns:    -EFAULT if the given address is not accessible from kernel space
-           -ENOMEM if not enough memory is available to process the ioctl
-           0 in case of success
+:Parameters: address of buffer to store the machine related cpu data
+            of type struct kvm_s390_vm_cpu_machine*
+:Returns:   -EFAULT if the given address is not accessible from kernel space;
+           -ENOMEM if not enough memory is available to process the ioctl;
+           0 in case of success.
  
  2.2. ATTRIBUTE: KVM_S390_VM_CPU_PROCESSOR (r/w)
+-----------------------------------------------
  
-Allows user space to retrieve or request to change cpu related information for a vcpu:
+Allows user space to retrieve or request to change cpu related information for a vcpu::
  
-struct kvm_s390_vm_cpu_processor {
+  struct kvm_s390_vm_cpu_processor {
         __u64 cpuid;           # CPUID currently (to be) used by this vcpu
         __u16 ibc;             # IBC level currently (to be) used by this vcpu
         __u8  pad[6];
         __u64 fac_list[256];   # set of cpu facilities currently (to be) used
-                              # by this vcpu
-}
+                             # by this vcpu
+  }
  
  KVM does not enforce or limit the cpu model data in any form. Take the information
  retrieved by means of KVM_S390_VM_CPU_MACHINE as hint for reasonable configuration
  setups. Instruction interceptions triggered by additionally set facility bits that
  are not handled by KVM need to by imlemented in the VM driver code.
  
-Parameters: address of buffer to store/set the processor related cpu
-           data of type struct kvm_s390_vm_cpu_processor*.
-Returns:    -EBUSY in case 1 or more vcpus are already activated (only in write case)
-           -EFAULT if the given address is not accessible from kernel space
-           -ENOMEM if not enough memory is available to process the ioctl
-           0 in case of success
+:Parameters: address of buffer to store/set the processor related cpu
+            data of type struct kvm_s390_vm_cpu_processor*.
+:Returns:  -EBUSY in case 1 or more vcpus are already activated (only in write case);
+          -EFAULT if the given address is not accessible from kernel space;
+          -ENOMEM if not enough memory is available to process the ioctl;
+          0 in case of success.
+
+.. _KVM_S390_VM_CPU_MACHINE_FEAT:
  
  2.3. ATTRIBUTE: KVM_S390_VM_CPU_MACHINE_FEAT (r/o)
+--------------------------------------------------
  
  Allows user space to retrieve available cpu features. A feature is available if
  provided by the hardware and supported by kvm. In theory, cpu features could
  even be completely emulated by kvm.
  
-struct kvm_s390_vm_cpu_feat {
-        __u64 feat[16]; # Bitmap (1 = feature available), MSB 0 bit numbering
-};
+::
  
-Parameters: address of a buffer to load the feature list from.
-Returns:    -EFAULT if the given address is not accessible from kernel space.
-           0 in case of success.
+  struct kvm_s390_vm_cpu_feat {
+       __u64 feat[16]; # Bitmap (1 = feature available), MSB 0 bit numbering
+  };
+
+:Parameters: address of a buffer to load the feature list from.
+:Returns:  -EFAULT if the given address is not accessible from kernel space;
+          0 in case of success.
  
  2.4. ATTRIBUTE: KVM_S390_VM_CPU_PROCESSOR_FEAT (r/w)
+----------------------------------------------------
  
  Allows user space to retrieve or change enabled cpu features for all VCPUs of a
  VM. Features that are not available cannot be enabled.
  
-See 2.3. for a description of the parameter struct.
+See :ref:`KVM_S390_VM_CPU_MACHINE_FEAT` for
+a description of the parameter struct.
  
-Parameters: address of a buffer to store/load the feature list from.
-Returns:    -EFAULT if the given address is not accessible from kernel space.
-           -EINVAL if a cpu feature that is not available is to be enabled.
-           -EBUSY if at least one VCPU has already been defined.
+:Parameters: address of a buffer to store/load the feature list from.
+:Returns:   -EFAULT if the given address is not accessible from kernel space;
+           -EINVAL if a cpu feature that is not available is to be enabled;
+           -EBUSY if at least one VCPU has already been defined;
             0 in case of success.
  
+.. _KVM_S390_VM_CPU_MACHINE_SUBFUNC:
+
  2.5. ATTRIBUTE: KVM_S390_VM_CPU_MACHINE_SUBFUNC (r/o)
+-----------------------------------------------------
  
  Allows user space to retrieve available cpu subfunctions without any filtering
  done by a set IBC. These subfunctions are indicated to the guest VCPU via
@@ -126,7 +151,9 @@ contained in the returned struct. If the affected instruction
  indicates subfunctions via a "test bit" mechanism, the subfunction codes are
  contained in the returned struct in MSB 0 bit numbering.
  
-struct kvm_s390_vm_cpu_subfunc {
+::
+
+  struct kvm_s390_vm_cpu_subfunc {
         u8 plo[32];           # always valid (ESA/390 feature)
         u8 ptff[16];          # valid with TOD-clock steering
         u8 kmac[16];          # valid with Message-Security-Assist
@@ -143,13 +170,14 @@ struct kvm_s390_vm_cpu_subfunc {
         u8 kma[16];           # valid with Message-Security-Assist-Extension 8
         u8 kdsa[16];          # valid with Message-Security-Assist-Extension 9
         u8 reserved[1792];    # reserved for future instructions
-};
+  };
  
-Parameters: address of a buffer to load the subfunction blocks from.
-Returns:    -EFAULT if the given address is not accessible from kernel space.
+:Parameters: address of a buffer to load the subfunction blocks from.
+:Returns:   -EFAULT if the given address is not accessible from kernel space;
             0 in case of success.
  
  2.6. ATTRIBUTE: KVM_S390_VM_CPU_PROCESSOR_SUBFUNC (r/w)
+-------------------------------------------------------
  
  Allows user space to retrieve or change cpu subfunctions to be indicated for
  all VCPUs of a VM. This attribute will only be available if kernel and
@@ -164,107 +192,125 @@ As long as no data has been written, a read will fail. The IBC will be used
  to determine available subfunctions in this case, this will guarantee backward
  compatibility.
  
-See 2.5. for a description of the parameter struct.
+See :ref:`KVM_S390_VM_CPU_MACHINE_SUBFUNC` for a
+description of the parameter struct.
  
-Parameters: address of a buffer to store/load the subfunction blocks from.
-Returns:    -EFAULT if the given address is not accessible from kernel space.
-           -EINVAL when reading, if there was no write yet.
-           -EBUSY if at least one VCPU has already been defined.
+:Parameters: address of a buffer to store/load the subfunction blocks from.
+:Returns:   -EFAULT if the given address is not accessible from kernel space;
+           -EINVAL when reading, if there was no write yet;
+           -EBUSY if at least one VCPU has already been defined;
             0 in case of success.
  
  3. GROUP: KVM_S390_VM_TOD
-Architectures: s390
+=========================
+
+:Architectures: s390
  
  3.1. ATTRIBUTE: KVM_S390_VM_TOD_HIGH
+------------------------------------
  
  Allows user space to set/get the TOD clock extension (u8) (superseded by
  KVM_S390_VM_TOD_EXT).
  
-Parameters: address of a buffer in user space to store the data (u8) to
-Returns:    -EFAULT if the given address is not accessible from kernel space
+:Parameters: address of a buffer in user space to store the data (u8) to
+:Returns:   -EFAULT if the given address is not accessible from kernel space;
             -EINVAL if setting the TOD clock extension to != 0 is not supported
  
  3.2. ATTRIBUTE: KVM_S390_VM_TOD_LOW
+-----------------------------------
  
  Allows user space to set/get bits 0-63 of the TOD clock register as defined in
  the POP (u64).
  
-Parameters: address of a buffer in user space to store the data (u64) to
-Returns:    -EFAULT if the given address is not accessible from kernel space
+:Parameters: address of a buffer in user space to store the data (u64) to
+:Returns:    -EFAULT if the given address is not accessible from kernel space
  
  3.3. ATTRIBUTE: KVM_S390_VM_TOD_EXT
+-----------------------------------
+
  Allows user space to set/get bits 0-63 of the TOD clock register as defined in
  the POP (u64). If the guest CPU model supports the TOD clock extension (u8), it
  also allows user space to get/set it. If the guest CPU model does not support
  it, it is stored as 0 and not allowed to be set to a value != 0.
  
-Parameters: address of a buffer in user space to store the data
-            (kvm_s390_vm_tod_clock) to
-Returns:    -EFAULT if the given address is not accessible from kernel space
+:Parameters: address of a buffer in user space to store the data
+            (kvm_s390_vm_tod_clock) to
+:Returns:   -EFAULT if the given address is not accessible from kernel space;
             -EINVAL if setting the TOD clock extension to != 0 is not supported
  
  4. GROUP: KVM_S390_VM_CRYPTO
-Architectures: s390
+============================
+
+:Architectures: s390
  
  4.1. ATTRIBUTE: KVM_S390_VM_CRYPTO_ENABLE_AES_KW (w/o)
+------------------------------------------------------
  
  Allows user space to enable aes key wrapping, including generating a new
  wrapping key.
  
-Parameters: none
-Returns:    0
+:Parameters: none
+:Returns:    0
  
  4.2. ATTRIBUTE: KVM_S390_VM_CRYPTO_ENABLE_DEA_KW (w/o)
+------------------------------------------------------
  
  Allows user space to enable dea key wrapping, including generating a new
  wrapping key.
  
-Parameters: none
-Returns:    0
+:Parameters: none
+:Returns:    0
  
  4.3. ATTRIBUTE: KVM_S390_VM_CRYPTO_DISABLE_AES_KW (w/o)
+-------------------------------------------------------
  
  Allows user space to disable aes key wrapping, clearing the wrapping key.
  
-Parameters: none
-Returns:    0
+:Parameters: none
+:Returns:    0
  
  4.4. ATTRIBUTE: KVM_S390_VM_CRYPTO_DISABLE_DEA_KW (w/o)
+-------------------------------------------------------
  
  Allows user space to disable dea key wrapping, clearing the wrapping key.
  
-Parameters: none
-Returns:    0
+:Parameters: none
+:Returns:    0
  
  5. GROUP: KVM_S390_VM_MIGRATION
-Architectures: s390
+===============================
+
+:Architectures: s390
  
  5.1. ATTRIBUTE: KVM_S390_VM_MIGRATION_STOP (w/o)
+------------------------------------------------
  
  Allows userspace to stop migration mode, needed for PGSTE migration.
  Setting this attribute when migration mode is not active will have no
  effects.
  
-Parameters: none
-Returns:    0
+:Parameters: none
+:Returns:    0
  
  5.2. ATTRIBUTE: KVM_S390_VM_MIGRATION_START (w/o)
+-------------------------------------------------
  
  Allows userspace to start migration mode, needed for PGSTE migration.
  Setting this attribute when migration mode is already active will have
  no effects.
  
-Parameters: none
-Returns:    -ENOMEM if there is not enough free memory to start migration mode
-           -EINVAL if the state of the VM is invalid (e.g. no memory defined)
+:Parameters: none
+:Returns:   -ENOMEM if there is not enough free memory to start migration mode;
+           -EINVAL if the state of the VM is invalid (e.g. no memory defined);
             0 in case of success.
  
  5.3. ATTRIBUTE: KVM_S390_VM_MIGRATION_STATUS (r/o)
+--------------------------------------------------
  
  Allows userspace to query the status of migration mode.
  
-Parameters: address of a buffer in user space to store the data (u64) to;
-           the data itself is either 0 if migration mode is disabled or 1
-           if it is enabled
-Returns:    -EFAULT if the given address is not accessible from kernel space
+:Parameters: address of a buffer in user space to store the data (u64) to;
+            the data itself is either 0 if migration mode is disabled or 1
+            if it is enabled
+:Returns:   -EFAULT if the given address is not accessible from kernel space;
             0 in case of success.
diff --git a/Documentation/virt/kvm/devices/xics.txt b/Documentation/virt/kvm/devices/xics.rst

similarity index 84%

rename from Documentation/virt/kvm/devices/xics.txt

rename to Documentation/virt/kvm/devices/xics.rst

index 423332dda7bc893104a4e12d3b71b482ea590c19..2d6927e0b776b370bdcb9213b8d5bfe40c990d4f 100644 (file)
--- a/Documentation/virt/kvm/devices/xics.txt
+++ b/Documentation/virt/kvm/devices/xics.rst
@@ -1,20 +1,31 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=========================
  XICS interrupt controller
+=========================
  
  Device type supported: KVM_DEV_TYPE_XICS
  
  Groups:
    1. KVM_DEV_XICS_GRP_SOURCES
-  Attributes: One per interrupt source, indexed by the source number.
+       Attributes:
  
+         One per interrupt source, indexed by the source number.
    2. KVM_DEV_XICS_GRP_CTRL
-  Attributes:
-    2.1 KVM_DEV_XICS_NR_SERVERS (write only)
+       Attributes:
+
+         2.1 KVM_DEV_XICS_NR_SERVERS (write only)
+
    The kvm_device_attr.addr points to a __u32 value which is the number of
    interrupt server numbers (ie, highest possible vcpu id plus one).
+
    Errors:
-    -EINVAL: Value greater than KVM_MAX_VCPU_ID.
-    -EFAULT: Invalid user pointer for attr->addr.
-    -EBUSY:  A vcpu is already connected to the device.
+
+    =======  ==========================================
+    -EINVAL  Value greater than KVM_MAX_VCPU_ID.
+    -EFAULT  Invalid user pointer for attr->addr.
+    -EBUSY   A vcpu is already connected to the device.
+    =======  ==========================================
  
  This device emulates the XICS (eXternal Interrupt Controller
  Specification) defined in PAPR.  The XICS has a set of interrupt
@@ -53,24 +64,29 @@ the interrupt source number.  The 64 bit state word has the following
  bitfields, starting from the least-significant end of the word:
  
  * Destination (server number), 32 bits
+
    This specifies where the interrupt should be sent, and is the
    interrupt server number specified for the destination vcpu.
  
  * Priority, 8 bits
+
    This is the priority specified for this interrupt source, where 0 is
    the highest priority and 255 is the lowest.  An interrupt with a
    priority of 255 will never be delivered.
  
  * Level sensitive flag, 1 bit
+
    This bit is 1 for a level-sensitive interrupt source, or 0 for
    edge-sensitive (or MSI).
  
  * Masked flag, 1 bit
+
    This bit is set to 1 if the interrupt is masked (cannot be delivered
    regardless of its priority), for example by the ibm,int-off RTAS
    call, or 0 if it is not masked.
  
  * Pending flag, 1 bit
+
    This bit is 1 if the source has a pending interrupt, otherwise 0.
  
  Only one XICS instance may be created per VM.
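The 64-bit source state word listed in the xics.rst hunk above can be sketched in C as follows. This is a minimal illustration that assumes the five fields are packed contiguously from bit 0 with no padding. The macro and function names are hypothetical and are not taken from the kernel headers.

```c
#include <stdint.h>

/* Hypothetical names. Layout assumed from the bitfield list in xics.rst,
 * counted from the least-significant end of the 64-bit word. */
#define XICS_DEST_MASK    0xffffffffULL   /* bits 0..31: server number */
#define XICS_PRIO_SHIFT   32              /* bits 32..39: priority */
#define XICS_PRIO_MASK    0xffULL
#define XICS_LEVEL_BIT    (1ULL << 40)    /* level-sensitive flag */
#define XICS_MASKED_BIT   (1ULL << 41)    /* masked flag */
#define XICS_PENDING_BIT  (1ULL << 42)    /* pending flag */

struct xics_source_state {
	uint32_t server;    /* destination interrupt server number */
	uint8_t  priority;  /* 0 is highest, 255 is never delivered */
	int      level;     /* 1 for level-sensitive, 0 for edge/MSI */
	int      masked;
	int      pending;
};

/* Pack the decoded fields into the 64-bit state word. */
static inline uint64_t xics_pack(const struct xics_source_state *s)
{
	uint64_t w = (uint64_t)s->server & XICS_DEST_MASK;

	w |= ((uint64_t)s->priority & XICS_PRIO_MASK) << XICS_PRIO_SHIFT;
	if (s->level)
		w |= XICS_LEVEL_BIT;
	if (s->masked)
		w |= XICS_MASKED_BIT;
	if (s->pending)
		w |= XICS_PENDING_BIT;
	return w;
}

/* Decode a 64-bit state word back into its fields. */
static inline struct xics_source_state xics_unpack(uint64_t w)
{
	struct xics_source_state s;

	s.server   = (uint32_t)(w & XICS_DEST_MASK);
	s.priority = (uint8_t)((w >> XICS_PRIO_SHIFT) & XICS_PRIO_MASK);
	s.level    = (w & XICS_LEVEL_BIT) != 0;
	s.masked   = (w & XICS_MASKED_BIT) != 0;
	s.pending  = (w & XICS_PENDING_BIT) != 0;
	return s;
}
```

A caller would pass the packed word through kvm_device_attr.addr when setting a KVM_DEV_XICS_GRP_SOURCES attribute. Real code should take the masks and shifts from the kernel uapi headers instead of redefining them.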
diff --git a/Documentation/virt/kvm/devices/xive.txt b/Documentation/virt/kvm/devices/xive.rst

similarity index 62%

rename from Documentation/virt/kvm/devices/xive.txt

rename to Documentation/virt/kvm/devices/xive.rst

index f5d1d6b5af61504dd501192e61c71d6333ceb83a..8bdf3dc38f01608dc466623b1e0511e6f8e918e6 100644 (file)
--- a/Documentation/virt/kvm/devices/xive.txt
+++ b/Documentation/virt/kvm/devices/xive.rst
@@ -1,8 +1,11 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===========================================================
  POWER9 eXternal Interrupt Virtualization Engine (XIVE Gen1)
-==========================================================
+===========================================================
  
  Device types supported:
-  KVM_DEV_TYPE_XIVE     POWER9 XIVE Interrupt Controller generation 1
+  - KVM_DEV_TYPE_XIVE     POWER9 XIVE Interrupt Controller generation 1
  
  This device acts as a VM interrupt controller. It provides the KVM
  interface to configure the interrupt sources of a VM in the underlying
@@ -64,72 +67,100 @@ the legacy interrupt mode, referred as XICS (POWER7/8).
  
  * Groups:
  
-  1. KVM_DEV_XIVE_GRP_CTRL
-  Provides global controls on the device
+1. KVM_DEV_XIVE_GRP_CTRL
+     Provides global controls on the device
+
    Attributes:
      1.1 KVM_DEV_XIVE_RESET (write only)
      Resets the interrupt controller configuration for sources and event
      queues. To be used by kexec and kdump.
+
      Errors: none
  
      1.2 KVM_DEV_XIVE_EQ_SYNC (write only)
      Sync all the sources and queues and mark the EQ pages dirty. This
      to make sure that a consistent memory state is captured when
      migrating the VM.
+
      Errors: none
  
      1.3 KVM_DEV_XIVE_NR_SERVERS (write only)
      The kvm_device_attr.addr points to a __u32 value which is the number of
      interrupt server numbers (ie, highest possible vcpu id plus one).
+
      Errors:
-      -EINVAL: Value greater than KVM_MAX_VCPU_ID.
-      -EFAULT: Invalid user pointer for attr->addr.
-      -EBUSY:  A vCPU is already connected to the device.
  
-  2. KVM_DEV_XIVE_GRP_SOURCE (write only)
-  Initializes a new source in the XIVE device and mask it.
+      =======  ==========================================
+      -EINVAL  Value greater than KVM_MAX_VCPU_ID.
+      -EFAULT  Invalid user pointer for attr->addr.
+      -EBUSY   A vCPU is already connected to the device.
+      =======  ==========================================
+
+2. KVM_DEV_XIVE_GRP_SOURCE (write only)
+     Initializes a new source in the XIVE device and mask it.
+
    Attributes:
      Interrupt source number  (64-bit)
-  The kvm_device_attr.addr points to a __u64 value:
-  bits:     | 63   ....  2 |   1   |   0
-  values:   |    unused    | level | type
+
+  The kvm_device_attr.addr points to a __u64 value::
+
+    bits:     | 63   ....  2 |   1   |   0
+    values:   |    unused    | level | type
+
    - type:  0:MSI 1:LSI
    - level: assertion level in case of an LSI.
+
    Errors:
-    -E2BIG:  Interrupt source number is out of range
-    -ENOMEM: Could not create a new source block
-    -EFAULT: Invalid user pointer for attr->addr.
-    -ENXIO:  Could not allocate underlying HW interrupt
  
-  3. KVM_DEV_XIVE_GRP_SOURCE_CONFIG (write only)
-  Configures source targeting
+    =======  ==========================================
+    -E2BIG   Interrupt source number is out of range
+    -ENOMEM  Could not create a new source block
+    -EFAULT  Invalid user pointer for attr->addr.
+    -ENXIO   Could not allocate underlying HW interrupt
+    =======  ==========================================
+
+3. KVM_DEV_XIVE_GRP_SOURCE_CONFIG (write only)
+     Configures source targeting
+
    Attributes:
      Interrupt source number  (64-bit)
-  The kvm_device_attr.addr points to a __u64 value:
-  bits:     | 63   ....  33 |  32  | 31 .. 3 |  2 .. 0
-  values:   |    eisn       | mask |  server | priority
+
+  The kvm_device_attr.addr points to a __u64 value::
+
+    bits:     | 63   ....  33 |  32  | 31 .. 3 |  2 .. 0
+    values:   |    eisn       | mask |  server | priority
+
    - priority: 0-7 interrupt priority level
    - server: CPU number chosen to handle the interrupt
    - mask: mask flag (unused)
    - eisn: Effective Interrupt Source Number
+
    Errors:
-    -ENOENT: Unknown source number
-    -EINVAL: Not initialized source number
-    -EINVAL: Invalid priority
-    -EINVAL: Invalid CPU number.
-    -EFAULT: Invalid user pointer for attr->addr.
-    -ENXIO:  CPU event queues not configured or configuration of the
-             underlying HW interrupt failed
-    -EBUSY:  No CPU available to serve interrupt
-
-  4. KVM_DEV_XIVE_GRP_EQ_CONFIG (read-write)
-  Configures an event queue of a CPU
+
+    =======  =======================================================
+    -ENOENT  Unknown source number
+    -EINVAL  Not initialized source number
+    -EINVAL  Invalid priority
+    -EINVAL  Invalid CPU number.
+    -EFAULT  Invalid user pointer for attr->addr.
+    -ENXIO   CPU event queues not configured or configuration of the
+            underlying HW interrupt failed
+    -EBUSY   No CPU available to serve interrupt
+    =======  =======================================================
+
+4. KVM_DEV_XIVE_GRP_EQ_CONFIG (read-write)
+     Configures an event queue of a CPU
+
    Attributes:
      EQ descriptor identifier (64-bit)
-  The EQ descriptor identifier is a tuple (server, priority) :
-  bits:     | 63   ....  32 | 31 .. 3 |  2 .. 0
-  values:   |    unused     |  server | priority
-  The kvm_device_attr.addr points to :
+
+  The EQ descriptor identifier is a tuple (server, priority)::
+
+    bits:     | 63   ....  32 | 31 .. 3 |  2 .. 0
+    values:   |    unused     |  server | priority
+
+  The kvm_device_attr.addr points to::
+
      struct kvm_ppc_xive_eq {
         __u32 flags;
         __u32 qshift;
@@ -138,8 +169,9 @@ the legacy interrupt mode, referred as XICS (POWER7/8).
         __u32 qindex;
         __u8  pad[40];
      };
+
    - flags: queue flags
-    KVM_XIVE_EQ_ALWAYS_NOTIFY (required)
+      KVM_XIVE_EQ_ALWAYS_NOTIFY (required)
         forces notification without using the coalescing mechanism
         provided by the XIVE END ESBs.
    - qshift: queue size (power of 2)
@@ -147,22 +179,31 @@ the legacy interrupt mode, referred as XICS (POWER7/8).
    - qtoggle: current queue toggle bit
    - qindex: current queue index
    - pad: reserved for future use
+
    Errors:
-    -ENOENT: Invalid CPU number
-    -EINVAL: Invalid priority
-    -EINVAL: Invalid flags
-    -EINVAL: Invalid queue size
-    -EINVAL: Invalid queue address
-    -EFAULT: Invalid user pointer for attr->addr.
-    -EIO:    Configuration of the underlying HW failed
-
-  5. KVM_DEV_XIVE_GRP_SOURCE_SYNC (write only)
-  Synchronize the source to flush event notifications
+
+    =======  =========================================
+    -ENOENT  Invalid CPU number
+    -EINVAL  Invalid priority
+    -EINVAL  Invalid flags
+    -EINVAL  Invalid queue size
+    -EINVAL  Invalid queue address
+    -EFAULT  Invalid user pointer for attr->addr.
+    -EIO     Configuration of the underlying HW failed
+    =======  =========================================
+
+5. KVM_DEV_XIVE_GRP_SOURCE_SYNC (write only)
+     Synchronize the source to flush event notifications
+
    Attributes:
      Interrupt source number  (64-bit)
+
    Errors:
-    -ENOENT: Unknown source number
-    -EINVAL: Not initialized source number
+
+    =======  =============================
+    -ENOENT  Unknown source number
+    -EINVAL  Not initialized source number
+    =======  =============================
  
  * VCPU state
  
@@ -175,11 +216,12 @@ the legacy interrupt mode, referred as XICS (POWER7/8).
    as it synthesizes the priorities of the pending interrupts. We
    capture a bit more to report debug information.
  
-  KVM_REG_PPC_VP_STATE (2 * 64bits)
-  bits:     |  63  ....  32  |  31  ....  0  |
-  values:   |   TIMA word0   |   TIMA word1  |
-  bits:     | 127       ..........       64  |
-  values:   |            unused              |
+  KVM_REG_PPC_VP_STATE (2 * 64bits)::
+
+    bits:     |  63  ....  32  |  31  ....  0  |
+    values:   |   TIMA word0   |   TIMA word1  |
+    bits:     | 127       ..........       64  |
+    values:   |            unused              |
  
  * Migration:
  
@@ -196,7 +238,7 @@ the legacy interrupt mode, referred as XICS (POWER7/8).
    3. Capture the state of the source targeting, the EQs configuration
    and the state of thread interrupt context registers.
  
-  Restore is similar :
+  Restore is similar:
  
    1. Restore the EQ configuration. As targeting depends on it.
    2. Restore targeting
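The KVM_DEV_XIVE_GRP_SOURCE_CONFIG value described in the xive.rst hunk above (priority in bits 2..0, server in bits 31..3, mask in bit 32, eisn in bits 63..33) can be composed as in the sketch below. The macro and function names are hypothetical. Only the bit positions come from the documented layout.

```c
#include <stdint.h>

/* Hypothetical names. Bit positions follow the layout table in xive.rst. */
#define XIVE_SRC_PRIORITY_SHIFT  0
#define XIVE_SRC_PRIORITY_MASK   0x7ULL         /* 3 bits: 2..0 */
#define XIVE_SRC_SERVER_SHIFT    3
#define XIVE_SRC_SERVER_MASK     0x1fffffffULL  /* 29 bits: 31..3 */
#define XIVE_SRC_MASKED_SHIFT    32
#define XIVE_SRC_EISN_SHIFT      33
#define XIVE_SRC_EISN_MASK       0x7fffffffULL  /* 31 bits: 63..33 */

/* Build the __u64 that kvm_device_attr.addr points to. */
static inline uint64_t xive_source_config(uint32_t eisn, int masked,
					  uint32_t server, uint8_t priority)
{
	uint64_t v = 0;

	v |= ((uint64_t)priority & XIVE_SRC_PRIORITY_MASK) << XIVE_SRC_PRIORITY_SHIFT;
	v |= ((uint64_t)server & XIVE_SRC_SERVER_MASK) << XIVE_SRC_SERVER_SHIFT;
	v |= (uint64_t)(masked ? 1 : 0) << XIVE_SRC_MASKED_SHIFT;
	v |= ((uint64_t)eisn & XIVE_SRC_EISN_MASK) << XIVE_SRC_EISN_SHIFT;
	return v;
}

/* Field accessors for decoding a value read back during migration. */
static inline uint8_t xive_source_priority(uint64_t v)
{
	return (uint8_t)((v >> XIVE_SRC_PRIORITY_SHIFT) & XIVE_SRC_PRIORITY_MASK);
}

static inline uint32_t xive_source_server(uint64_t v)
{
	return (uint32_t)((v >> XIVE_SRC_SERVER_SHIFT) & XIVE_SRC_SERVER_MASK);
}

static inline uint32_t xive_source_eisn(uint64_t v)
{
	return (uint32_t)((v >> XIVE_SRC_EISN_SHIFT) & XIVE_SRC_EISN_MASK);
}
```

Masking each field before shifting keeps an out-of-range argument from spilling into a neighbouring field. The kernel would still reject an invalid priority or CPU number with -EINVAL, as the error table states.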
diff --git a/Documentation/virt/kvm/halt-polling.txt b/Documentation/virt/kvm/halt-polling.rst

similarity index 64%

rename from Documentation/virt/kvm/halt-polling.txt

rename to Documentation/virt/kvm/halt-polling.rst

index 4f791b128dd27a0ed9bc4ad79eddc8794bcab2bd..4922e4a15f18412fa1ceb39cb5891fa0ee8839b6 100644 (file)
--- a/Documentation/virt/kvm/halt-polling.txt
+++ b/Documentation/virt/kvm/halt-polling.rst
@@ -1,3 +1,6 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===========================
  The KVM halt polling system
  ===========================
  
@@ -68,7 +71,8 @@ steady state polling interval but will only really do a good job for wakeups
  which come at an approximately constant rate, otherwise there will be constant
  adjustment of the polling interval.
  
-[0] total block time: the time between when the halt polling function is
+[0] total block time:
+                     the time between when the halt polling function is
                       invoked and a wakeup source received (irrespective of
                       whether the scheduler is invoked within that function).
  
@@ -81,31 +85,32 @@ shrunk. These variables are defined in include/linux/kvm_host.h and as module
  parameters in virt/kvm/kvm_main.c, or arch/powerpc/kvm/book3s_hv.c in the
  powerpc kvm-hv case.
  
-Module Parameter       |   Description             |        Default Value
---------------------------------------------------------------------------------
-halt_poll_ns           | The global max polling    | KVM_HALT_POLL_NS_DEFAULT
-                       | interval which defines    |
-                       | the ceiling value of the  |
-                       | polling interval for      | (per arch value)
-                       | each vcpu.                |
---------------------------------------------------------------------------------
-halt_poll_ns_grow      | The value by which the    | 2
-                       | halt polling interval is  |
-                       | multiplied in the         |
-                       | grow_halt_poll_ns()       |
-                       | function.                 |
---------------------------------------------------------------------------------
-halt_poll_ns_grow_start | The initial value to grow | 10000
-                       | to from zero in the       |
-                       | grow_halt_poll_ns()       |
-                       | function.                 |
---------------------------------------------------------------------------------
-halt_poll_ns_shrink    | The value by which the    | 0
-                       | halt polling interval is  |
-                       | divided in the            |
-                       | shrink_halt_poll_ns()     |
-                       | function.                 |
---------------------------------------------------------------------------------
++-----------------------+---------------------------+-------------------------+
+|Module Parameter       |   Description             |        Default Value    |
++-----------------------+---------------------------+-------------------------+
+|halt_poll_ns           | The global max polling    | KVM_HALT_POLL_NS_DEFAULT|
+|                       | interval which defines    |                         |
+|                       | the ceiling value of the  |                         |
+|                       | polling interval for      | (per arch value)        |
+|                       | each vcpu.                |                         |
++-----------------------+---------------------------+-------------------------+
+|halt_poll_ns_grow      | The value by which the    | 2                       |
+|                       | halt polling interval is  |                         |
+|                       | multiplied in the         |                         |
+|                       | grow_halt_poll_ns()       |                         |
+|                       | function.                 |                         |
++-----------------------+---------------------------+-------------------------+
+|halt_poll_ns_grow_start| The initial value to grow | 10000                   |
+|                       | to from zero in the       |                         |
+|                       | grow_halt_poll_ns()       |                         |
+|                       | function.                 |                         |
++-----------------------+---------------------------+-------------------------+
+|halt_poll_ns_shrink    | The value by which the    | 0                       |
+|                       | halt polling interval is  |                         |
+|                       | divided in the            |                         |
+|                       | shrink_halt_poll_ns()     |                         |
+|                       | function.                 |                         |
++-----------------------+---------------------------+-------------------------+
  
  These module parameters can be set from the debugfs files in:
  
@@ -117,20 +122,19 @@ Note: that these module parameters are system wide values and are not able to
  Further Notes
  =============
  
-- Care should be taken when setting the halt_poll_ns module parameter as a
-large value has the potential to drive the cpu usage to 100% on a machine which
-would be almost entirely idle otherwise. This is because even if a guest has
-wakeups during which very little work is done and which are quite far apart, if
-the period is shorter than the global max polling interval (halt_poll_ns) then
-the host will always poll for the entire block time and thus cpu utilisation
-will go to 100%.
-
-- Halt polling essentially presents a trade off between power usage and latency
-and the module parameters should be used to tune the affinity for this. Idle
-cpu time is essentially converted to host kernel time with the aim of decreasing
-latency when entering the guest.
-
-- Halt polling will only be conducted by the host when no other tasks are
-runnable on that cpu, otherwise the polling will cease immediately and
-schedule will be invoked to allow that other task to run. Thus this doesn't
-allow a guest to denial of service the cpu.
+- Care should be taken when setting the halt_poll_ns module parameter as a large value
+  has the potential to drive the cpu usage to 100% on a machine which would be almost
+  entirely idle otherwise. This is because even if a guest has wakeups during which very
+  little work is done and which are quite far apart, if the period is shorter than the
+  global max polling interval (halt_poll_ns) then the host will always poll for the
+  entire block time and thus cpu utilisation will go to 100%.
+
+- Halt polling essentially presents a trade off between power usage and latency and
+  the module parameters should be used to tune the affinity for this. Idle cpu time is
+  essentially converted to host kernel time with the aim of decreasing latency when
+  entering the guest.
+
+- Halt polling will only be conducted by the host when no other tasks are runnable on
+  that cpu, otherwise the polling will cease immediately and schedule will be invoked to
+  allow that other task to run. Thus this doesn't allow a guest to denial of service the
+  cpu.
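The interaction of the four module parameters in the halt-polling.rst table above can be sketched as below. This is a simplified model, not the kernel code. The struct and function names are hypothetical, and the handling of a zero grow or shrink factor (no growth, or reset to zero) is an assumption that should be checked against virt/kvm/kvm_main.c.

```c
#include <stdint.h>

/* Hypothetical container for the module parameters from the table. */
struct halt_poll_params {
	uint64_t halt_poll_ns;  /* global ceiling of the polling interval */
	uint64_t grow;          /* multiplier used when growing */
	uint64_t grow_start;    /* first non-zero value when growing from zero */
	uint64_t shrink;        /* divisor used when shrinking */
};

/* Grow the per-vcpu interval: multiply, then clamp to [grow_start, ceiling]. */
static inline uint64_t grow_poll_ns(uint64_t cur, const struct halt_poll_params *p)
{
	uint64_t val;

	if (p->grow == 0)       /* assumption: a zero factor disables growth */
		return cur;

	val = cur * p->grow;
	if (val < p->grow_start)
		val = p->grow_start;
	if (val > p->halt_poll_ns)
		val = p->halt_poll_ns;
	return val;
}

/* Shrink the per-vcpu interval: divide, or reset to zero when shrink is 0. */
static inline uint64_t shrink_poll_ns(uint64_t cur, const struct halt_poll_params *p)
{
	if (p->shrink == 0)     /* assumption: the default 0 resets polling */
		return 0;
	return cur / p->shrink;
}
```

With the table defaults and an illustrative ceiling of 200000 ns, the interval grows from 0 to 10000, then doubles on each call until it reaches the ceiling. A single shrink drops it back to 0.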
diff --git a/Documentation/virt/kvm/hypercalls.txt b/Documentation/virt/kvm/hypercalls.rst

similarity index 55%

rename from Documentation/virt/kvm/hypercalls.txt

rename to Documentation/virt/kvm/hypercalls.rst

index 5f6d291bd00459957170e3459452b290c55cc1a9..dbaf207e560d0f71c6a2f008e5958796034ff068 100644 (file)
--- a/Documentation/virt/kvm/hypercalls.txt
+++ b/Documentation/virt/kvm/hypercalls.rst
@@ -1,5 +1,9 @@
-Linux KVM Hypercall:
+.. SPDX-License-Identifier: GPL-2.0
+
+===================
+Linux KVM Hypercall
  ===================
+
  X86:
   KVM Hypercalls have a three-byte sequence of either the vmcall or the vmmcall
   instruction. The hypervisor can replace it with instructions that are
@@ -20,7 +24,7 @@ S390:
    For further information on the S390 diagnose call as supported by KVM,
    refer to Documentation/virt/kvm/s390-diag.txt.
  
- PowerPC:
+PowerPC:
    It uses R3-R10 and hypercall number in R11. R4-R11 are used as output registers.
    Return value is placed in R3.
  
@@ -34,7 +38,8 @@ MIPS:
    the return value is placed in $2 (v0).
  
  KVM Hypercalls Documentation
-===========================
+============================
+
  The template for each hypercall is:
  1. Hypercall name.
  2. Architecture(s)
@@ -43,56 +48,64 @@ The template for each hypercall is:
  
  1. KVM_HC_VAPIC_POLL_IRQ
  ------------------------
-Architecture: x86
-Status: active
-Purpose: Trigger guest exit so that the host can check for pending
-interrupts on reentry.
+
+:Architecture: x86
+:Status: active
+:Purpose: Trigger guest exit so that the host can check for pending
+          interrupts on reentry.
  
  2. KVM_HC_MMU_OP
-------------------------
-Architecture: x86
-Status: deprecated.
-Purpose: Support MMU operations such as writing to PTE,
-flushing TLB, release PT.
+----------------
+
+:Architecture: x86
+:Status: deprecated.
+:Purpose: Support MMU operations such as writing to PTE,
+          flushing TLB, release PT.
  
  3. KVM_HC_FEATURES
-------------------------
-Architecture: PPC
-Status: active
-Purpose: Expose hypercall availability to the guest. On x86 platforms, cpuid
-used to enumerate which hypercalls are available. On PPC, either device tree
-based lookup ( which is also what EPAPR dictates) OR KVM specific enumeration
-mechanism (which is this hypercall) can be used.
+------------------
+
+:Architecture: PPC
+:Status: active
+:Purpose: Expose hypercall availability to the guest. On x86 platforms, cpuid
+          is used to enumerate which hypercalls are available. On PPC, either
+          device tree based lookup (which is also what EPAPR dictates)
+          OR KVM specific enumeration mechanism (which is this hypercall)
+          can be used.
  
  4. KVM_HC_PPC_MAP_MAGIC_PAGE
-------------------------
-Architecture: PPC
-Status: active
-Purpose: To enable communication between the hypervisor and guest there is a
-shared page that contains parts of supervisor visible register state.
-The guest can map this shared page to access its supervisor register through
-memory using this hypercall.
+----------------------------
+
+:Architecture: PPC
+:Status: active
+:Purpose: To enable communication between the hypervisor and guest there is a
+         shared page that contains parts of supervisor visible register state.
+         The guest can map this shared page to access its supervisor register
+         through memory using this hypercall.
  
  5. KVM_HC_KICK_CPU
-------------------------
-Architecture: x86
-Status: active
-Purpose: Hypercall used to wakeup a vcpu from HLT state
-Usage example : A vcpu of a paravirtualized guest that is busywaiting in guest
-kernel mode for an event to occur (ex: a spinlock to become available) can
-execute HLT instruction once it has busy-waited for more than a threshold
-time-interval. Execution of HLT instruction would cause the hypervisor to put
-the vcpu to sleep until occurrence of an appropriate event. Another vcpu of the
-same guest can wakeup the sleeping vcpu by issuing KVM_HC_KICK_CPU hypercall,
-specifying APIC ID (a1) of the vcpu to be woken up. An additional argument (a0)
-is used in the hypercall for future use.
+------------------
+
+:Architecture: x86
+:Status: active
+:Purpose: Hypercall used to wakeup a vcpu from HLT state
+:Usage example:
+  A vcpu of a paravirtualized guest that is busywaiting in guest
+  kernel mode for an event to occur (ex: a spinlock to become available) can
+  execute HLT instruction once it has busy-waited for more than a threshold
+  time-interval. Execution of HLT instruction would cause the hypervisor to put
+  the vcpu to sleep until occurrence of an appropriate event. Another vcpu of the
+  same guest can wakeup the sleeping vcpu by issuing KVM_HC_KICK_CPU hypercall,
+  specifying APIC ID (a1) of the vcpu to be woken up. An additional argument (a0)
+  is used in the hypercall for future use.
  
  
  6. KVM_HC_CLOCK_PAIRING
-------------------------
-Architecture: x86
-Status: active
-Purpose: Hypercall used to synchronize host and guest clocks.
+-----------------------
+:Architecture: x86
+:Status: active
+:Purpose: Hypercall used to synchronize host and guest clocks.
+
  Usage:
  
  a0: guest physical address where host copies
@@ -101,6 +114,8 @@ a0: guest physical address where host copies
  a1: clock_type, ATM only KVM_CLOCK_PAIRING_WALLCLOCK (0)
  is supported (corresponding to the host's CLOCK_REALTIME clock).
  
+       ::
+
                 struct kvm_clock_pairing {
                         __s64 sec;
                         __s64 nsec;
@@ -123,15 +138,16 @@ Returns KVM_EOPNOTSUPP if the host does not use TSC clocksource,
  or if clock type is different than KVM_CLOCK_PAIRING_WALLCLOCK.
  
  6. KVM_HC_SEND_IPI
-------------------------
-Architecture: x86
-Status: active
-Purpose: Send IPIs to multiple vCPUs.
+------------------
+
+:Architecture: x86
+:Status: active
+:Purpose: Send IPIs to multiple vCPUs.
  
-a0: lower part of the bitmap of destination APIC IDs
-a1: higher part of the bitmap of destination APIC IDs
-a2: the lowest APIC ID in bitmap
-a3: APIC ICR
+- a0: lower part of the bitmap of destination APIC IDs
+- a1: higher part of the bitmap of destination APIC IDs
+- a2: the lowest APIC ID in bitmap
+- a3: APIC ICR
  
  The hypercall lets a guest send multicast IPIs, with at most 128
  destinations per hypercall in 64-bit mode and 64 vCPUs per
@@ -143,12 +159,13 @@ corresponds to the APIC ID a2+1, and so on.
  Returns the number of CPUs to which the IPIs were delivered successfully.
  
  7. KVM_HC_SCHED_YIELD
-------------------------
-Architecture: x86
-Status: active
-Purpose: Hypercall used to yield if the IPI target vCPU is preempted
+---------------------
+
+:Architecture: x86
+:Status: active
+:Purpose: Hypercall used to yield if the IPI target vCPU is preempted
  
  a0: destination APIC ID
  
-Usage example: When sending a call-function IPI-many to vCPUs, yield if
-any of the IPI target vCPUs was preempted.
+:Usage example: When sending a call-function IPI-many to vCPUs, yield if
+               any of the IPI target vCPUs was preempted.
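The KVM_HC_SEND_IPI argument layout in the hypercalls.rst hunk above (a0 and a1 forming a bitmap whose bit 0 corresponds to the APIC ID in a2) can be prepared as in the sketch below. It assumes 64-bit mode, where the bitmap spans 128 destinations. The struct and function names are hypothetical, and issuing the hypercall itself is out of scope.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for the first three hypercall arguments. */
struct ipi_args {
	uint64_t a0;  /* lower 64 bits of the destination bitmap */
	uint64_t a1;  /* upper 64 bits of the destination bitmap */
	uint64_t a2;  /* lowest APIC ID; bit 0 of the bitmap maps to it */
};

/*
 * Fill in a0..a2 for a set of destination APIC IDs.
 * Returns 0 on success, or -1 if the set is empty or the IDs do not fit
 * into one 128-wide window, in which case the caller needs several calls.
 */
static inline int build_ipi_bitmap(const uint32_t *apic_ids, size_t n,
				   struct ipi_args *out)
{
	uint32_t min;
	size_t i;

	if (n == 0)
		return -1;

	/* The window starts at the smallest APIC ID in the set. */
	min = apic_ids[0];
	for (i = 1; i < n; i++)
		if (apic_ids[i] < min)
			min = apic_ids[i];

	out->a0 = 0;
	out->a1 = 0;
	out->a2 = min;

	for (i = 0; i < n; i++) {
		uint32_t off = apic_ids[i] - min;

		if (off >= 128)
			return -1;
		if (off < 64)
			out->a0 |= 1ULL << off;
		else
			out->a1 |= 1ULL << (off - 64);
	}
	return 0;
}
```

For the destinations 10, 11 and 74 this yields a2 = 10, a0 = 0x3 and a1 = 0x1, since APIC ID 74 sits at offset 64 from the base.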
diff --git a/Documentation/virt/kvm/index.rst b/Documentation/virt/kvm/index.rst

index ada224a511fecab1b9813d508719c75464b189cc..774deaebf7fac762b44c022d034d7998b5dcde86 100644 (file)
--- a/Documentation/virt/kvm/index.rst
+++ b/Documentation/virt/kvm/index.rst
@@ -7,6 +7,22 @@ KVM
  .. toctree::
     :maxdepth: 2
  
+   api
     amd-memory-encryption
     cpuid
+   halt-polling
+   hypercalls
+   locking
+   mmu
+   msr
+   nested-vmx
+   ppc-pv
+   s390-diag
+   timekeeping
     vcpu-requests
+
+   review-checklist
+
+   arm/index
+
+   devices/index
diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst

new file mode 100644 (file)

index 0000000..c02291b
--- /dev/null
+++ b/Documentation/virt/kvm/locking.rst
@@ -0,0 +1,243 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=================
+KVM Lock Overview
+=================
+
+1. Acquisition Orders
+---------------------
+
+The acquisition orders for mutexes are as follows:
+
+- kvm->lock is taken outside vcpu->mutex
+
+- kvm->lock is taken outside kvm->slots_lock and kvm->irq_lock
+
+- kvm->slots_lock is taken outside kvm->irq_lock, though acquiring
+  them together is quite rare.
+
+On x86, vcpu->mutex is taken outside kvm->arch.hyperv.hv_lock.
+
+Everything else is a leaf: no other lock is taken inside the critical
+sections.
+
+2. Exception
+------------
+
+Fast page fault:
+
+Fast page fault is the fast path which fixes the guest page fault out of
+the mmu-lock on x86. Currently, the page fault can be fast in one of the
+following two cases:
+
+1. Access Tracking: The SPTE is not present, but it is marked for access
+   tracking i.e. the SPTE_SPECIAL_MASK is set. That means we need to
+   restore the saved R/X bits. This is described in more detail later below.
+
+2. Write-Protection: The SPTE is present and the fault is
+   caused by write-protect. That means we just need to change the W bit of
+   the spte.
+
+To avoid all the races, we use the SPTE_HOST_WRITEABLE and
+SPTE_MMU_WRITEABLE bits on the spte:
+
+- SPTE_HOST_WRITEABLE means the gfn is writable on host.
+- SPTE_MMU_WRITEABLE means the gfn is writable on mmu. The bit is set when
+  the gfn is writable on guest mmu and it is not write-protected by shadow
+  page write-protection.
+
+On fast page fault path, we will use cmpxchg to atomically set the spte W
+bit if spte.SPTE_HOST_WRITEABLE = 1 and spte.SPTE_WRITE_PROTECT = 1, or
+restore the saved R/X bits if VMX_EPT_TRACK_ACCESS mask is set, or both. This
+is safe because any change to these bits can be detected by cmpxchg.
+
+But we need to carefully check these cases:
+
+1) The mapping from gfn to pfn
+
+The mapping from gfn to pfn may be changed since we can only ensure the pfn
+is not changed during cmpxchg. This is an ABA problem; for example, the
+following case can happen:
+
++------------------------------------------------------------------------+
+| At the beginning::                                                     |
+|                                                                        |
+|       gpte = gfn1                                                      |
+|       gfn1 is mapped to pfn1 on host                                   |
+|       spte is the shadow page table entry corresponding with gpte and  |
+|       spte = pfn1                                                      |
++------------------------------------------------------------------------+
+| On fast page fault path:                                               |
++------------------------------------+-----------------------------------+
+| CPU 0:                             | CPU 1:                            |
++------------------------------------+-----------------------------------+
+| ::                                 |                                   |
+|                                    |                                   |
+|   old_spte = *spte;                |                                   |
++------------------------------------+-----------------------------------+
+|                                    | pfn1 is swapped out::             |
+|                                    |                                   |
+|                                    |    spte = 0;                      |
+|                                    |                                   |
+|                                    | pfn1 is re-alloced for gfn2.      |
+|                                    |                                   |
+|                                    | gpte is changed to point to       |
+|                                    | gfn2 by the guest::               |
+|                                    |                                   |
+|                                    |    spte = pfn1;                   |
++------------------------------------+-----------------------------------+
+| ::                                                                     |
+|                                                                        |
+|   if (cmpxchg(spte, old_spte, old_spte+W)                              |
+|      mark_page_dirty(vcpu->kvm, gfn1)                                 |
+|            OOPS!!!                                                     |
++------------------------------------------------------------------------+
+
+We dirty-log for gfn1, that means gfn2 is lost in dirty-bitmap.
+
+For direct sp, we can easily avoid it since the spte of direct sp is fixed
+to gfn. For indirect sp, before we do cmpxchg, we call gfn_to_pfn_atomic()
+to pin gfn to pfn, because after gfn_to_pfn_atomic():
+
+- We have held the refcount of pfn that means the pfn can not be freed and
+  be reused for another gfn.
+- The pfn is writable that means it can not be shared between different gfns
+  by KSM.
+
+Then, we can ensure the dirty bitmaps is correctly set for a gfn.
+
+Currently, to simplify the whole things, we disable fast page fault for
+indirect shadow page.
+
+2) Dirty bit tracking
+
+In the origin code, the spte can be fast updated (non-atomically) if the
+spte is read-only and the Accessed bit has already been set since the
+Accessed bit and Dirty bit can not be lost.
+
+But it is not true after fast page fault since the spte can be marked
+writable between reading spte and updating spte. Like below case:
+
++------------------------------------------------------------------------+
+| At the beginning::                                                     |
+|                                                                        |
+|      spte.W = 0                                                       |
+|      spte.Accessed = 1                                                |
++------------------------------------+-----------------------------------+
+| CPU 0:                             | CPU 1:                            |
++------------------------------------+-----------------------------------+
+| In mmu_spte_clear_track_bits()::   |                                   |
+|                                    |                                   |
+|  old_spte = *spte;                 |                                   |
+|                                    |                                   |
+|                                    |                                   |
+|  /* 'if' condition is satisfied. */|                                   |
+|  if (old_spte.Accessed == 1 &&     |                                   |
+|       old_spte.W == 0)             |                                   |
+|     spte = 0ull;                   |                                   |
++------------------------------------+-----------------------------------+
+|                                    | on fast page fault path::         |
+|                                    |                                   |
+|                                    |    spte.W = 1                     |
+|                                    |                                   |
+|                                    | memory write on the spte::        |
+|                                    |                                   |
+|                                    |    spte.Dirty = 1                 |
++------------------------------------+-----------------------------------+
+|  ::                                |                                   |
+|                                    |                                   |
+|   else                             |                                   |
+|     old_spte = xchg(spte, 0ull)    |                                   |
+|   if (old_spte.Accessed == 1)      |                                   |
+|     kvm_set_pfn_accessed(spte.pfn);|                                   |
+|   if (old_spte.Dirty == 1)         |                                   |
+|     kvm_set_pfn_dirty(spte.pfn);   |                                   |
+|     OOPS!!!                        |                                   |
++------------------------------------+-----------------------------------+
+
+The Dirty bit is lost in this case.
+
+In order to avoid this kind of issue, we always treat the spte as "volatile"
+if it can be updated out of mmu-lock, see spte_has_volatile_bits(), it means,
+the spte is always atomically updated in this case.
+
+3) flush tlbs due to spte updated
+
+If the spte is updated from writable to readonly, we should flush all TLBs,
+otherwise rmap_write_protect will find a read-only spte, even though the
+writable spte might be cached on a CPU's TLB.
+
+As mentioned before, the spte can be updated to writable out of mmu-lock on
+fast page fault path, in order to easily audit the path, we see if TLBs need
+be flushed caused by this reason in mmu_spte_update() since this is a common
+function to update spte (present -> present).
+
+Since the spte is "volatile" if it can be updated out of mmu-lock, we always
+atomically update the spte, the race caused by fast page fault can be avoided,
+See the comments in spte_has_volatile_bits() and mmu_spte_update().
+
+Lockless Access Tracking:
+
+This is used for Intel CPUs that are using EPT but do not support the EPT A/D
+bits. In this case, when the KVM MMU notifier is called to track accesses to a
+page (via kvm_mmu_notifier_clear_flush_young), it marks the PTE as not-present
+by clearing the RWX bits in the PTE and storing the original R & X bits in
+some unused/ignored bits. In addition, the SPTE_SPECIAL_MASK is also set on the
+PTE (using the ignored bit 62). When the VM tries to access the page later on,
+a fault is generated and the fast page fault mechanism described above is used
+to atomically restore the PTE to a Present state. The W bit is not saved when
+the PTE is marked for access tracking and during restoration to the Present
+state, the W bit is set depending on whether or not it was a write access. If
+it wasn't, then the W bit will remain clear until a write access happens, at
+which time it will be set using the Dirty tracking mechanism described above.
+
+3. Reference
+------------
+
+:Name:         kvm_lock
+:Type:         mutex
+:Arch:         any
+:Protects:     - vm_list
+
+:Name:         kvm_count_lock
+:Type:         raw_spinlock_t
+:Arch:         any
+:Protects:     - hardware virtualization enable/disable
+:Comment:      'raw' because hardware enabling/disabling must be atomic /wrt
+               migration.
+
+:Name:         kvm_arch::tsc_write_lock
+:Type:         raw_spinlock
+:Arch:         x86
+:Protects:     - kvm_arch::{last_tsc_write,last_tsc_nsec,last_tsc_offset}
+               - tsc offset in vmcb
+:Comment:      'raw' because updating the tsc offsets must not be preempted.
+
+:Name:         kvm->mmu_lock
+:Type:         spinlock_t
+:Arch:         any
+:Protects:     -shadow page/shadow tlb entry
+:Comment:      it is a spinlock since it is used in mmu notifier.
+
+:Name:         kvm->srcu
+:Type:         srcu lock
+:Arch:         any
+:Protects:     - kvm->memslots
+               - kvm->buses
+:Comment:      The srcu read lock must be held while accessing memslots (e.g.
+               when using gfn_to_* functions) and while accessing in-kernel
+               MMIO/PIO address->device structure mapping (kvm->buses).
+               The srcu index can be stored in kvm_vcpu->srcu_idx per vcpu
+               if it is needed by multiple functions.
+
+:Name:         blocked_vcpu_on_cpu_lock
+:Type:         spinlock_t
+:Arch:         x86
+:Protects:     blocked_vcpu_on_cpu
+:Comment:      This is a per-CPU lock and it is used for VT-d posted-interrupts.
+               When VT-d posted-interrupts is supported and the VM has assigned
+               devices, we put the blocked vCPU on the list blocked_vcpu_on_cpu
+               protected by blocked_vcpu_on_cpu_lock, when VT-d hardware issues
+               wakeup notification event since external interrupts from the
+               assigned devices happens, we will find the vCPU on the list to
+               wakeup.
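Reviewer note (not part of the patch): the lockless write-protect fix described in the new locking.rst above can be sketched in userspace C11 roughly as follows. This is an illustration under stated assumptions, not the kernel's code. The bit positions and the helper name fast_pf_fix_spte() are hypothetical, and the precondition checks the two bits defined in the bulleted list (SPTE_HOST_WRITEABLE and SPTE_MMU_WRITEABLE), which is an assumption about what the text intends.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen for this sketch only. */
#define SPTE_W              (1ull << 1)
#define SPTE_HOST_WRITEABLE (1ull << 10)
#define SPTE_MMU_WRITEABLE  (1ull << 11)

/*
 * Try to make a write-protected spte writable without holding mmu-lock.
 * Returns true only if this caller installed the W bit.
 */
static bool fast_pf_fix_spte(_Atomic uint64_t *sptep)
{
	const uint64_t need = SPTE_HOST_WRITEABLE | SPTE_MMU_WRITEABLE;
	uint64_t old_spte = atomic_load(sptep);
	uint64_t new_spte;

	/* Assumed precondition: both software-writable bits must be set. */
	if ((old_spte & need) != need)
		return false;

	new_spte = old_spte | SPTE_W;

	/*
	 * The compare-exchange covers the whole 64-bit entry, so it fails
	 * if any bit of the spte changed after old_spte was read.
	 */
	return atomic_compare_exchange_strong(sptep, &old_spte, new_spte);
}
```

A failed compare-exchange means another CPU modified the entry in between, so the caller would presumably retry or fall back to the path that takes mmu-lock. The sketch does not address the gfn-to-pfn ABA case described under 1), where the entry is restored to the same value and the compare-exchange still succeeds.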
diff --git a/Documentation/virt/kvm/locking.txt b/Documentation/virt/kvm/locking.txt
deleted file mode 100644
index 635cd6e..0000000
--- a/Documentation/virt/kvm/locking.txt
+++ /dev/null
@@ -1,215 +0,0 @@
-KVM Lock Overview
-=================
-
-1. Acquisition Orders
----------------------
-
-The acquisition orders for mutexes are as follows:
-
-- kvm->lock is taken outside vcpu->mutex
-
-- kvm->lock is taken outside kvm->slots_lock and kvm->irq_lock
-
-- kvm->slots_lock is taken outside kvm->irq_lock, though acquiring
-  them together is quite rare.
-
-On x86, vcpu->mutex is taken outside kvm->arch.hyperv.hv_lock.
-
-Everything else is a leaf: no other lock is taken inside the critical
-sections.
-
-2: Exception
-------------
-
-Fast page fault:
-
-Fast page fault is the fast path which fixes the guest page fault out of
-the mmu-lock on x86. Currently, the page fault can be fast in one of the
-following two cases:
-
-1. Access Tracking: The SPTE is not present, but it is marked for access
-tracking i.e. the SPTE_SPECIAL_MASK is set. That means we need to
-restore the saved R/X bits. This is described in more detail later below.
-
-2. Write-Protection: The SPTE is present and the fault is
-caused by write-protect. That means we just need to change the W bit of the 
-spte.
-
-What we use to avoid all the race is the SPTE_HOST_WRITEABLE bit and
-SPTE_MMU_WRITEABLE bit on the spte:
-- SPTE_HOST_WRITEABLE means the gfn is writable on host.
-- SPTE_MMU_WRITEABLE means the gfn is writable on mmu. The bit is set when
-  the gfn is writable on guest mmu and it is not write-protected by shadow
-  page write-protection.
-
-On fast page fault path, we will use cmpxchg to atomically set the spte W
-bit if spte.SPTE_HOST_WRITEABLE = 1 and spte.SPTE_WRITE_PROTECT = 1, or 
-restore the saved R/X bits if VMX_EPT_TRACK_ACCESS mask is set, or both. This
-is safe because whenever changing these bits can be detected by cmpxchg.
-
-But we need carefully check these cases:
-1): The mapping from gfn to pfn
-The mapping from gfn to pfn may be changed since we can only ensure the pfn
-is not changed during cmpxchg. This is a ABA problem, for example, below case
-will happen:
-
-At the beginning:
-gpte = gfn1
-gfn1 is mapped to pfn1 on host
-spte is the shadow page table entry corresponding with gpte and
-spte = pfn1
-
-   VCPU 0                           VCPU0
-on fast page fault path:
-
-   old_spte = *spte;
-                                 pfn1 is swapped out:
-                                    spte = 0;
-
-                                 pfn1 is re-alloced for gfn2.
-
-                                 gpte is changed to point to
-                                 gfn2 by the guest:
-                                    spte = pfn1;
-
-   if (cmpxchg(spte, old_spte, old_spte+W)
-       mark_page_dirty(vcpu->kvm, gfn1)
-             OOPS!!!
-
-We dirty-log for gfn1, that means gfn2 is lost in dirty-bitmap.
-
-For direct sp, we can easily avoid it since the spte of direct sp is fixed
-to gfn. For indirect sp, before we do cmpxchg, we call gfn_to_pfn_atomic()
-to pin gfn to pfn, because after gfn_to_pfn_atomic():
-- We have held the refcount of pfn that means the pfn can not be freed and
-  be reused for another gfn.
-- The pfn is writable that means it can not be shared between different gfns
-  by KSM.
-
-Then, we can ensure the dirty bitmaps is correctly set for a gfn.
-
-Currently, to simplify the whole things, we disable fast page fault for
-indirect shadow page.
-
-2): Dirty bit tracking
-In the origin code, the spte can be fast updated (non-atomically) if the
-spte is read-only and the Accessed bit has already been set since the
-Accessed bit and Dirty bit can not be lost.
-
-But it is not true after fast page fault since the spte can be marked
-writable between reading spte and updating spte. Like below case:
-
-At the beginning:
-spte.W = 0
-spte.Accessed = 1
-
-   VCPU 0                                       VCPU0
-In mmu_spte_clear_track_bits():
-
-   old_spte = *spte;
-
-   /* 'if' condition is satisfied. */
-   if (old_spte.Accessed == 1 &&
-        old_spte.W == 0)
-      spte = 0ull;
-                                         on fast page fault path:
-                                             spte.W = 1
-                                         memory write on the spte:
-                                             spte.Dirty = 1
-
-
-   else
-      old_spte = xchg(spte, 0ull)
-
-
-   if (old_spte.Accessed == 1)
-      kvm_set_pfn_accessed(spte.pfn);
-   if (old_spte.Dirty == 1)
-      kvm_set_pfn_dirty(spte.pfn);
-      OOPS!!!
-
-The Dirty bit is lost in this case.
-
-In order to avoid this kind of issue, we always treat the spte as "volatile"
-if it can be updated out of mmu-lock, see spte_has_volatile_bits(), it means,
-the spte is always atomically updated in this case.
-
-3): flush tlbs due to spte updated
-If the spte is updated from writable to readonly, we should flush all TLBs,
-otherwise rmap_write_protect will find a read-only spte, even though the
-writable spte might be cached on a CPU's TLB.
-
-As mentioned before, the spte can be updated to writable out of mmu-lock on
-fast page fault path, in order to easily audit the path, we see if TLBs need
-be flushed caused by this reason in mmu_spte_update() since this is a common
-function to update spte (present -> present).
-
-Since the spte is "volatile" if it can be updated out of mmu-lock, we always
-atomically update the spte, the race caused by fast page fault can be avoided,
-See the comments in spte_has_volatile_bits() and mmu_spte_update().
-
-Lockless Access Tracking:
-
-This is used for Intel CPUs that are using EPT but do not support the EPT A/D
-bits. In this case, when the KVM MMU notifier is called to track accesses to a
-page (via kvm_mmu_notifier_clear_flush_young), it marks the PTE as not-present
-by clearing the RWX bits in the PTE and storing the original R & X bits in
-some unused/ignored bits. In addition, the SPTE_SPECIAL_MASK is also set on the
-PTE (using the ignored bit 62). When the VM tries to access the page later on,
-a fault is generated and the fast page fault mechanism described above is used
-to atomically restore the PTE to a Present state. The W bit is not saved when
-the PTE is marked for access tracking and during restoration to the Present
-state, the W bit is set depending on whether or not it was a write access. If
-it wasn't, then the W bit will remain clear until a write access happens, at 
-which time it will be set using the Dirty tracking mechanism described above.
-
-3. Reference
-------------
-
-Name:          kvm_lock
-Type:          mutex
-Arch:          any
-Protects:      - vm_list
-
-Name:          kvm_count_lock
-Type:          raw_spinlock_t
-Arch:          any
-Protects:      - hardware virtualization enable/disable
-Comment:       'raw' because hardware enabling/disabling must be atomic /wrt
-               migration.
-
-Name:          kvm_arch::tsc_write_lock
-Type:          raw_spinlock
-Arch:          x86
-Protects:      - kvm_arch::{last_tsc_write,last_tsc_nsec,last_tsc_offset}
-               - tsc offset in vmcb
-Comment:       'raw' because updating the tsc offsets must not be preempted.
-
-Name:          kvm->mmu_lock
-Type:          spinlock_t
-Arch:          any
-Protects:      -shadow page/shadow tlb entry
-Comment:       it is a spinlock since it is used in mmu notifier.
-
-Name:          kvm->srcu
-Type:          srcu lock
-Arch:          any
-Protects:      - kvm->memslots
-               - kvm->buses
-Comment:       The srcu read lock must be held while accessing memslots (e.g.
-               when using gfn_to_* functions) and while accessing in-kernel
-               MMIO/PIO address->device structure mapping (kvm->buses).
-               The srcu index can be stored in kvm_vcpu->srcu_idx per vcpu
-               if it is needed by multiple functions.
-
-Name:          blocked_vcpu_on_cpu_lock
-Type:          spinlock_t
-Arch:          x86
-Protects:      blocked_vcpu_on_cpu
-Comment:       This is a per-CPU lock and it is used for VT-d posted-interrupts.
-               When VT-d posted-interrupts is supported and the VM has assigned
-               devices, we put the blocked vCPU on the list blocked_vcpu_on_cpu
-               protected by blocked_vcpu_on_cpu_lock, when VT-d hardware issues
-               wakeup notification event since external interrupts from the
-               assigned devices happens, we will find the vCPU on the list to
-               wakeup.
diff --git a/Documentation/virt/kvm/mmu.txt b/Documentation/virt/kvm/mmu.rst
similarity index 94%
rename from Documentation/virt/kvm/mmu.txt
rename to Documentation/virt/kvm/mmu.rst
index dadb29e8738fea131243a6cb8567a5fd5b6895c1..60981887d20b847ee5fa71719a04d44e6d2b8a65 100644
--- a/Documentation/virt/kvm/mmu.txt
+++ b/Documentation/virt/kvm/mmu.rst
@@ -1,3 +1,6 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+======================
  The x86 kvm shadow mmu
  ======================
  
@@ -7,27 +10,37 @@ physical addresses to host physical addresses.
  
  The mmu code attempts to satisfy the following requirements:
  
-- correctness: the guest should not be able to determine that it is running
+- correctness:
+              the guest should not be able to determine that it is running
                 on an emulated mmu except for timing (we attempt to comply
                 with the specification, not emulate the characteristics of
                 a particular implementation such as tlb size)
-- security:    the guest must not be able to touch host memory not assigned
+- security:
+              the guest must not be able to touch host memory not assigned
                 to it
-- performance: minimize the performance penalty imposed by the mmu
-- scaling:     need to scale to large memory and large vcpu guests
-- hardware:    support the full range of x86 virtualization hardware
-- integration: Linux memory management code must be in control of guest memory
+- performance:
+               minimize the performance penalty imposed by the mmu
+- scaling:
+               need to scale to large memory and large vcpu guests
+- hardware:
+               support the full range of x86 virtualization hardware
+- integration:
+               Linux memory management code must be in control of guest memory
                 so that swapping, page migration, page merging, transparent
                 hugepages, and similar features work without change
-- dirty tracking: report writes to guest memory to enable live migration
+- dirty tracking:
+               report writes to guest memory to enable live migration
                 and framebuffer-based displays
-- footprint:   keep the amount of pinned kernel memory low (most memory
+- footprint:
+               keep the amount of pinned kernel memory low (most memory
                 should be shrinkable)
-- reliability:  avoid multipage or GFP_ATOMIC allocations
+- reliability:
+               avoid multipage or GFP_ATOMIC allocations
  
  Acronyms
  ========
  
+====  ====================================================================
  pfn   host page frame number
  hpa   host physical address
  hva   host virtual address
@@ -41,6 +54,7 @@ pte   page table entry (used also to refer generically to paging structure
  gpte  guest pte (referring to gfns)
  spte  shadow pte (referring to pfns)
  tdp   two dimensional paging (vendor neutral term for NPT and EPT)
+====  ====================================================================
  
  Virtual and real hardware supported
  ===================================
@@ -90,11 +104,13 @@ Events
  The mmu is driven by events, some from the guest, some from the host.
  
  Guest generated events:
+
  - writes to control registers (especially cr3)
  - invlpg/invlpga instruction execution
  - access to missing or protected translations
  
  Host generated events:
+
  - changes in the gpa->hpa translation (either through gpa->hva changes or
    through hva->hpa changes)
  - memory pressure (the shrinker)
@@ -117,16 +133,19 @@ Leaf ptes point at guest pages.
  The following table shows translations encoded by leaf ptes, with higher-level
  translations in parentheses:
  
- Non-nested guests:
+ Non-nested guests::
+
    nonpaging:     gpa->hpa
    paging:        gva->gpa->hpa
    paging, tdp:   (gva->)gpa->hpa
- Nested guests:
+
+ Nested guests::
+
    non-tdp:       ngva->gpa->hpa  (*)
    tdp:           (ngva->)ngpa->gpa->hpa
  
-(*) the guest hypervisor will encode the ngva->gpa translation into its page
-    tables if npt is not present
+  (*) the guest hypervisor will encode the ngva->gpa translation into its page
+      tables if npt is not present
  
  Shadow pages contain the following information:
    role.level:
@@ -291,28 +310,41 @@ Handling a page fault is performed as follows:
  
   - if the RSV bit of the error code is set, the page fault is caused by guest
     accessing MMIO and cached MMIO information is available.
+
     - walk shadow page table
     - check for valid generation number in the spte (see "Fast invalidation of
       MMIO sptes" below)
     - cache the information to vcpu->arch.mmio_gva, vcpu->arch.mmio_access and
       vcpu->arch.mmio_gfn, and call the emulator
+
   - If both P bit and R/W bit of error code are set, this could possibly
     be handled as a "fast page fault" (fixed without taking the MMU lock).  See
     the description in Documentation/virt/kvm/locking.txt.
+
   - if needed, walk the guest page tables to determine the guest translation
     (gva->gpa or ngpa->gpa)
+
     - if permissions are insufficient, reflect the fault back to the guest
+
   - determine the host page
+
     - if this is an mmio request, there is no host page; cache the info to
       vcpu->arch.mmio_gva, vcpu->arch.mmio_access and vcpu->arch.mmio_gfn
+
   - walk the shadow page table to find the spte for the translation,
     instantiating missing intermediate page tables as necessary
+
     - If this is an mmio request, cache the mmio info to the spte and set some
       reserved bit on the spte (see callers of kvm_mmu_set_mmio_spte_mask)
+
   - try to unsynchronize the page
+
     - if successful, we can let the guest continue and modify the gpte
+
   - emulate the instruction
+
     - if failed, unshadow the page and let the guest continue
+
   - update any translations that were modified by the instruction
  
  invlpg handling:
@@ -324,10 +356,12 @@ invlpg handling:
  Guest control register updates:
  
  - mov to cr3
+
    - look up new shadow roots
    - synchronize newly reachable shadow pages
  
  - mov to cr0/cr4/efer
+
    - set up mmu context for new paging mode
    - look up new shadow roots
    - synchronize newly reachable shadow pages
@@ -358,6 +392,7 @@ on fault type:
  (user write faults generate a #PF)
  
  In the first case there are two additional complications:
+
  - if CR4.SMEP is enabled: since we've turned the page into a kernel page,
    the kernel may now execute it.  We handle this by also setting spte.nx.
    If we get a user fetch or read fault, we'll change spte.u=1 and
@@ -446,4 +481,3 @@ Further reading
  
  - NPT presentation from KVM Forum 2008
    http://www.linux-kvm.org/images/c/c8/KvmForum2008%24kdf2008_21.pdf
-
diff --git a/Documentation/virt/kvm/msr.txt b/Documentation/virt/kvm/msr.rst
similarity index 74%
rename from Documentation/virt/kvm/msr.txt
rename to Documentation/virt/kvm/msr.rst
index df1f4338b3caf3e466e783fec42b767cef1ad72c..33892036672d4facfb07294ef09bc29adfe648fe 100644
--- a/Documentation/virt/kvm/msr.txt
+++ b/Documentation/virt/kvm/msr.rst
@@ -1,6 +1,10 @@
-KVM-specific MSRs.
-Glauber Costa <glommer@redhat.com>, Red Hat Inc, 2010
-=====================================================
+.. SPDX-License-Identifier: GPL-2.0
+
+=================
+KVM-specific MSRs
+=================
+
+:Author: Glauber Costa <glommer@redhat.com>, Red Hat Inc, 2010
  
  KVM makes use of some custom MSRs to service some requests.
  
@@ -9,34 +13,39 @@ Custom MSRs have a range reserved for them, that goes from
  but they are deprecated and their use is discouraged.
  
  Custom MSR list
---------
+---------------
  
  The current supported Custom MSR list is:
  
-MSR_KVM_WALL_CLOCK_NEW:   0x4b564d00
+MSR_KVM_WALL_CLOCK_NEW:
+       0x4b564d00
  
-       data: 4-byte alignment physical address of a memory area which must be
+data:
+       4-byte alignment physical address of a memory area which must be
         in guest RAM. This memory is expected to hold a copy of the following
-       structure:
+       structure::
  
-       struct pvclock_wall_clock {
+        struct pvclock_wall_clock {
                 u32   version;
                 u32   sec;
                 u32   nsec;
-       } __attribute__((__packed__));
+         } __attribute__((__packed__));
  
         whose data will be filled in by the hypervisor. The hypervisor is only
         guaranteed to update this data at the moment of MSR write.
         Users that want to reliably query this information more than once have
         to write more than once to this MSR. Fields have the following meanings:
  
-               version: guest has to check version before and after grabbing
+       version:
+               guest has to check version before and after grabbing
                 time information and check that they are both equal and even.
                 An odd version indicates an in-progress update.
  
-               sec: number of seconds for wallclock at time of boot.
+       sec:
+                number of seconds for wallclock at time of boot.
  
-               nsec: number of nanoseconds for wallclock at time of boot.
+       nsec:
+                number of nanoseconds for wallclock at time of boot.
  
         In order to get the current wallclock time, the system_time from
         MSR_KVM_SYSTEM_TIME_NEW needs to be added.
@@ -47,13 +56,15 @@ MSR_KVM_WALL_CLOCK_NEW:   0x4b564d00
         Availability of this MSR must be checked via bit 3 in 0x4000001 cpuid
         leaf prior to usage.
  
-MSR_KVM_SYSTEM_TIME_NEW:  0x4b564d01
+MSR_KVM_SYSTEM_TIME_NEW:
+       0x4b564d01
  
-       data: 4-byte aligned physical address of a memory area which must be in
+data:
+       4-byte aligned physical address of a memory area which must be in
         guest RAM, plus an enable bit in bit 0. This memory is expected to hold
-       a copy of the following structure:
+       a copy of the following structure::
  
-       struct pvclock_vcpu_time_info {
+         struct pvclock_vcpu_time_info {
                 u32   version;
                 u32   pad0;
                 u64   tsc_timestamp;
@@ -62,7 +73,7 @@ MSR_KVM_SYSTEM_TIME_NEW:  0x4b564d01
                 s8    tsc_shift;
                 u8    flags;
                 u8    pad[2];
-       } __attribute__((__packed__)); /* 32 bytes */
+         } __attribute__((__packed__)); /* 32 bytes */
  
         whose data will be filled in by the hypervisor periodically. Only one
         write, or registration, is needed for each VCPU. The interval between
@@ -72,23 +83,28 @@ MSR_KVM_SYSTEM_TIME_NEW:  0x4b564d01
  
         Fields have the following meanings:
  
-               version: guest has to check version before and after grabbing
+       version:
+               guest has to check version before and after grabbing
                 time information and check that they are both equal and even.
                 An odd version indicates an in-progress update.
  
-               tsc_timestamp: the tsc value at the current VCPU at the time
+       tsc_timestamp:
+               the tsc value at the current VCPU at the time
                 of the update of this structure. Guests can subtract this value
                 from current tsc to derive a notion of elapsed time since the
                 structure update.
  
-               system_time: a host notion of monotonic time, including sleep
+       system_time:
+               a host notion of monotonic time, including sleep
                 time at the time this structure was last updated. Unit is
                 nanoseconds.
  
-               tsc_to_system_mul: multiplier to be used when converting
+       tsc_to_system_mul:
+               multiplier to be used when converting
                 tsc-related quantity to nanoseconds
  
-               tsc_shift: shift to be used when converting tsc-related
+       tsc_shift:
+               shift to be used when converting tsc-related
                 quantity to nanoseconds. This shift will ensure that
                 multiplication with tsc_to_system_mul does not overflow.
                 A positive value denotes a left shift, a negative value
@@ -96,7 +112,7 @@ MSR_KVM_SYSTEM_TIME_NEW:  0x4b564d01
  
                 The conversion from tsc to nanoseconds involves an additional
                 right shift by 32 bits. With this information, guests can
-               derive per-CPU time by doing:
+               derive per-CPU time by doing::
  
                         time = (current_tsc - tsc_timestamp)
                         if (tsc_shift >= 0)
@@ -106,29 +122,34 @@ MSR_KVM_SYSTEM_TIME_NEW:  0x4b564d01
                         time = (time * tsc_to_system_mul) >> 32
                         time = time + system_time
  
-               flags: bits in this field indicate extended capabilities
+       flags:
+               bits in this field indicate extended capabilities
                 coordinated between the guest and the hypervisor. Availability
                 of specific flags has to be checked in 0x40000001 cpuid leaf.
                 Current flags are:
  
-                flag bit   | cpuid bit    | meaning
-               -------------------------------------------------------------
-                           |              | time measures taken across
-                    0      |      24      | multiple cpus are guaranteed to
-                           |              | be monotonic
-               -------------------------------------------------------------
-                           |              | guest vcpu has been paused by
-                    1      |     N/A      | the host
-                           |              | See 4.70 in api.txt
-               -------------------------------------------------------------
+
+               +-----------+--------------+----------------------------------+
+               | flag bit  | cpuid bit    | meaning                          |
+               +-----------+--------------+----------------------------------+
+               |           |              | time measures taken across       |
+               |    0      |      24      | multiple cpus are guaranteed to  |
+               |           |              | be monotonic                     |
+               +-----------+--------------+----------------------------------+
+               |           |              | guest vcpu has been paused by    |
+               |    1      |     N/A      | the host                         |
+               |           |              | See 4.70 in api.txt              |
+               +-----------+--------------+----------------------------------+
  
         Availability of this MSR must be checked via bit 3 in 0x4000001 cpuid
         leaf prior to usage.
  
  
-MSR_KVM_WALL_CLOCK:  0x11
+MSR_KVM_WALL_CLOCK:
+       0x11
  
-       data and functioning: same as MSR_KVM_WALL_CLOCK_NEW. Use that instead.
+data and functioning:
+       same as MSR_KVM_WALL_CLOCK_NEW. Use that instead.
  
         This MSR falls outside the reserved KVM range and may be removed in the
         future. Its usage is deprecated.
@@ -136,9 +157,11 @@ MSR_KVM_WALL_CLOCK:  0x11
         Availability of this MSR must be checked via bit 0 in 0x4000001 cpuid
         leaf prior to usage.
  
-MSR_KVM_SYSTEM_TIME: 0x12
+MSR_KVM_SYSTEM_TIME:
+       0x12
  
-       data and functioning: same as MSR_KVM_SYSTEM_TIME_NEW. Use that instead.
+data and functioning:
+       same as MSR_KVM_SYSTEM_TIME_NEW. Use that instead.
  
         This MSR falls outside the reserved KVM range and may be removed in the
         future. Its usage is deprecated.
@@ -146,7 +169,7 @@ MSR_KVM_SYSTEM_TIME: 0x12
         Availability of this MSR must be checked via bit 0 in 0x4000001 cpuid
         leaf prior to usage.
  
-       The suggested algorithm for detecting kvmclock presence is then:
+       The suggested algorithm for detecting kvmclock presence is then::
  
                 if (!kvm_para_available())    /* refer to cpuid.txt */
                         return NON_PRESENT;
@@ -163,8 +186,11 @@ MSR_KVM_SYSTEM_TIME: 0x12
                 } else
                         return NON_PRESENT;
  
-MSR_KVM_ASYNC_PF_EN: 0x4b564d02
-       data: Bits 63-6 hold 64-byte aligned physical address of a
+MSR_KVM_ASYNC_PF_EN:
+       0x4b564d02
+
+data:
+       Bits 63-6 hold 64-byte aligned physical address of a
         64 byte memory area which must be in guest RAM and must be
         zeroed. Bits 5-3 are reserved and should be zero. Bit 0 is 1
         when asynchronous page faults are enabled on the vcpu 0 when
@@ -200,20 +226,22 @@ MSR_KVM_ASYNC_PF_EN: 0x4b564d02
         Currently type 2 APF will be always delivered on the same vcpu as
         type 1 was, but guest should not rely on that.
  
-MSR_KVM_STEAL_TIME: 0x4b564d03
+MSR_KVM_STEAL_TIME:
+       0x4b564d03
  
-       data: 64-byte alignment physical address of a memory area which must be
+data:
+       64-byte alignment physical address of a memory area which must be
         in guest RAM, plus an enable bit in bit 0. This memory is expected to
-       hold a copy of the following structure:
+       hold a copy of the following structure::
  
-       struct kvm_steal_time {
+         struct kvm_steal_time {
                 __u64 steal;
                 __u32 version;
                 __u32 flags;
                 __u8  preempted;
                 __u8  u8_pad[3];
                 __u32 pad[11];
-       }
+         }
  
         whose data will be filled in by the hypervisor periodically. Only one
         write, or registration, is needed for each VCPU. The interval between
@@ -224,25 +252,32 @@ MSR_KVM_STEAL_TIME: 0x4b564d03
  
         Fields have the following meanings:
  
-               version: a sequence counter. In other words, guest has to check
+       version:
+               a sequence counter. In other words, guest has to check
                 this field before and after grabbing time information and make
                 sure they are both equal and even. An odd version indicates an
                 in-progress update.
  
-               flags: At this point, always zero. May be used to indicate
+       flags:
+               At this point, always zero. May be used to indicate
                 changes in this structure in the future.
  
-               steal: the amount of time in which this vCPU did not run, in
+       steal:
+               the amount of time in which this vCPU did not run, in
                 nanoseconds. Time during which the vcpu is idle, will not be
                 reported as steal time.
  
-               preempted: indicate the vCPU who owns this struct is running or
+       preempted:
+               indicate the vCPU who owns this struct is running or
                 not. Non-zero values mean the vCPU has been preempted. Zero
                 means the vCPU is not preempted. NOTE, it is always zero if the
                 the hypervisor doesn't support this field.
  
-MSR_KVM_EOI_EN: 0x4b564d04
-       data: Bit 0 is 1 when PV end of interrupt is enabled on the vcpu; 0
+MSR_KVM_EOI_EN:
+       0x4b564d04
+
+data:
+       Bit 0 is 1 when PV end of interrupt is enabled on the vcpu; 0
         when disabled.  Bit 1 is reserved and must be zero.  When PV end of
         interrupt is enabled (bit 0 set), bits 63-2 hold a 4-byte aligned
         physical address of a 4 byte memory area which must be in guest RAM and
@@ -274,11 +309,13 @@ MSR_KVM_EOI_EN: 0x4b564d04
         clear it using a single CPU instruction, such as test and clear, or
         compare and exchange.
  
-MSR_KVM_POLL_CONTROL: 0x4b564d05
+MSR_KVM_POLL_CONTROL:
+       0x4b564d05
+
         Control host-side polling.
  
-       data: Bit 0 enables (1) or disables (0) host-side HLT polling logic.
+data:
+       Bit 0 enables (1) or disables (0) host-side HLT polling logic.
  
         KVM guests can request the host not to poll on HLT, for example if
         they are performing polling themselves.
-
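Reviewer's note, not part of the patch: the `version` protocol described for MSR_KVM_STEAL_TIME in the hunks above can be sketched in plain C as below. This is a minimal illustration under stated assumptions. The structure layout is copied from the documentation, with `<stdint.h>` types standing in for `__u64`/`__u32`/`__u8`. The helper name `read_steal_ns` and the use of C11 fences are assumptions made for the example; they are not the kernel's implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Layout copied from the documentation; fixed-width types replace __u64 etc. */
struct kvm_steal_time {
	uint64_t steal;
	uint32_t version;
	uint32_t flags;
	uint8_t  preempted;
	uint8_t  u8_pad[3];
	uint32_t pad[11];
};

/* The MSR points at a 64-byte aligned area; the structure fills 64 bytes. */
static_assert(sizeof(struct kvm_steal_time) == 64, "unexpected layout");

/*
 * Hypothetical helper: return a consistent snapshot of the steal counter.
 * The read is retried while the version is odd (update in progress) or
 * differs between the two samples taken around the read.
 */
static inline uint64_t read_steal_ns(const volatile struct kvm_steal_time *st)
{
	uint32_t before, after;
	uint64_t steal;

	do {
		before = st->version;
		/* Assumption: C11 fences stand in for the guest's read barriers. */
		atomic_thread_fence(memory_order_acquire);
		steal = st->steal;
		atomic_thread_fence(memory_order_acquire);
		after = st->version;
	} while ((before & 1) || before != after);

	return steal;
}
```

A caller would pass the per-vCPU area it registered through the MSR. The loop only terminates once the hypervisor has left the version even and unchanged across the read.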
diff --git a/Documentation/virt/kvm/nested-vmx.txt b/Documentation/virt/kvm/nested-vmx.rst
similarity index 90%
rename from Documentation/virt/kvm/nested-vmx.txt
rename to Documentation/virt/kvm/nested-vmx.rst
index 97eb1353e9624463d846dee3f944ac69c6b1704a..592b0ab6970b14fed9f5f57e8790a426a331695b 100644
--- a/Documentation/virt/kvm/nested-vmx.txt
+++ b/Documentation/virt/kvm/nested-vmx.rst
@@ -1,3 +1,6 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==========
  Nested VMX
  ==========
  
@@ -41,9 +44,9 @@ No modifications are required to user space (qemu). However, qemu's default
  emulated CPU type (qemu64) does not list the "VMX" CPU feature, so it must be
  explicitly enabled, by giving qemu one of the following options:
  
-     -cpu host              (emulated CPU has all features of the real CPU)
+     - cpu host              (emulated CPU has all features of the real CPU)
  
-     -cpu qemu64,+vmx       (add just the vmx feature to a named CPU type)
+     - cpu qemu64,+vmx       (add just the vmx feature to a named CPU type)
  
  
  ABIs
@@ -75,6 +78,8 @@ of this structure changes, this can break live migration across KVM versions.
  VMCS12_REVISION (from vmx.c) should be changed if struct vmcs12 or its inner
  struct shadow_vmcs is ever changed.
  
+::
+
         typedef u64 natural_width;
         struct __packed vmcs12 {
                 /* According to the Intel spec, a VMCS region must start with
@@ -220,21 +225,21 @@ Authors
  -------
  
  These patches were written by:
-     Abel Gordon, abelg <at> il.ibm.com
-     Nadav Har'El, nyh <at> il.ibm.com
-     Orit Wasserman, oritw <at> il.ibm.com
-     Ben-Ami Yassor, benami <at> il.ibm.com
-     Muli Ben-Yehuda, muli <at> il.ibm.com
+    - Abel Gordon, abelg <at> il.ibm.com
+    - Nadav Har'El, nyh <at> il.ibm.com
+    - Orit Wasserman, oritw <at> il.ibm.com
+    - Ben-Ami Yassor, benami <at> il.ibm.com
+    - Muli Ben-Yehuda, muli <at> il.ibm.com
  
  With contributions by:
-     Anthony Liguori, aliguori <at> us.ibm.com
-     Mike Day, mdday <at> us.ibm.com
-     Michael Factor, factor <at> il.ibm.com
-     Zvi Dubitzky, dubi <at> il.ibm.com
+    - Anthony Liguori, aliguori <at> us.ibm.com
+    - Mike Day, mdday <at> us.ibm.com
+    - Michael Factor, factor <at> il.ibm.com
+    - Zvi Dubitzky, dubi <at> il.ibm.com
  
  And valuable reviews by:
-     Avi Kivity, avi <at> redhat.com
-     Gleb Natapov, gleb <at> redhat.com
-     Marcelo Tosatti, mtosatti <at> redhat.com
-     Kevin Tian, kevin.tian <at> intel.com
-     and others.
+    - Avi Kivity, avi <at> redhat.com
+    - Gleb Natapov, gleb <at> redhat.com
+    - Marcelo Tosatti, mtosatti <at> redhat.com
+    - Kevin Tian, kevin.tian <at> intel.com
+    - and others.
diff --git a/Documentation/virt/kvm/ppc-pv.txt b/Documentation/virt/kvm/ppc-pv.rst
similarity index 91%
rename from Documentation/virt/kvm/ppc-pv.txt
rename to Documentation/virt/kvm/ppc-pv.rst
index e26115ce4258bdefb369a73ecb3371108bce2393..5fdb907670be01ecdf1e44fbcaa99dbc2480985e 100644
--- a/Documentation/virt/kvm/ppc-pv.txt
+++ b/Documentation/virt/kvm/ppc-pv.rst
@@ -1,3 +1,6 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=================================
  The PPC KVM paravirtual interface
  =================================
  
@@ -34,8 +37,9 @@ up the hypercall. To call a hypercall, just call these instructions.
  
  The parameters are as follows:
  
+        ========       ================        ================
         Register        IN                      OUT
-
+        ========       ================        ================
         r0              -                       volatile
         r3              1st parameter           Return code
         r4              2nd parameter           1st output value
@@ -47,6 +51,7 @@ The parameters are as follows:
         r10             8th parameter           7th output value
         r11             hypercall number        8th output value
         r12             -                       volatile
+        ========       ================        ================
  
  Hypercall definitions are shared in generic code, so the same hypercall numbers
  apply for x86 and powerpc alike with the exception that each KVM hypercall
@@ -54,11 +59,13 @@ also needs to be ORed with the KVM vendor code which is (42 << 16).
  
  Return codes can be as follows:
  
+       ====            =========================
         Code            Meaning
-
+       ====            =========================
         0               Success
         12              Hypercall not implemented
         <0              Error
+       ====            =========================
  
  The magic page
  ==============
@@ -72,7 +79,7 @@ desired location. The first parameter indicates the effective address when the
  MMU is enabled. The second parameter indicates the address in real mode, if
  applicable to the target. For now, we always map the page to -4096. This way we
  can access it using absolute load and store functions. The following
-instruction reads the first field of the magic page:
+instruction reads the first field of the magic page::
  
         ld      rX, -4096(0)
  
@@ -93,8 +100,10 @@ a bitmap of available features inside the magic page.
  
  The following enhancements to the magic page are currently available:
  
+  ============================  =======================================
    KVM_MAGIC_FEAT_SR            Maps SR registers r/w in the magic page
    KVM_MAGIC_FEAT_MAS0_TO_SPRG7 Maps MASn, ESR, PIR and high SPRGs
+  ============================  =======================================
  
  For enhanced features in the magic page, please check for the existence of the
  feature before using them!
@@ -121,8 +130,8 @@ when entering the guest or don't have any impact on the hypervisor's behavior.
  
  The following bits are safe to be set inside the guest:
  
-  MSR_EE
-  MSR_RI
+  - MSR_EE
+  - MSR_RI
  
  If any other bit changes in the MSR, please still use mtmsr(d).
  
@@ -138,9 +147,9 @@ guest. Implementing any of those mappings is optional, as the instruction traps
  also act on the shared page. So calling privileged instructions still works as
  before.
  
+======================= ================================
  From                   To
-====                   ==
-
+======================= ================================
  mfmsr  rX              ld      rX, magic_page->msr
  mfsprg rX, 0           ld      rX, magic_page->sprg0
  mfsprg rX, 1           ld      rX, magic_page->sprg1
@@ -173,7 +182,7 @@ mtsrin      rX, rY          b       <special mtsrin section>
  
  [BookE only]
  wrteei [0|1]           b       <special wrteei section>
-
+======================= ================================
  
  Some instructions require more logic to determine what's going on than a load
  or store instruction can deliver. To enable patching of those, we keep some
@@ -191,6 +200,7 @@ for example.
  
  Hypercall ABIs in KVM on PowerPC
  =================================
+
  1) KVM hypercalls (ePAPR)
  
  These are ePAPR compliant hypercall implementation (mentioned above). Even
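Reviewer's note, not part of the patch: the ppc-pv text in this diff says each KVM hypercall number is ORed with the vendor code `(42 << 16)` and lists three return codes. A minimal sketch of both follows. The names `KVM_VENDOR_CODE`, `kvm_hcall_token`, and `classify_hcall_return` are made up for this example, and the sketch assumes the hypercall number fits in the low 16 bits.

```c
#include <stdint.h>

/* Vendor code quoted in the documentation: (42 << 16). */
#define KVM_VENDOR_CODE (UINT32_C(42) << 16)

/* Outcomes from the "Code / Meaning" table; HCALL_OTHER covers unlisted codes. */
enum hcall_status { HCALL_OK, HCALL_UNIMPLEMENTED, HCALL_ERROR, HCALL_OTHER };

/* Hypothetical helper: value to place in r11 for hypercall number nr.
 * Assumes nr is below 1 << 16 so it does not collide with the vendor code. */
static inline uint32_t kvm_hcall_token(uint32_t nr)
{
	return KVM_VENDOR_CODE | nr;
}

/* Hypothetical helper: map the value returned in r3 to the table's meaning. */
static inline enum hcall_status classify_hcall_return(long code)
{
	if (code == 0)
		return HCALL_OK;
	if (code == 12)
		return HCALL_UNIMPLEMENTED;
	if (code < 0)
		return HCALL_ERROR;
	return HCALL_OTHER;
}
```

For example, hypercall number 1 would be issued with `0x002A0001` in r11 under these assumptions.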
diff --git a/Documentation/virt/kvm/review-checklist.txt b/Documentation/virt/kvm/review-checklist.rst
similarity index 95%
rename from Documentation/virt/kvm/review-checklist.txt
rename to Documentation/virt/kvm/review-checklist.rst
index 499af499e296fcbdf78e6d65651246a8dc33d10e..1f86a9d3f7057117e30e14add987a5faa8a3f2a0 100644
--- a/Documentation/virt/kvm/review-checklist.txt
+++ b/Documentation/virt/kvm/review-checklist.rst
@@ -1,3 +1,6 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+================================
  Review checklist for kvm patches
  ================================
  
diff --git a/Documentation/virt/kvm/s390-diag.txt b/Documentation/virt/kvm/s390-diag.rst
similarity index 90%
rename from Documentation/virt/kvm/s390-diag.txt
rename to Documentation/virt/kvm/s390-diag.rst
index 7c52e5f8b210357075cc115500638c4b5841bab9..eaac4864d3d62e1c1f35ab7a2518ebf6f293d2d1 100644
--- a/Documentation/virt/kvm/s390-diag.txt
+++ b/Documentation/virt/kvm/s390-diag.rst
@@ -1,3 +1,6 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=============================
  The s390 DIAGNOSE call on KVM
  =============================
  
@@ -16,12 +19,12 @@ DIAGNOSE calls by the guest cause a mandatory intercept. This implies
  all supported DIAGNOSE calls need to be handled by either KVM or its
  userspace.
  
-All DIAGNOSE calls supported by KVM use the RS-a format:
+All DIAGNOSE calls supported by KVM use the RS-a format::
  
---------------------------------------
-|  '83'  | R1 | R3 | B2 |     D2     |
---------------------------------------
-0        8    12   16   20           31
+  --------------------------------------
+  |  '83'  | R1 | R3 | B2 |     D2     |
+  --------------------------------------
+  0        8    12   16   20           31
  
  The second-operand address (obtained by the base/displacement calculation)
  is not used to address data. Instead, bits 48-63 of this address specify
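Reviewer's note, not part of the patch: the RS-a layout drawn in this hunk (opcode in bits 0-7, R1 in 8-11, R3 in 12-15, B2 in 16-19, D2 in 20-31, with bit 0 the most significant) can be decoded as sketched below. The type `rs_a_insn` and the function `decode_rs_a` are illustrative names. The sketch assumes the instruction has already been loaded as a big-endian 32-bit value.

```c
#include <stdint.h>

/* Fields of an RS-a format instruction, as in the diagram. */
struct rs_a_insn {
	uint8_t  opcode; /* bits 0-7, 0x83 for DIAGNOSE */
	uint8_t  r1;     /* bits 8-11 */
	uint8_t  r3;     /* bits 12-15 */
	uint8_t  b2;     /* bits 16-19, base register */
	uint16_t d2;     /* bits 20-31, displacement */
};

/* Hypothetical helper: split a 32-bit instruction word into its fields.
 * Bit 0 is the most significant bit, following the diagram's numbering. */
static inline struct rs_a_insn decode_rs_a(uint32_t word)
{
	struct rs_a_insn insn = {
		.opcode = (uint8_t)(word >> 24),
		.r1     = (uint8_t)((word >> 20) & 0xf),
		.r3     = (uint8_t)((word >> 16) & 0xf),
		.b2     = (uint8_t)((word >> 12) & 0xf),
		.d2     = (uint16_t)(word & 0xfff),
	};
	return insn;
}
```

The second-operand address is then formed from `b2` and `d2` by the usual base/displacement calculation, which this sketch leaves out.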
diff --git a/Documentation/virt/kvm/timekeeping.txt b/Documentation/virt/kvm/timekeeping.rst
similarity index 85%
rename from Documentation/virt/kvm/timekeeping.txt
rename to Documentation/virt/kvm/timekeeping.rst
index 76808a17ad84f108f0e895e2aa6ea1bf8d63d500..21ae7efa29ba19bdb6f0a67208e02931cc5f15fc 100644
--- a/Documentation/virt/kvm/timekeeping.txt
+++ b/Documentation/virt/kvm/timekeeping.rst
@@ -1,17 +1,21 @@
+.. SPDX-License-Identifier: GPL-2.0
  
-       Timekeeping Virtualization for X86-Based Architectures
+======================================================
+Timekeeping Virtualization for X86-Based Architectures
+======================================================
  
-       Zachary Amsden <zamsden@redhat.com>
-       Copyright (c) 2010, Red Hat.  All rights reserved.
+:Author: Zachary Amsden <zamsden@redhat.com>
+:Copyright: (c) 2010, Red Hat.  All rights reserved.
  
-1) Overview
-2) Timing Devices
-3) TSC Hardware
-4) Virtualization Problems
+.. Contents
  
-=========================================================================
+   1) Overview
+   2) Timing Devices
+   3) TSC Hardware
+   4) Virtualization Problems
  
-1) Overview
+1. Overview
+===========
  
  One of the most complicated parts of the X86 platform, and specifically,
  the virtualization of this platform is the plethora of timing devices available
@@ -27,15 +31,15 @@ The purpose of this document is to collect data and information relevant to
  timekeeping which may be difficult to find elsewhere, specifically,
  information relevant to KVM and hardware-based virtualization.
  
-=========================================================================
-
-2) Timing Devices
+2. Timing Devices
+=================
  
  First we discuss the basic hardware devices available.  TSC and the related
  KVM clock are special enough to warrant a full exposition and are described in
  the following section.
  
-2.1) i8254 - PIT
+2.1. i8254 - PIT
+----------------
  
  One of the first timer devices available is the programmable interrupt timer,
  or PIT.  The PIT has a fixed frequency 1.193182 MHz base clock and three
@@ -50,13 +54,13 @@ The PIT uses I/O ports 0x40 - 0x43.  Access to the 16-bit counters is done
  using single or multiple byte access to the I/O ports.  There are 6 modes
  available, but not all modes are available to all timers, as only timer 2
  has a connected gate input, required for modes 1 and 5.  The gate line is
-controlled by port 61h, bit 0, as illustrated in the following diagram.
+controlled by port 61h, bit 0, as illustrated in the following diagram::
  
- --------------             ----------------
-|              |           |                |
-|  1.1932 MHz  |---------->| CLOCK      OUT | ---------> IRQ 0
-|    Clock     |   |       |                |
- --------------    |    +->| GATE  TIMER 0  |
+  --------------             ----------------
+  |            |           |                |
+  |  1.1932 MHz|---------->| CLOCK      OUT | ---------> IRQ 0
+  |    Clock   |   |       |                |
+  --------------   |    +->| GATE  TIMER 0  |
                     |        ----------------
                     |
                     |        ----------------
@@ -70,29 +74,33 @@ controlled by port 61h, bit 0, as illustrated in the following diagram.
                     |       |                |
                     |------>| CLOCK      OUT | ---------> Port 61h, bit 5
                             |                |      |
-Port 61h, bit 0 ---------->| GATE  TIMER 2  |       \_.----   ____
+  Port 61h, bit 0 -------->| GATE  TIMER 2  |       \_.----   ____
                              ----------------         _|    )--|LPF|---Speaker
                                                      / *----   \___/
-Port 61h, bit 1 -----------------------------------/
+  Port 61h, bit 1 ---------------------------------/
  
  The timer modes are now described.
  
-Mode 0: Single Timeout.   This is a one-shot software timeout that counts down
+Mode 0: Single Timeout.
+ This is a one-shot software timeout that counts down
   when the gate is high (always true for timers 0 and 1).  When the count
   reaches zero, the output goes high.
  
-Mode 1: Triggered One-shot.  The output is initially set high.  When the gate
+Mode 1: Triggered One-shot.
+ The output is initially set high.  When the gate
   line is set high, a countdown is initiated (which does not stop if the gate is
   lowered), during which the output is set low.  When the count reaches zero,
   the output goes high.
  
-Mode 2: Rate Generator.  The output is initially set high.  When the countdown
+Mode 2: Rate Generator.
+ The output is initially set high.  When the countdown
   reaches 1, the output goes low for one count and then returns high.  The value
   is reloaded and the countdown automatically resumes.  If the gate line goes
   low, the count is halted.  If the output is low when the gate is lowered, the
   output automatically goes high (this only affects timer 2).
  
-Mode 3: Square Wave.   This generates a high / low square wave.  The count
+Mode 3: Square Wave.
+ This generates a high / low square wave.  The count
   determines the length of the pulse, which alternates between high and low
   when zero is reached.  The count only proceeds when gate is high and is
   automatically reloaded on reaching zero.  The count is decremented twice at
@@ -103,12 +111,14 @@ Mode 3: Square Wave.   This generates a high / low square wave.  The count
   values are not observed when reading.  This is the intended mode for timer 2,
   which generates sine-like tones by low-pass filtering the square wave output.
  
-Mode 4: Software Strobe.  After programming this mode and loading the counter,
+Mode 4: Software Strobe.
+ After programming this mode and loading the counter,
   the output remains high until the counter reaches zero.  Then the output
   goes low for 1 clock cycle and returns high.  The counter is not reloaded.
   Counting only occurs when gate is high.
  
-Mode 5: Hardware Strobe.  After programming and loading the counter, the
+Mode 5: Hardware Strobe.
+ After programming and loading the counter, the
   output remains high.  When the gate is raised, a countdown is initiated
   (which does not stop if the gate is lowered).  When the counter reaches zero,
   the output goes low for 1 clock cycle and then returns high.  The counter is
@@ -118,49 +128,49 @@ In addition to normal binary counting, the PIT supports BCD counting.  The
  command port, 0x43 is used to set the counter and mode for each of the three
  timers.
  
-PIT commands, issued to port 0x43, using the following bit encoding:
+PIT commands, issued to port 0x43, using the following bit encoding::
  
-Bit 7-4: Command (See table below)
-Bit 3-1: Mode (000 = Mode 0, 101 = Mode 5, 11X = undefined)
-Bit 0  : Binary (0) / BCD (1)
+  Bit 7-4: Command (See table below)
+  Bit 3-1: Mode (000 = Mode 0, 101 = Mode 5, 11X = undefined)
+  Bit 0  : Binary (0) / BCD (1)
  
-Command table:
+Command table::
  
-0000 - Latch Timer 0 count for port 0x40
+  0000 - Latch Timer 0 count for port 0x40
         sample and hold the count to be read in port 0x40;
         additional commands ignored until counter is read;
         mode bits ignored.
  
-0001 - Set Timer 0 LSB mode for port 0x40
+  0001 - Set Timer 0 LSB mode for port 0x40
         set timer to read LSB only and force MSB to zero;
         mode bits set timer mode
  
-0010 - Set Timer 0 MSB mode for port 0x40
+  0010 - Set Timer 0 MSB mode for port 0x40
         set timer to read MSB only and force LSB to zero;
         mode bits set timer mode
  
-0011 - Set Timer 0 16-bit mode for port 0x40
+  0011 - Set Timer 0 16-bit mode for port 0x40
         set timer to read / write LSB first, then MSB;
         mode bits set timer mode
  
-0100 - Latch Timer 1 count for port 0x41 - as described above
-0101 - Set Timer 1 LSB mode for port 0x41 - as described above
-0110 - Set Timer 1 MSB mode for port 0x41 - as described above
-0111 - Set Timer 1 16-bit mode for port 0x41 - as described above
+  0100 - Latch Timer 1 count for port 0x41 - as described above
+  0101 - Set Timer 1 LSB mode for port 0x41 - as described above
+  0110 - Set Timer 1 MSB mode for port 0x41 - as described above
+  0111 - Set Timer 1 16-bit mode for port 0x41 - as described above
  
-1000 - Latch Timer 2 count for port 0x42 - as described above
-1001 - Set Timer 2 LSB mode for port 0x42 - as described above
-1010 - Set Timer 2 MSB mode for port 0x42 - as described above
-1011 - Set Timer 2 16-bit mode for port 0x42 as described above
+  1000 - Latch Timer 2 count for port 0x42 - as described above
+  1001 - Set Timer 2 LSB mode for port 0x42 - as described above
+  1010 - Set Timer 2 MSB mode for port 0x42 - as described above
+  1011 - Set Timer 2 16-bit mode for port 0x42 as described above
  
-1101 - General counter latch
+  1101 - General counter latch
         Latch combination of counters into corresponding ports
         Bit 3 = Counter 2
         Bit 2 = Counter 1
         Bit 1 = Counter 0
         Bit 0 = Unused
  
-1110 - Latch timer status
+  1110 - Latch timer status
         Latch combination of counter mode into corresponding ports
         Bit 3 = Counter 2
         Bit 2 = Counter 1
@@ -177,7 +187,8 @@ Command table:
         Bit 3-1 = Mode
         Bit 0 = Binary (0) / BCD mode (1)
  
-2.2) RTC
+2.2. RTC
+--------
  
  The second device which was available in the original PC was the MC146818 real
  time clock.  The original device is now obsolete, and usually emulated by the
@@ -201,21 +212,21 @@ in progress, as indicated in the status register.
  The clock uses a 32.768kHz crystal, so bits 6-4 of register A should be
  programmed to a 32kHz divider if the RTC is to count seconds.
  
-This is the RAM map originally used for the RTC/CMOS:
-
-Location    Size    Description
-------------------------------------------
-00h         byte    Current second (BCD)
-01h         byte    Seconds alarm (BCD)
-02h         byte    Current minute (BCD)
-03h         byte    Minutes alarm (BCD)
-04h         byte    Current hour (BCD)
-05h         byte    Hours alarm (BCD)
-06h         byte    Current day of week (BCD)
-07h         byte    Current day of month (BCD)
-08h         byte    Current month (BCD)
-09h         byte    Current year (BCD)
-0Ah         byte    Register A
+This is the RAM map originally used for the RTC/CMOS::
+
+  Location    Size    Description
+  ------------------------------------------
+  00h         byte    Current second (BCD)
+  01h         byte    Seconds alarm (BCD)
+  02h         byte    Current minute (BCD)
+  03h         byte    Minutes alarm (BCD)
+  04h         byte    Current hour (BCD)
+  05h         byte    Hours alarm (BCD)
+  06h         byte    Current day of week (BCD)
+  07h         byte    Current day of month (BCD)
+  08h         byte    Current month (BCD)
+  09h         byte    Current year (BCD)
+  0Ah         byte    Register A
                         bit 7   = Update in progress
                         bit 6-4 = Divider for clock
                                    000 = 4.194 MHz
@@ -234,7 +245,7 @@ Location    Size    Description
                                   1101 = 125 mS
                                   1110 = 250 mS
                                   1111 = 500 mS
-0Bh         byte    Register B
+  0Bh         byte    Register B
                         bit 7   = Run (0) / Halt (1)
                         bit 6   = Periodic interrupt enable
                         bit 5   = Alarm interrupt enable
@@ -243,19 +254,20 @@ Location    Size    Description
                         bit 2   = BCD calendar (0) / Binary (1)
                         bit 1   = 12-hour mode (0) / 24-hour mode (1)
                         bit 0   = 0 (DST off) / 1 (DST enabled)
-OCh         byte    Register C (read only)
+  OCh         byte    Register C (read only)
                         bit 7   = interrupt request flag (IRQF)
                         bit 6   = periodic interrupt flag (PF)
                         bit 5   = alarm interrupt flag (AF)
                         bit 4   = update interrupt flag (UF)
                         bit 3-0 = reserved
-ODh         byte    Register D (read only)
+  ODh         byte    Register D (read only)
                         bit 7   = RTC has power
                         bit 6-0 = reserved
-32h         byte    Current century BCD (*)
+  32h         byte    Current century BCD (*)
    (*) location vendor specific and now determined from ACPI global tables
  
-2.3) APIC
+2.3. APIC
+---------
  
  On Pentium and later processors, an on-board timer is available to each CPU
  as part of the Advanced Programmable Interrupt Controller.  The APIC is
@@ -276,7 +288,8 @@ timer is programmed through the LVT (local vector timer) register, is capable
  of one-shot or periodic operation, and is based on the bus clock divided down
  by the programmable divider register.
  
-2.4) HPET
+2.4. HPET
+---------
  
  HPET is quite complex, and was originally intended to replace the PIT / RTC
  support of the X86 PC.  It remains to be seen whether that will be the case, as
@@ -297,7 +310,8 @@ indicated through ACPI tables by the BIOS.
  Detailed specification of the HPET is beyond the current scope of this
  document, as it is also very well documented elsewhere.
  
-2.5) Offboard Timers
+2.5. Offboard Timers
+--------------------
  
  Several cards, both proprietary (watchdog boards) and commonplace (e1000) have
  timing chips built into the cards which may have registers which are accessible
@@ -307,9 +321,8 @@ general frowned upon as not playing by the agreed rules of the game.  Such a
  timer device would require additional support to be virtualized properly and is
  not considered important at this time as no known operating system does this.
  
-=========================================================================
-
-3) TSC Hardware
+3. TSC Hardware
+===============
  
  The TSC or time stamp counter is relatively simple in theory; it counts
  instruction cycles issued by the processor, which can be used as a measure of
@@ -340,7 +353,8 @@ allows the guest visible TSC to be offset by a constant.  Newer implementations
  promise to allow the TSC to additionally be scaled, but this hardware is not
  yet widely available.
  
-3.1) TSC synchronization
+3.1. TSC synchronization
+------------------------
  
  The TSC is a CPU-local clock in most implementations.  This means, on SMP
  platforms, the TSCs of different CPUs may start at different times depending
@@ -357,7 +371,8 @@ practice, getting a perfectly synchronized TSC will not be possible unless all
  values are read from the same clock, which generally only is possible on single
  socket systems or those with special hardware support.
  
-3.2) TSC and CPU hotplug
+3.2. TSC and CPU hotplug
+------------------------
  
  As touched on already, CPUs which arrive later than the boot time of the system
  may not have a TSC value that is synchronized with the rest of the system.
@@ -367,7 +382,8 @@ a guarantee.  This can have the effect of bringing a system from a state where
  TSC is synchronized back to a state where TSC synchronization flaws, however
  small, may be exposed to the OS and any virtualization environment.
  
-3.3) TSC and multi-socket / NUMA
+3.3. TSC and multi-socket / NUMA
+--------------------------------
  
  Multi-socket systems, especially large multi-socket systems are likely to have
  individual clocksources rather than a single, universally distributed clock.
@@ -385,7 +401,8 @@ standards for telecommunications and computer equipment.
  It is recommended not to trust the TSCs to remain synchronized on NUMA or
  multiple socket systems for these reasons.
  
-3.4) TSC and C-states
+3.4. TSC and C-states
+---------------------
  
  C-states, or idling states of the processor, especially C1E and deeper sleep
  states may be problematic for TSC as well.  The TSC may stop advancing in such
@@ -396,7 +413,8 @@ based on CPU and chipset identifications.
  The TSC in such a case may be corrected by catching it up to a known external
  clocksource.
  
-3.5) TSC frequency change / P-states
+3.5. TSC frequency change / P-states
+------------------------------------
  
  To make things slightly more interesting, some CPUs may change frequency.  They
  may or may not run the TSC at the same rate, and because the frequency change
@@ -416,14 +434,16 @@ other processors.  In such cases, the TSC on halted CPUs could advance faster
  than that of non-halted processors.  AMD Turion processors are known to have
  this problem.
  
-3.6) TSC and STPCLK / T-states
+3.6. TSC and STPCLK / T-states
+------------------------------
  
  External signals given to the processor may also have the effect of stopping
  the TSC.  This is typically done for thermal emergency power control to prevent
  an overheating condition, and typically, there is no way to detect that this
  condition has happened.
  
-3.7) TSC virtualization - VMX
+3.7. TSC virtualization - VMX
+-----------------------------
  
  VMX provides conditional trapping of RDTSC, RDMSR, WRMSR and RDTSCP
  instructions, which is enough for full virtualization of TSC in any manner.  In
@@ -431,14 +451,16 @@ addition, VMX allows passing through the host TSC plus an additional TSC_OFFSET
  field specified in the VMCS.  Special instructions must be used to read and
  write the VMCS field.
  
-3.8) TSC virtualization - SVM
+3.8. TSC virtualization - SVM
+-----------------------------
  
  SVM provides conditional trapping of RDTSC, RDMSR, WRMSR and RDTSCP
  instructions, which is enough for full virtualization of TSC in any manner.  In
  addition, SVM allows passing through the host TSC plus an additional offset
  field specified in the SVM control block.
  
-3.9) TSC feature bits in Linux
+3.9. TSC feature bits in Linux
+------------------------------
  
  In summary, there is no way to guarantee the TSC remains in perfect
  synchronization unless it is explicitly guaranteed by the architecture.  Even
@@ -448,13 +470,16 @@ despite being locally consistent.
  The following feature bits are used by Linux to signal various TSC attributes,
  but they can only be taken to be meaningful for UP or single node systems.
  
-X86_FEATURE_TSC                : The TSC is available in hardware
-X86_FEATURE_RDTSCP             : The RDTSCP instruction is available
-X86_FEATURE_CONSTANT_TSC       : The TSC rate is unchanged with P-states
-X86_FEATURE_NONSTOP_TSC                : The TSC does not stop in C-states
-X86_FEATURE_TSC_RELIABLE       : TSC sync checks are skipped (VMware)
+=========================      =======================================
+X86_FEATURE_TSC                The TSC is available in hardware
+X86_FEATURE_RDTSCP             The RDTSCP instruction is available
+X86_FEATURE_CONSTANT_TSC       The TSC rate is unchanged with P-states
+X86_FEATURE_NONSTOP_TSC        The TSC does not stop in C-states
+X86_FEATURE_TSC_RELIABLE       TSC sync checks are skipped (VMware)
+=========================      =======================================
  
-4) Virtualization Problems
+4. Virtualization Problems
+==========================
  
  Timekeeping is especially problematic for virtualization because a number of
  challenges arise.  The most obvious problem is that time is now shared between
@@ -473,7 +498,8 @@ BIOS, but not in such an extreme fashion.  However, the fact that SMM mode may
  cause similar problems to virtualization makes it a good justification for
  solving many of these problems on bare metal.
  
-4.1) Interrupt clocking
+4.1. Interrupt clocking
+-----------------------
  
  One of the most immediate problems that occurs with legacy operating systems
  is that the system timekeeping routines are often designed to keep track of
@@ -502,7 +528,8 @@ thus requires interrupt slewing to keep proper time.  It does use a low enough
  rate (ed: is it 18.2 Hz?) however that it has not yet been a problem in
  practice.
  
-4.2) TSC sampling and serialization
+4.2. TSC sampling and serialization
+-----------------------------------
  
  As the highest precision time source available, the cycle counter of the CPU
  has aroused much interest from developers.  As explained above, this timer has
@@ -524,7 +551,8 @@ it may be necessary for an implementation to guard against "backwards" reads of
  the TSC as seen from other CPUs, even in an otherwise perfectly synchronized
  system.
  
-4.3) Timespec aliasing
+4.3. Timespec aliasing
+----------------------
  
  Additionally, this lack of serialization from the TSC poses another challenge
  when using results of the TSC when measured against another time source.  As
@@ -548,7 +576,8 @@ This aliasing requires care in the computation and recalibration of kvmclock
  and any other values derived from TSC computation (such as TSC virtualization
  itself).
  
-4.4) Migration
+4.4. Migration
+--------------
  
  Migration of a virtual machine raises problems for timekeeping in two ways.
  First, the migration itself may take time, during which interrupts cannot be
@@ -566,7 +595,8 @@ always be caught up to the original rate.  KVM clock avoids these problems by
  simply storing multipliers and offsets against the TSC for the guest to convert
  back into nanosecond resolution values.
  
-4.5) Scheduling
+4.5. Scheduling
+---------------
  
  Since scheduling may be based on precise timing and firing of interrupts, the
  scheduling algorithms of an operating system may be adversely affected by
@@ -579,7 +609,8 @@ In an attempt to work around this, several implementations have provided a
  paravirtualized scheduler clock, which reveals the true amount of CPU time for
  which a virtual machine has been running.
  
-4.6) Watchdogs
+4.6. Watchdogs
+--------------
  
  Watchdog timers, such as the lock detector in Linux may fire accidentally when
  running under hardware virtualization due to timer interrupts being delayed or
@@ -587,7 +618,8 @@ misinterpretation of the passage of real time.  Usually, these warnings are
  spurious and can be ignored, but in some circumstances it may be necessary to
  disable such detection.
  
-4.7) Delays and precision timing
+4.7. Delays and precision timing
+--------------------------------
  
  Precise timing and delays may not be possible in a virtualized system.  This
  can happen if the system is controlling physical hardware, or issues delays to
@@ -600,7 +632,8 @@ The second issue may cause performance problems, but this is unlikely to be a
  significant issue.  In many cases these delays may be eliminated through
  configuration or paravirtualization.
  
-4.8) Covert channels and leaks
+4.8. Covert channels and leaks
+------------------------------
  
  In addition to the above problems, time information will inevitably leak to the
  guest about the host in anything but a perfect implementation of virtualized
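
Note on the feature bits table in section 3.9 of the timekeeping document patched above: on an x86 Linux host those bits are visible from user space as lowercase flag names in /proc/cpuinfo (tsc, rdtscp, constant_tsc, nonstop_tsc, tsc_reliable). The following is a minimal sketch, not part of the patch, of how one might check them. The helper names `tsc_features` and `report` are hypothetical and chosen only for illustration, and the sketch inspects only the first "flags" line it finds. As the document itself cautions, the result is only meaningful for UP or single node systems.

```python
# Map of /proc/cpuinfo flag names to the kernel feature bits listed in
# section 3.9.  The helper names below are illustrative, not a kernel API.
TSC_FLAGS = {
    "tsc": "X86_FEATURE_TSC",
    "rdtscp": "X86_FEATURE_RDTSCP",
    "constant_tsc": "X86_FEATURE_CONSTANT_TSC",
    "nonstop_tsc": "X86_FEATURE_NONSTOP_TSC",
    "tsc_reliable": "X86_FEATURE_TSC_RELIABLE",
}


def tsc_features(cpuinfo_text):
    """Return the TSC feature bits advertised by the first CPU entry."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            present = set(value.split())
            return [bit for flag, bit in TSC_FLAGS.items() if flag in present]
    return []  # no "flags" line found (for example, a non-x86 host)


def report(path="/proc/cpuinfo"):
    """Print a yes/no line for each TSC feature bit of the running host."""
    with open(path) as f:
        found = tsc_features(f.read())
    for bit in TSC_FLAGS.values():
        print(f"{bit:26} {'yes' if bit in found else 'no'}")
```

Calling `report()` on an x86 Linux machine prints one line per feature bit. A host that lacks constant_tsc or nonstop_tsc is one where the P-state and C-state caveats from sections 3.4 and 3.5 are likely to apply.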
diff --git a/Documentation/virt/uml/UserModeLinux-HOWTO.txt b/Documentation/virt/uml/user_mode_linux.rst
similarity index 74%
rename from Documentation/virt/uml/UserModeLinux-HOWTO.txt
rename to Documentation/virt/uml/user_mode_linux.rst
index 87b80f589e1c0163c68365b4a67d623c3563dbc9..de0f0b2c9d5bd5fac524941b8c433f0dc2a2efec 100644
--- a/Documentation/virt/uml/UserModeLinux-HOWTO.txt
+++ b/Documentation/virt/uml/user_mode_linux.rst
@@ -1,12 +1,17 @@
-  User Mode Linux HOWTO
-  User Mode Linux Core Team
-  Mon Nov 18 14:16:16 EST 2002
+.. SPDX-License-Identifier: GPL-2.0
  
-  This document describes the use and abuse of Jeff Dike's User Mode
-  Linux: a port of the Linux kernel as a normal Intel Linux process.
-  ______________________________________________________________________
+=====================
+User Mode Linux HOWTO
+=====================
  
-  Table of Contents
+:Author:  User Mode Linux Core Team
+:Last-updated: Sat Jan 25 16:07:55 CET 2020
+
+This document describes the use and abuse of Jeff Dike's User Mode
+Linux: a port of the Linux kernel as a normal Intel Linux process.
+
+
+.. Table of Contents
  
    1. Introduction
  
@@ -132,19 +137,19 @@
       15.5 Other contributions
  
  
-  ______________________________________________________________________
-
-  1.  Introduction
+1.  Introduction
+================
  
    Welcome to User Mode Linux.  It's going to be fun.
  
  
  
-  1.1.  How is User Mode Linux Different?
+1.1.  How is User Mode Linux Different?
+---------------------------------------
  
    Normally, the Linux Kernel talks straight to your hardware (video
    card, keyboard, hard drives, etc), and any programs which run ask the
-  kernel to operate the hardware, like so:
+  kernel to operate the hardware, like so::
  
  
  
@@ -160,10 +165,10 @@
  
  
    The User Mode Linux Kernel is different; instead of talking to the
-  hardware, it talks to a `real' Linux kernel (called the `host kernel'
+  hardware, it talks to a `real` Linux kernel (called the `host kernel`
    from now on), like any other program.  Programs can then run inside
    User-Mode Linux as if they were running under a normal kernel, like
-  so:
+  so::
  
  
  
@@ -181,7 +186,8 @@
  
  
  
-  1.2.  Why Would I Want User Mode Linux?
+1.2.  Why Would I Want User Mode Linux?
+---------------------------------------
  
  
    1. If User Mode Linux crashes, your host kernel is still fine.
@@ -204,83 +210,41 @@
  
  
  
+.. _Compiling_the_kernel_and_modules:
  
-
-  2.  Compiling the kernel and modules
+2.  Compiling the kernel and modules
+====================================
  
  
  
  
-  2.1.  Compiling the kernel
+2.1.  Compiling the kernel
+--------------------------
  
  
    Compiling the user mode kernel is just like compiling any other
-  kernel.  Let's go through the steps, using 2.4.0-prerelease (current
-  as of this writing) as an example:
-
-
-  1. Download the latest UML patch from
-
-     the download page <http://user-mode-linux.sourceforge.net/
-
-     In this example, the file is uml-patch-2.4.0-prerelease.bz2.
+  kernel.
  
  
-  2. Download the matching kernel from your favourite kernel mirror,
+  1. Download the latest kernel from your favourite kernel mirror,
       such as:
  
-     ftp://ftp.ca.kernel.org/pub/kernel/v2.4/linux-2.4.0-prerelease.tar.bz2
-     <ftp://ftp.ca.kernel.org/pub/kernel/v2.4/linux-2.4.0-prerelease.tar.bz2>
-     .
-
-
-  3. Make a directory and unpack the kernel into it.
-
+     https://mirrors.edge.kernel.org/pub/linux/kernel/v5.x/linux-5.4.14.tar.xz
  
+  2. Make a directory and unpack the kernel into it::
  
         host%
         mkdir ~/uml
  
-
-
-
-
-
         host%
         cd ~/uml
  
-
-
-
-
-
-       host%
-       tar -xzvf linux-2.4.0-prerelease.tar.bz2
-
-
-
-
-
-
-  4. Apply the patch using
-
-
-
-       host%
-       cd ~/uml/linux
-
-
-
         host%
-       bzcat uml-patch-2.4.0-prerelease.bz2 | patch -p1
+       tar xvf linux-5.4.14.tar.xz
  
  
-
-
-
-
-  5. Run your favorite config; `make xconfig ARCH=um' is the most
-     convenient.  `make config ARCH=um' and 'make menuconfig ARCH=um'
+  3. Run your favorite config; ``make xconfig ARCH=um`` is the most
+     convenient.  ``make config ARCH=um`` and ``make menuconfig ARCH=um``
       will work as well.  The defaults will give you a useful kernel.  If
       you want to change something, go ahead, it probably won't hurt
       anything.
@@ -288,44 +252,20 @@
  
       Note:  If the host is configured with a 2G/2G address space split
       rather than the usual 3G/1G split, then the packaged UML binaries
-     will not run.  They will immediately segfault.  See ``UML on 2G/2G
-     hosts''  for the scoop on running UML on your system.
-
-
-
-  6. Finish with `make linux ARCH=um': the result is a file called
-     `linux' in the top directory of your source tree.
-
-  Make sure that you don't build this kernel in /usr/src/linux.  On some
-  distributions, /usr/include/asm is a link into this pool.  The user-
-  mode build changes the other end of that link, and things that include
-  <asm/anything.h> stop compiling.
-
-  The sources are also available from cvs at the project's cvs page,
-  which has directions on getting the sources. You can also browse the
-  CVS pool from there.
+     will not run.  They will immediately segfault.  See
+     :ref:`UML_on_2G/2G_hosts`  for the scoop on running UML on your system.
  
-  If you get the CVS sources, you will have to check them out into an
-  empty directory. You will then have to copy each file into the
-  corresponding directory in the appropriate kernel pool.
  
-  If you don't have the latest kernel pool, you can get the
-  corresponding user-mode sources with
  
+  4. Finish with ``make linux ARCH=um``: the result is a file called
+     ``linux`` in the top directory of your source tree.
  
-       host% cvs co -r v_2_3_x linux
  
-
-
-
-  where 'x' is the version in your pool. Note that you will not get the
-  bug fixes and enhancements that have gone into subsequent releases.
-
-
-  2.2.  Compiling and installing kernel modules
+2.2.  Compiling and installing kernel modules
+---------------------------------------------
  
    UML modules are built in the same way as the native kernel (with the
-  exception of the 'ARCH=um' that you always need for UML):
+  exception of the 'ARCH=um' that you always need for UML)::
  
  
         host% make modules ARCH=um
@@ -337,12 +277,12 @@
    the user-mode pool.  Modules from the native kernel won't work.
  
    You can install them by using ftp or something to copy them into the
-  virtual machine and dropping them into /lib/modules/`uname -r`.
+  virtual machine and dropping them into ``/lib/modules/$(uname -r)``.
  
    You can also get the kernel build process to install them as follows:
  
    1. with the kernel not booted, mount the root filesystem in the top
-     level of the kernel pool:
+     level of the kernel pool::
  
  
         host% mount root_fs mnt -o loop
@@ -352,7 +292,7 @@
  
  
  
-  2. run
+  2. run::
  
  
         host%
@@ -363,7 +303,7 @@
  
  
  
-  3. unmount the filesystem
+  3. unmount the filesystem::
  
  
         host% umount mnt
@@ -381,27 +321,28 @@
    as modules, especially filesystems and network protocols and filters,
    so most symbols which need to be exported probably already are.
    However, if you do find symbols that need exporting, let  us
-  <http://user-mode-linux.sourceforge.net/>  know, and
+  know at http://user-mode-linux.sourceforge.net/, and
    they'll be "taken care of".
  
  
  
-  2.3.  Compiling and installing uml_utilities
+2.3.  Compiling and installing uml_utilities
+--------------------------------------------
  
    Many features of the UML kernel require a user-space helper program,
    so a uml_utilities package is distributed separately from the kernel
    patch which provides these helpers. Included within this is:
  
-  o  port-helper - Used by consoles which connect to xterms or ports
+  -  port-helper - Used by consoles which connect to xterms or ports
  
-  o  tunctl - Configuration tool to create and delete tap devices
+  -  tunctl - Configuration tool to create and delete tap devices
  
-  o  uml_net - Setuid binary for automatic tap device configuration
+  -  uml_net - Setuid binary for automatic tap device configuration
  
-  o  uml_switch - User-space virtual switch required for daemon
+  -  uml_switch - User-space virtual switch required for daemon
       transport
  
-     The uml_utilities tree is compiled with:
+     The uml_utilities tree is compiled with::
  
  
         host#
@@ -423,38 +364,42 @@
  
  
  
-  3.  Running UML and logging in
+3.  Running UML and logging in
+==============================
  
  
  
-  3.1.  Running UML
+3.1.  Running UML
+-----------------
  
-  It runs on 2.2.15 or later, and all 2.4 kernels.
+  It runs on 2.2.15 or later, and all kernel versions since 2.4.
  
  
    Booting UML is straightforward.  Simply run 'linux': it will try to
-  mount the file `root_fs' in the current directory.  You do not need to
-  run it as root.  If your root filesystem is not named `root_fs', then
-  you need to put a `ubd0=root_fs_whatever' switch on the linux command
+  mount the file ``root_fs`` in the current directory.  You do not need to
+  run it as root.  If your root filesystem is not named ``root_fs``, then
+  you need to put a ``ubd0=root_fs_whatever`` switch on the linux command
    line.
  
  
    You will need a filesystem to boot UML from.  There are a number
-  available for download from  here  <http://user-mode-
-  linux.sourceforge.net/> .  There are also  several tools
-  <http://user-mode-linux.sourceforge.net/>  which can be
+  available for download from http://user-mode-linux.sourceforge.net.
+  There are also  several tools at
+  http://user-mode-linux.sourceforge.net/  which can be
    used to generate UML-compatible filesystem images from media.
    The kernel will boot up and present you with a login prompt.
  
  
-  Note:  If the host is configured with a 2G/2G address space split
+Note:
+  If the host is configured with a 2G/2G address space split
    rather than the usual 3G/1G split, then the packaged UML binaries will
-  not run.  They will immediately segfault.  See ``UML on 2G/2G hosts''
+  not run.  They will immediately segfault.  See :ref:`UML_on_2G/2G_hosts`
    for the scoop on running UML on your system.
  
  
  
-  3.2.  Logging in
+3.2.  Logging in
+----------------
  
  
  
@@ -468,22 +413,22 @@
  
    There are a couple of other ways to log in:
  
-  o  On a virtual console
+  -  On a virtual console
  
  
  
       Each virtual console that is configured (i.e. the device exists in
       /dev and /etc/inittab runs a getty on it) will come up in its own
-     xterm.  If you get tired of the xterms, read ``Setting up serial
-     lines and consoles''  to see how to attach the consoles to
-     something else, like host ptys.
+     xterm.  If you get tired of the xterms, read
+     :ref:`setting_up_serial_lines_and_consoles` to see how to attach
+     the consoles to something else, like host ptys.
  
  
  
-  o  Over the serial line
+  -  Over the serial line
  
  
-     In the boot output, find a line that looks like:
+     In the boot output, find a line that looks like::
  
  
  
@@ -493,7 +438,7 @@
  
  
    Attach your favorite terminal program to the corresponding tty.  I.e.
-  for minicom, the command would be
+  for minicom, the command would be::
  
  
         host% minicom -o -p /dev/ttyp1
@@ -503,37 +448,40 @@
  
  
  
-  o  Over the net
+  -  Over the net
  
  
       If the network is running, then you can telnet to the virtual
-     machine and log in to it.  See ``Setting up the network''  to learn
+     machine and log in to it.  See :ref:`Setting_up_the_network`  to learn
       about setting up a virtual network.
  
    When you're done using it, run halt, and the kernel will bring itself
    down and the process will exit.
  
  
-  3.3.  Examples
+3.3.  Examples
+--------------
  
    Here are some examples of UML in action:
  
-  o  A login session <http://user-mode-linux.sourceforge.net/login.html>
+  -  A login session http://user-mode-linux.sourceforge.net/old/login.html
  
-  o  A virtual network <http://user-mode-linux.sourceforge.net/net.html>
+  -  A virtual network http://user-mode-linux.sourceforge.net/old/net.html
  
  
  
  
  
+.. _UML_on_2G/2G_hosts:
  
+4.  UML on 2G/2G hosts
+======================
  
-  4.  UML on 2G/2G hosts
  
  
  
-
-  4.1.  Introduction
+4.1.  Introduction
+------------------
  
  
    Most Linux machines are configured so that the kernel occupies the
@@ -546,7 +494,8 @@
  
  
  
-  4.2.  The problem
+4.2.  The problem
+-----------------
  
  
    The prebuilt UML binaries on this site will not run on 2G/2G hosts
@@ -558,13 +507,14 @@
  
  
  
-  4.3.  The solution
+4.3.  The solution
+------------------
  
  
    The fix for this is to rebuild UML from source after enabling
    CONFIG_HOST_2G_2G (under 'General Setup').  This will cause UML to
    load itself in the top .5G of that smaller process address space,
-  where it will run fine.  See ``Compiling the kernel and modules''  if
+  where it will run fine.  See :ref:`Compiling_the_kernel_and_modules`  if
    you need help building UML from source.
  
  
@@ -573,10 +523,11 @@
  
  
  
+.. _setting_up_serial_lines_and_consoles:
  
  
-
-  5.  Setting up serial lines and consoles
+5.  Setting up serial lines and consoles
+========================================
  
  
    It is possible to attach UML serial lines and consoles to many types
@@ -584,22 +535,23 @@
  
  
    You can attach them to host ptys, ttys, file descriptors, and ports.
-  This allows you to do things like
+  This allows you to do things like:
  
-  o  have a UML console appear on an unused host console,
+  -  have a UML console appear on an unused host console,
  
-  o  hook two virtual machines together by having one attach to a pty
+  -  hook two virtual machines together by having one attach to a pty
       and having the other attach to the corresponding tty
  
-  o  make a virtual machine accessible from the net by attaching a
+  -  make a virtual machine accessible from the net by attaching a
       console to a port on the host.
  
  
-  The general format of the command line option is device=channel.
+  The general format of the command line option is ``device=channel``.
  
  
  
-  5.1.  Specifying the device
+5.1.  Specifying the device
+---------------------------
  
    Devices are specified with "con" or "ssl" (console or serial line,
    respectively), optionally with a device number if you are talking
@@ -613,7 +565,7 @@
  
    A specific device name will override a less general "con=" or "ssl=".
    So, for example, you can assign a pty to each of the serial lines
-  except for the first two like this:
+  except for the first two like this::
  
  
          ssl=pty ssl0=tty:/dev/tty0 ssl1=tty:/dev/tty1
@@ -626,13 +578,14 @@
  
  
  
-  5.2.  Specifying the channel
+5.2.  Specifying the channel
+----------------------------
  
    There are a number of different types of channels to attach a UML
    device to, each with a different way of specifying exactly what to
    attach to.
  
-  o  pseudo-terminals - device=pty pts terminals - device=pts
+  -  pseudo-terminals - device=pty pts terminals - device=pts
  
  
       This will cause UML to allocate a free host pseudo-terminal for the
@@ -640,23 +593,23 @@
       log.  You access it by attaching a terminal program to the
       corresponding tty:
  
-  o  screen /dev/pts/n
+  -  screen /dev/pts/n
  
-  o  screen /dev/ttyxx
+  -  screen /dev/ttyxx
  
-  o  minicom -o -p /dev/ttyxx - minicom seems not able to handle pts
+  -  minicom -o -p /dev/ttyxx - minicom seems not able to handle pts
       devices
  
-  o  kermit - start it up, 'open' the device, then 'connect'
+  -  kermit - start it up, 'open' the device, then 'connect'
  
  
  
  
  
-  o  terminals - device=tty:tty device file
+  -  terminals - device=tty:tty device file
  
  
-     This will make UML attach the device to the specified tty (i.e
+     This will make UML attach the device to the specified tty (i.e::
  
  
          con1=tty:/dev/tty3
@@ -672,7 +625,7 @@
  
  
  
-  o  xterms - device=xterm
+  -  xterms - device=xterm
  
  
       UML will run an xterm and the device will be attached to it.
@@ -681,12 +634,12 @@
  
  
  
-  o  Port - device=port:port number
+  -  Port - device=port:port number
  
  
       This will attach the UML devices to the specified host port.
       Attaching console 1 to the host's port 9000 would be done like
-     this:
+     this::
  
  
          con1=port:9000
@@ -694,7 +647,7 @@
  
  
  
-  Attaching all the serial lines to that port would be done similarly:
+  Attaching all the serial lines to that port would be done similarly::
  
  
          ssl=port:9000
@@ -702,8 +655,8 @@
  
  
  
-  You access these devices by telnetting to that port.  Each active tel-
-  net session gets a different device.  If there are more telnets to a
+  You access these devices by telnetting to that port.  Each active
+  telnet session gets a different device.  If there are more telnets to a
    port than UML devices attached to it, then the extra telnet sessions
    will block until an existing telnet detaches, or until another device
    becomes active (i.e. by being activated in /etc/inittab).
@@ -725,13 +678,13 @@
  
  
  
-  o  already-existing file descriptors - device=file descriptor
+  -  already-existing file descriptors - device=file descriptor
  
  
       If you set up a file descriptor on the UML command line, you can
       attach a UML device to it.  This is most commonly used to put the
       main console back on stdin and stdout after assigning all the other
-     consoles to something else:
+     consoles to something else::
  
  
          con0=fd:0,fd:1 con=pts
@@ -743,7 +696,7 @@
  
  
  
-  o  Nothing - device=null
+  -  Nothing - device=null
  
  
       This allows the device to be opened, in contrast to 'none', but
@@ -754,7 +707,7 @@
  
  
  
-  o  None - device=none
+  -  None - device=none
  
  
       This causes the device to disappear.
@@ -762,7 +715,7 @@
  
  
    You can also specify different input and output channels for a device
-  by putting a comma between them:
+  by putting a comma between them::
  
  
          ssl3=tty:/dev/tty2,xterm
@@ -785,14 +738,15 @@
  
  
  
-  5.3.  Examples
+5.3.  Examples
+--------------
  
    There are a number of interesting things you can do with this
    capability.
  
  
    First, this is how you get rid of those bleeding console xterms by
-  attaching them to host ptys:
+  attaching them to host ptys::
  
  
          con=pty con0=fd:0,fd:1
@@ -802,7 +756,7 @@
  
    This will make a UML console take over an unused host virtual console,
    so that when you switch to it, you will see the UML login prompt
-  rather than the host login prompt:
+  rather than the host login prompt::
  
  
          con1=tty:/dev/tty6
@@ -813,7 +767,7 @@
    You can attach two virtual machines together with what amounts to a
    serial line as follows:
  
-  Run one UML with a serial line attached to a pty -
+  Run one UML with a serial line attached to a pty::
  
  
          ssl1=pty
@@ -825,7 +779,7 @@
    that it got /dev/ptyp1).
  
    Boot the other UML with a serial line attached to the corresponding
-  tty -
+  tty::
  
  
          ssl1=tty:/dev/ttyp1
@@ -838,7 +792,10 @@
    prompt of the other virtual machine.
  
  
-  6.  Setting up the network
+.. _setting_up_the_network:
+
+6.  Setting up the network
+==========================
  
  
  
@@ -858,19 +815,19 @@
    There are currently five transport types available for a UML virtual
    machine to exchange packets with other hosts:
  
-  o  ethertap
+  -  ethertap
  
-  o  TUN/TAP
+  -  TUN/TAP
  
-  o  Multicast
+  -  Multicast
  
-  o  a switch daemon
+  -  a switch daemon
  
-  o  slip
+  -  slip
  
-  o  slirp
+  -  slirp
  
-  o  pcap
+  -  pcap
  
       The TUN/TAP, ethertap, slip, and slirp transports allow a UML
       instance to exchange packets with the host.  They may be directed
@@ -893,28 +850,28 @@
    With so many host transports, which one should you use?  Here's when
    you should use each one:
  
-  o  ethertap - if you want access to the host networking and it is
+  -  ethertap - if you want access to the host networking and it is
       running 2.2
  
-  o  TUN/TAP - if you want access to the host networking and it is
+  -  TUN/TAP - if you want access to the host networking and it is
       running 2.4.  Also, the TUN/TAP transport is able to use a
       preconfigured device, allowing it to avoid using the setuid uml_net
       helper, which is a security advantage.
  
-  o  Multicast - if you want a purely virtual network and you don't want
+  -  Multicast - if you want a purely virtual network and you don't want
       to set up anything but the UML
  
-  o  a switch daemon - if you want a purely virtual network and you
+  -  a switch daemon - if you want a purely virtual network and you
       don't mind running the daemon in order to get somewhat better
       performance
  
-  o  slip - there is no particular reason to run the slip backend unless
+  -  slip - there is no particular reason to run the slip backend unless
       ethertap and TUN/TAP are just not available for some reason
  
-  o  slirp - if you don't have root access on the host to setup
+  -  slirp - if you don't have root access on the host to setup
       networking, or if you don't want to allocate an IP to your UML
  
-  o  pcap - not much use for actual network connectivity, but great for
+  -  pcap - not much use for actual network connectivity, but great for
       monitoring traffic on the host
  
       Ethertap is available on 2.4 and works fine.  TUN/TAP is preferred
@@ -926,7 +883,8 @@
       exploit the helper's root privileges.
  
  
-  6.1.  General setup
+6.1.  General setup
+-------------------
  
    First, you must have the virtual network enabled in your UML.  If are
    running a prebuilt kernel from this site, everything is already
@@ -938,7 +896,7 @@
    The next step is to provide a network device to the virtual machine.
    This is done by describing it on the kernel command line.
  
-  The general format is
+  The general format is::
  
  
         eth <n> = <transport> , <transport args>
@@ -947,7 +905,7 @@
  
  
    For example, a virtual ethernet device may be attached to a host
-  ethertap device as follows:
+  ethertap device as follows::
  
  
         eth0=ethertap,tap0,fe:fd:0:0:0:1,192.168.0.254
@@ -978,7 +936,7 @@
  
  
    You can also add devices to a UML and remove them at runtime.  See the
-  ``The Management Console''  page for details.
+  :ref:`The_Management_Console`  page for details.
  
  
    The sections below describe this in more detail.
@@ -995,7 +953,8 @@
  
  
  
-  6.2.  Userspace daemons
+6.2.  Userspace daemons
+-----------------------
  
    You will likely need the setuid helper, or the switch daemon, or both.
    They are both installed with the RPM and deb, so if you've installed
@@ -1011,7 +970,8 @@
  
  
  
-  6.3.  Specifying ethernet addresses
+6.3.  Specifying ethernet addresses
+-----------------------------------
  
    Below, you will see that the TUN/TAP, ethertap, and daemon interfaces
    allow you to specify hardware addresses for the virtual ethernet
@@ -1023,21 +983,21 @@
    sufficient to guarantee a unique hardware address for the device.  A
    couple of exceptions are:
  
-  o  Another set of virtual ethernet devices are on the same network and
+  -  Another set of virtual ethernet devices are on the same network and
       they are assigned hardware addresses using a different scheme which
       may conflict with the UML IP address-based scheme
  
-  o  You aren't going to use the device for IP networking, so you don't
+  -  You aren't going to use the device for IP networking, so you don't
       assign the device an IP address
  
       If you let the driver provide the hardware address, you should make
       sure that the device IP address is known before the interface is
-     brought up.  So, inside UML, this will guarantee that:
+     brought up.  So, inside UML, this will guarantee that::
  
  
  
-  UML#
-  ifconfig eth0 192.168.0.250 up
+         UML#
+         ifconfig eth0 192.168.0.250 up
  
  
  
@@ -1049,13 +1009,14 @@
  
  
  
-  6.4.  UML interface setup
+6.4.  UML interface setup
+-------------------------
  
    Once the network devices have been described on the command line, you
    should boot UML and log in.
  
  
-  The first thing to do is bring the interface up:
+  The first thing to do is bring the interface up::
  
  
         UML# ifconfig ethn ip-address up
@@ -1067,7 +1028,7 @@
  
  
    To reach the rest of the world, you should set a default route to the
-  host:
+  host::
  
  
         UML# route add default gw host ip
@@ -1075,7 +1036,7 @@
  
  
  
-  Again, with host ip of 192.168.0.4:
+  Again, with host ip of 192.168.0.4::
  
  
         UML# route add default gw 192.168.0.4
@@ -1097,29 +1058,25 @@
    Note: If you can't communicate with other hosts on your physical
    ethernet, it's probably because of a network route that's
    automatically set up.  If you run 'route -n' and see a route that
-  looks like this:
+  looks like this::
  
  
  
  
-  Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
-  192.168.0.0     0.0.0.0         255.255.255.0   U     0      0      0   eth0
+    Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
+    192.168.0.0     0.0.0.0         255.255.255.0   U     0      0      0   eth0
  
  
  
  
    with a mask that's not 255.255.255.255, then replace it with a route
-  to your host:
+  to your host::
  
  
         UML#
         route del -net 192.168.0.0 dev eth0 netmask 255.255.255.0
  
  
-
-
-
-
         UML#
         route add -host 192.168.0.4 dev eth0
  
@@ -1131,7 +1088,8 @@
  
  
  
-  6.5.  Multicast
+6.5.  Multicast
+---------------
  
    The simplest way to set up a virtual network between multiple UMLs is
    to use the mcast transport.  This was written by Harald Welte and is
@@ -1142,7 +1100,7 @@
    messages when you bring the device up inside UML.
  
  
-  To use it, run two UMLs with
+  To use it, run two UMLs with::
  
  
          eth0=mcast
@@ -1151,16 +1109,12 @@
  
  
    on their command lines.  Log in, configure the ethernet device in each
-  machine with different IP addresses:
+  machine with different IP addresses::
  
  
         UML1# ifconfig eth0 192.168.0.254
  
  
-
-
-
-
         UML2# ifconfig eth0 192.168.0.253
  
  
@@ -1168,7 +1122,7 @@
  
    and they should be able to talk to each other.
  
-  The full set of command line options for this transport are
+  The full set of command line options for this transport is::
  
  
  
@@ -1177,16 +1131,11 @@
  
  
  
-
-  Harald's original README is here <http://user-mode-linux.source-
-  forge.net/>  and explains these in detail, as well as
-  some other issues.
-
    There is also a related point-to-point only "ucast" transport.
    This is useful when your network does not support multicast, and
    all network connections are simple point to point links.
  
-  The full set of command line options for this transport are
+  The full set of command line options for this transport is::
  
  
         ethn=ucast,ethernet address,remote address,listen port,remote port
@@ -1194,7 +1143,8 @@
  
  
  
-  6.6.  TUN/TAP with the uml_net helper
+6.6.  TUN/TAP with the uml_net helper
+-------------------------------------
  
    TUN/TAP is the preferred mechanism on 2.4 to exchange packets with the
    host.  The TUN/TAP backend has been in UML since 2.4.9-3um.
@@ -1216,7 +1166,7 @@
    kernel or as the tun.o module.
  
    The format of the command line switch to attach a device to a TUN/TAP
-  device is
+  device is::
  
  
         eth <n> =tuntap,,, <IP address>
@@ -1226,7 +1176,7 @@
  
    For example, this argument will attach the UML's eth0 to the next
    available tap device and assign an ethernet address to it based on its
-  IP address
+  IP address::
  
  
         eth0=tuntap,,,192.168.0.254
@@ -1247,10 +1197,10 @@
    There are a couple potential problems with running the TUN/TAP
    transport on a 2.4 host kernel
  
-  o  TUN/TAP seems not to work on 2.4.3 and earlier.  Upgrade the host
+  -  TUN/TAP seems not to work on 2.4.3 and earlier.  Upgrade the host
       kernel or use the ethertap transport.
  
-  o  With an upgraded kernel, TUN/TAP may fail with
+  -  With an upgraded kernel, TUN/TAP may fail with::
  
  
         File descriptor in bad state
@@ -1263,13 +1213,12 @@
    make sure that /usr/src/linux points to the headers for the running
    kernel.
  
-  These were pointed out by Tim Robinson <timro at trkr dot net> in
-  <http://www.geocrawler.com/> name="this uml-
-  user post"> .
+  These were pointed out by Tim Robinson <timro at trkr dot net> in the past.
  
  
  
-  6.7.  TUN/TAP with a preconfigured tap device
+6.7.  TUN/TAP with a preconfigured tap device
+---------------------------------------------
  
    If you prefer not to have UML use uml_net (which is somewhat
    insecure), with UML 2.4.17-11, you can set up a TUN/TAP device
@@ -1277,8 +1226,8 @@
    there is no need for root assistance.  Setting up the device is done
    as follows:
  
-  o  Create the device with tunctl (available from the UML utilities
-     tarball)
+  -  Create the device with tunctl (available from the UML utilities
+     tarball)::
  
  
  
@@ -1291,8 +1240,8 @@
    where uid is the user id or username that UML will be run as.  This
    will tell you what device was created.
  
-  o  Configure the device IP (change IP addresses and device name to
-     suit)
+  -  Configure the device IP (change IP addresses and device name to
+     suit)::
  
  
  
@@ -1303,8 +1252,8 @@
  
  
  
-  o  Set up routing and arping if desired - this is my recipe, there are
-     other ways of doing the same thing
+  -  Set up routing and arping if desired - this is my recipe, there are
+     other ways of doing the same thing::
  
  
         host#
@@ -1313,19 +1262,9 @@
         host#
         route add -host 192.168.0.253 dev tap0
  
-
-
-
-
-
         host#
         bash -c 'echo 1 > /proc/sys/net/ipv4/conf/tap0/proxy_arp'
  
-
-
-
-
-
         host#
         arp -Ds 192.168.0.253 eth0 pub
  
@@ -1338,76 +1277,43 @@
    utility which reads the information from a config file and sets up
    devices at boot time.
  
-  o  Rather than using up two IPs and ARPing for one of them, you can
+  -  Rather than using up two IPs and ARPing for one of them, you can
       also provide direct access to your LAN by the UML by using a
-     bridge.
+     bridge::
  
  
         host#
         brctl addbr br0
  
  
-
-
-
-
         host#
         ifconfig eth0 0.0.0.0 promisc up
  
  
-
-
-
-
         host#
         ifconfig tap0 0.0.0.0 promisc up
  
  
-
-
-
-
         host#
         ifconfig br0 192.168.0.1 netmask 255.255.255.0 up
  
  
-
-
-
-
-
-  host#
-  brctl stp br0 off
-
-
-
-
+       host#
+       brctl stp br0 off
  
  
         host#
         brctl setfd br0 1
  
  
-
-
-
-
         host#
         brctl sethello br0 1
  
  
-
-
-
-
         host#
         brctl addif br0 eth0
  
  
-
-
-
-
         host#
         brctl addif br0 tap0
  
@@ -1417,12 +1323,12 @@
    Note that 'br0' should be setup using ifconfig with the existing IP
    address of eth0, as eth0 no longer has its own IP.
  
-  o
+  -
  
  
       Also, the /dev/net/tun device must be writable by the user running
       UML in order for the UML to use the device that's been configured
-     for it.  The simplest thing to do is
+     for it.  The simplest thing to do is::
  
  
         host#  chmod 666 /dev/net/tun
@@ -1438,14 +1344,14 @@
    devices and chgrp /dev/net/tun to that group with mode 664 or 660.
  
  
-  o  Once the device is set up, run UML with 'eth0=tuntap,device name'
+  -  Once the device is set up, run UML with 'eth0=tuntap,device name'
       (i.e. 'eth0=tuntap,tap0') on the command line (or do it with the
       mconsole config command).
  
-  o  Bring the eth device up in UML and you're in business.
+  -  Bring the eth device up in UML and you're in business.
  
       If you don't want that tap device any more, you can make it non-
-     persistent with
+     persistent with::
  
  
         host#  tunctl -d tap device
@@ -1455,7 +1361,7 @@
  
    Finally, tunctl has a -b (for brief mode) switch which causes it to
    output only the name of the tap device it created.  This makes it
-  suitable for capture by a script:
+  suitable for capture by a script::
  
  
         host#  TAP=`tunctl -u 1000 -b`
@@ -1465,7 +1371,8 @@
  
  
  
-  6.8.  Ethertap
+6.8.  Ethertap
+--------------
  
    Ethertap is the general mechanism on 2.2 for userspace processes to
    exchange packets with the kernel.
@@ -1473,7 +1380,7 @@
  
  
    To use this transport, you need to describe the virtual network device
-  on the UML command line.  The general format for this is
+  on the UML command line.  The general format for this is::
  
  
         eth <n> =ethertap, <device> , <ethernet address> , <tap IP address>
@@ -1481,7 +1388,7 @@
  
  
  
-  So, the previous example
+  So, the previous example::
  
  
         eth0=ethertap,tap0,fe:fd:0:0:0:1,192.168.0.254
@@ -1521,7 +1428,7 @@
  
    If you want to set things up yourself, you need to make sure that the
    appropriate /dev entry exists.  If it doesn't, become root and create
-  it as follows:
+  it as follows::
  
  
         mknod /dev/tap <minor>  c 36  <minor>  + 16
@@ -1529,7 +1436,7 @@
  
  
  
-  For example, this is how to create /dev/tap0:
+  For example, this is how to create /dev/tap0::
  
  
         mknod /dev/tap0 c 36 0 + 16
@@ -1539,7 +1446,7 @@
  
    You also need to make sure that the host kernel has ethertap support.
    If ethertap is enabled as a module, you apparently need to insmod
-  ethertap once for each ethertap device you want to enable.  So,
+  ethertap once for each ethertap device you want to enable.  So::
  
  
         host#
@@ -1549,7 +1456,7 @@
  
  
    will give you the tap0 interface.  To get the tap1 interface, you need
-  to run
+  to run::
  
  
         host#
@@ -1561,7 +1468,8 @@
  
  
  
-  6.9.  The switch daemon
+6.9.  The switch daemon
+-----------------------
  
    Note: This is the daemon formerly known as uml_router, but which was
    renamed so the network weenies of the world would stop growling at me.
@@ -1577,7 +1485,7 @@
    sockets.
  
  
-  If you want it to listen on a different pair of sockets, use
+  If you want it to listen on a different pair of sockets, use::
  
  
          -unix control socket data socket
@@ -1586,7 +1494,7 @@
  
  
  
-  If you want it to act as a hub rather than a switch, use
+  If you want it to act as a hub rather than a switch, use::
  
  
          -hub
@@ -1596,7 +1504,7 @@
  
  
    If you want the switch to be connected to host networking (allowing
-  the umls to get access to the outside world through the host), use
+  the umls to get access to the outside world through the host), use::
  
  
          -tap tap0
@@ -1610,7 +1518,7 @@
    device than tap0, specify that instead of tap0.
  
  
-  uml_switch can be backgrounded as follows
+  uml_switch can be backgrounded as follows::
  
  
         host%
@@ -1623,7 +1531,7 @@
    stdin for EOF.  When it sees that, it exits.
  
  
-  The general format of the kernel command line switch is
+  The general format of the kernel command line switch is::
  
  
  
@@ -1639,7 +1547,8 @@
    how to communicate with the daemon.  You should only specify them if
    you told the daemon to use different sockets than the default.  So, if
    you ran the daemon with no arguments, running the UML on the same
-  machine with
+  machine with::
+
         eth0=daemon
  
  
@@ -1649,7 +1558,8 @@
  
  
  
-  6.10.  Slip
+6.10.  Slip
+-----------
  
    Slip is another, less general, mechanism for a process to communicate
    with the host networking.  In contrast to the ethertap interface,
@@ -1658,7 +1568,7 @@
    IP.
  
  
-  The general format of the command line switch is
+  The general format of the command line switch is::
  
  
  
@@ -1681,7 +1591,8 @@
  
  
  
-  6.11.  Slirp
+6.11.  Slirp
+------------
  
    slirp uses an external program, usually /usr/bin/slirp, to provide IP
    only networking connectivity through the host. This is similar to IP
@@ -1691,7 +1602,7 @@
    root access or setuid binaries on the host.
  
  
-  The general format of the command line switch for slirp is:
+  The general format of the command line switch for slirp is::
  
  
  
@@ -1716,7 +1627,7 @@
    The eth0 interface on UML should be set up with the IP 10.2.0.15,
    although you can use anything as long as it is not used by a network
    you will be connecting to. The default route on UML should be set to
-  use
+  use::
  
  
         UML#
@@ -1737,10 +1648,11 @@
  
  
  
-  6.12.  pcap
+6.12.  pcap
+-----------
  
    The pcap transport is attached to a UML ethernet device on the command
-  line or with uml_mconsole with the following syntax:
+  line or with uml_mconsole with the following syntax::
  
  
  
@@ -1762,7 +1674,7 @@
    expression optimizer is used.
  
  
-  Example:
+  Example::
  
  
  
@@ -1777,7 +1689,8 @@
  
  
  
-  6.13.  Setting up the host yourself
+6.13.  Setting up the host yourself
+-----------------------------------
  
    If you don't specify an address for the host side of the ethertap or
    slip device, UML won't do any setup on the host.  So this is what is
@@ -1785,19 +1698,15 @@
    192.168.0.251 and a UML-side IP of 192.168.0.250 - adjust to suit your
    own network):
  
-  o  The device needs to be configured with its IP address.  Tap devices
+  -  The device needs to be configured with its IP address.  Tap devices
       are also configured with an mtu of 1484.  Slip devices are
       configured with a point-to-point address pointing at the UML ip
-     address.
+     address::
  
  
         host#  ifconfig tap0 arp mtu 1484 192.168.0.251 up
  
  
-
-
-
-
         host#
         ifconfig sl0 192.168.0.251 pointopoint 192.168.0.250 up
  
@@ -1805,7 +1714,7 @@
  
  
  
-  o  If a tap device is being set up, a route is set to the UML IP.
+  -  If a tap device is being set up, a route is set to the UML IP::
  
  
         UML# route add -host 192.168.0.250 gw 192.168.0.251
@@ -1814,8 +1723,8 @@
  
  
  
-  o  To allow other hosts on your network to see the virtual machine,
-     proxy arp is set up for it.
+  -  To allow other hosts on your network to see the virtual machine,
+     proxy arp is set up for it::
  
  
         host#  arp -Ds 192.168.0.250 eth0 pub
@@ -1824,7 +1733,7 @@
  
  
  
-  o  Finally, the host is set up to route packets.
+  -  Finally, the host is set up to route packets::
  
  
         host#  echo 1 > /proc/sys/net/ipv4/ip_forward
@@ -1838,12 +1747,14 @@
  
  
  
-  7.  Sharing Filesystems between Virtual Machines
+7.  Sharing Filesystems between Virtual Machines
+================================================
  
  
  
  
-  7.1.  A warning
+7.1.  A warning
+---------------
  
    Don't attempt to share filesystems simply by booting two UMLs from the
    same file.  That's the same thing as booting two physical machines
@@ -1851,7 +1762,8 @@
  
  
  
-  7.2.  Using layered block devices
+7.2.  Using layered block devices
+---------------------------------
  
    The way to share a filesystem between two virtual machines is to use
    the copy-on-write (COW) layering capability of the ubd block driver.
@@ -1872,7 +1784,7 @@
  
  
    To add a copy-on-write layer to an existing block device file, simply
-  add the name of the COW file to the appropriate ubd switch:
+  add the name of the COW file to the appropriate ubd switch::
  
  
          ubd0=root_fs_cow,root_fs_debian_22
@@ -1883,7 +1795,7 @@
    where 'root_fs_cow' is the private COW file and 'root_fs_debian_22' is
    the existing shared filesystem.  The COW file need not exist.  If it
    doesn't, the driver will create and initialize it.  Once the COW file
-  has been initialized, it can be used on its own on the command line:
+  has been initialized, it can be used on its own on the command line::
  
  
          ubd0=root_fs_cow
@@ -1896,14 +1808,16 @@
  
  
  
-  7.3.  Note!
+7.3.  Note!
+-----------
  
    When checking the size of the COW file in order to see the gobs of
    space that you're saving, make sure you use 'ls -ls' to see the actual
    disk consumption rather than the length of the file.  The COW file is
    sparse, so the length will be very different from the disk usage.
    Here is a 'ls -l' of a COW file and backing file from one boot and
-  shutdown:
+  shutdown::
+
         host% ls -l cow.debian debian2.2
         -rw-r--r--    1 jdike    jdike    492504064 Aug  6 21:16 cow.debian
         -rwxrw-rw-    1 jdike    jdike    537919488 Aug  6 20:42 debian2.2
@@ -1911,7 +1825,7 @@
  
  
  
-  Doesn't look like much saved space, does it?  Well, here's 'ls -ls':
+  Doesn't look like much saved space, does it?  Well, here's 'ls -ls'::
  
  
         host% ls -ls cow.debian debian2.2
@@ -1926,7 +1840,8 @@
  
  
  
-  7.4.  Another warning
+7.4.  Another warning
+---------------------
  
    Once a filesystem is being used as a readonly backing file for a COW
    file, do not boot directly from it or modify it in any way.  Doing so
@@ -1952,7 +1867,8 @@
  
  
  
-  7.5.  uml_moo : Merging a COW file with its backing file
+7.5.  uml_moo : Merging a COW file with its backing file
+--------------------------------------------------------
  
    Depending on how you use UML and COW devices, it may be advisable to
    merge the changes in the COW file into the backing file every once in
@@ -1961,7 +1877,7 @@
  
  
  
-  The utility that does this is uml_moo.  Its usage is
+  The utility that does this is uml_moo.  Its usage is::
  
  
         host% uml_moo COW file new backing file
@@ -1991,8 +1907,8 @@
  
    uml_moo is installed with the UML deb and RPM.  If you didn't install
    UML from one of those packages, you can also get it from the UML
-  utilities <http://user-mode-linux.sourceforge.net/
-  utilities>  tar file in tools/moo.
+  utilities http://user-mode-linux.sourceforge.net/utilities tar file
+  in tools/moo.
  
  
  
@@ -2001,7 +1917,8 @@
  
  
  
-  8.  Creating filesystems
+8.  Creating filesystems
+========================
  
  
    You may want to create and mount new UML filesystems, either because
@@ -2015,13 +1932,14 @@
    should be easy to translate to the filesystem of your choice.
  
  
-  8.1.  Create the filesystem file
+8.1.  Create the filesystem file
+--------------------------------
  
    dd is your friend.  All you need to do is tell dd to create an empty
    file of the appropriate size.  I usually make it sparse to save time
    and to avoid allocating disk space until it's actually used.  For
    example, the following command will create a sparse 100 meg file full
-  of zeroes.
+  of zeroes::
  
  
         host%
@@ -2034,9 +1952,9 @@
  
    8.2.  Assign the file to a UML device
  
-  Add an argument like the following to the UML command line:
+  Add an argument like the following to the UML command line::
  
-  ubd4=new_filesystem
+       ubd4=new_filesystem
  
  
  
@@ -2053,7 +1971,7 @@
    etc), then get them into UML by way of the net or hostfs.
  
  
-  Make the new filesystem on the device assigned to the new file:
+  Make the new filesystem on the device assigned to the new file::
  
  
         host#  mkreiserfs /dev/ubd/4
@@ -2077,7 +1995,7 @@
  
  
  
-  Now, mount it:
+  Now, mount it::
  
  
         UML#
@@ -2096,7 +2014,8 @@
  
  
  
-  9.  Host file access
+9.  Host file access
+====================
  
  
    If you want to access files on the host machine from inside UML, you
@@ -2112,10 +2031,11 @@
    files contained in it just as you would on the host.
  
  
-  9.1.  Using hostfs
+9.1.  Using hostfs
+------------------
  
    To begin with, make sure that hostfs is available inside the virtual
-  machine with
+  machine with::
  
  
         UML# cat /proc/filesystems
@@ -2127,7 +2047,7 @@
    module and available inside the virtual machine, and insmod it.
  
  
-  Now all you need to do is run mount:
+  Now all you need to do is run mount::
  
  
         UML# mount none /mnt/host -t hostfs
@@ -2139,7 +2059,7 @@
  
  
    If you don't want to mount the host root directory, then you can
-  specify a subdirectory to mount with the -o switch to mount:
+  specify a subdirectory to mount with the -o switch to mount::
  
  
         UML# mount none /mnt/home -t hostfs -o /home
@@ -2151,13 +2071,14 @@
  
  
  
-  9.2.  hostfs as the root filesystem
+9.2.  hostfs as the root filesystem
+-----------------------------------
  
    It's possible to boot from a directory hierarchy on the host using
    hostfs rather than using the standard filesystem in a file.
  
    To start, you need that hierarchy.  The easiest way is to loop mount
-  an existing root_fs file:
+  an existing root_fs file::
  
  
         host#  mount root_fs uml_root_dir -o loop
@@ -2166,15 +2087,15 @@
  
  
    You need to change the filesystem type of / in etc/fstab to be
-  'hostfs', so that line looks like this:
+  'hostfs', so that line looks like this::
  
-  /dev/ubd/0       /        hostfs      defaults          1   1
+    /dev/ubd/0       /        hostfs      defaults          1   1
  
  
  
  
    Then you need to chown to yourself all the files in that directory
-  that are owned by root.  This worked for me:
+  that are owned by root.  This worked for me::
  
  
         host#  find . -uid 0 -exec chown jdike {} \;
@@ -2183,7 +2104,7 @@
  
  
    Next, make sure that your UML kernel has hostfs compiled in, not as a
-  module.  Then run UML with the boot device pointing at that directory:
+  module.  Then run UML with the boot device pointing at that directory::
  
  
          ubd0=/path/to/uml/root/directory
@@ -2194,41 +2115,35 @@
    UML should then boot as it does normally.
  
  
-  9.3.  Building hostfs
+9.3.  Building hostfs
+---------------------
  
    If you need to build hostfs because it's not in your kernel, you have
    two choices:
  
  
  
-  o  Compiling hostfs into the kernel:
+  -  Compiling hostfs into the kernel:
  
  
       Reconfigure the kernel and set the 'Host filesystem' option under
  
  
-  o  Compiling hostfs as a module:
+  -  Compiling hostfs as a module:
  
  
       Reconfigure the kernel and set the 'Host filesystem' option under
       be in arch/um/fs/hostfs/hostfs.o.  Install that in
-     /lib/modules/`uname -r`/fs in the virtual machine, boot it up, and
+     ``/lib/modules/$(uname -r)/fs`` in the virtual machine, boot it up, and::
  
  
         UML# insmod hostfs
  
  
+.. _The_Management_Console:
  
-
-
-
-
-
-
-
-
-
-  10.  The Management Console
+10.  The Management Console
+===========================
  
  
  
@@ -2240,15 +2155,15 @@
  
    There are a number of things you can do with the mconsole interface:
  
-  o  get the kernel version
+  -  get the kernel version
  
-  o  add and remove devices
+  -  add and remove devices
  
-  o  halt or reboot the machine
+  -  halt or reboot the machine
  
-  o  Send SysRq commands
+  -  Send SysRq commands
  
-  o  Pause and resume the UML
+  -  Pause and resume the UML
  
  
    You need the mconsole client (uml_mconsole) which is present in CVS
@@ -2257,7 +2172,7 @@
  
  
    You also need CONFIG_MCONSOLE (under 'General Setup') enabled in UML.
-  When you boot UML, you'll see a line like:
+  When you boot UML, you'll see a line like::
  
  
         mconsole initialized on /home/jdike/.uml/umlNJ32yL/mconsole
@@ -2265,7 +2180,7 @@
  
  
  
-  If you specify a unique machine id one the UML command line, i.e.
+  If you specify a unique machine id on the UML command line, i.e.::
  
  
          umid=debian
@@ -2273,7 +2188,7 @@
  
  
  
-  you'll see this
+  you'll see this::
  
  
         mconsole initialized on /home/jdike/.uml/debian/mconsole
@@ -2282,7 +2197,7 @@
  
  
    That file is the socket that uml_mconsole will use to communicate with
-  UML.  Run it with either the umid or the full path as its argument:
+  UML.  Run it with either the umid or the full path as its argument::
  
  
         host% uml_mconsole debian
@@ -2290,7 +2205,7 @@
  
  
  
-  or
+  or::
  
  
         host% uml_mconsole /home/jdike/.uml/debian/mconsole
@@ -2300,30 +2215,31 @@
  
    You'll get a prompt, at which you can run one of these commands:
  
-  o  version
+  -  version
  
-  o  halt
+  -  halt
  
-  o  reboot
+  -  reboot
  
-  o  config
+  -  config
  
-  o  remove
+  -  remove
  
-  o  sysrq
+  -  sysrq
  
-  o  help
+  -  help
  
-  o  cad
+  -  cad
  
-  o  stop
+  -  stop
  
-  o  go
+  -  go
  
  
-  10.1.  version
+10.1.  version
+--------------
  
-  This takes no arguments.  It prints the UML version.
+  This takes no arguments.  It prints the UML version::
  
  
         (mconsole)  version
@@ -2342,11 +2258,12 @@
  
  
  
-  10.2.  halt and reboot
+10.2.  halt and reboot
+----------------------
  
    These take no arguments.  They shut the machine down immediately, with
    no syncing of disks and no clean shutdown of userspace.  So, they are
-  pretty close to crashing the machine.
+  pretty close to crashing the machine::
  
  
         (mconsole)  halt
@@ -2357,34 +2274,36 @@
  
  
  
-  10.3.  config
+10.3.  config
+-------------
  
    "config" adds a new device to the virtual machine.  Currently the ubd
    and network drivers support this.  It takes one argument, which is the
-  device to add, with the same syntax as the kernel command line.
+  device to add, with the same syntax as the kernel command line::
  
  
  
  
-  (mconsole)
-  config ubd3=/home/jdike/incoming/roots/root_fs_debian22
+       (mconsole)
+       config ubd3=/home/jdike/incoming/roots/root_fs_debian22
  
-  OK
-  (mconsole)  config eth1=mcast
-  OK
+       OK
+       (mconsole)  config eth1=mcast
+       OK
  
  
  
  
  
  
-  10.4.  remove
+10.4.  remove
+-------------
  
    "remove" deletes a device from the system.  Its argument is just the
    name of the device to be removed. The device must be idle in whatever
    sense the driver considers necessary.  In the case of the ubd driver,
    the removed block device must not be mounted, swapped on, or otherwise
-  open, and in the case of the network driver, the device must be down.
+  open, and in the case of the network driver, the device must be down::
  
  
         (mconsole)  remove ubd3
@@ -2397,7 +2316,8 @@
  
  
  
-  10.5.  sysrq
+10.5.  sysrq
+------------
  
    This takes one argument, which is a single letter.  It calls the
    generic kernel's SysRq driver, which does whatever is called for by
@@ -2407,19 +2327,21 @@
  
  
  
-  10.6.  help
+10.6.  help
+-----------
  
    "help" returns a string listing the valid commands and what each one
    does.
  
  
  
-  10.7.  cad
+10.7.  cad
+----------
  
    This invokes the Ctl-Alt-Del action on init.  What exactly this ends
    up doing is up to /etc/inittab.  Normally, it reboots the machine.
    With UML, this is usually not desired, so if a halt would be better,
-  then find the section of inittab that looks like this
+  then find the section of inittab that looks like this::
  
  
         # What to do when CTRL-ALT-DEL is pressed.
@@ -2432,7 +2354,8 @@
  
  
  
-  10.8.  stop
+10.8.  stop
+-----------
  
    This puts the UML in a loop reading mconsole requests until a 'go'
    mconsole command is received. This is very useful for making backups
@@ -2448,7 +2371,8 @@
  
  
  
-  10.9.  go
+10.9.  go
+---------
  
    This resumes a UML after being paused by a 'stop' command. Note that
    when the UML has resumed, TCP connections may have timed out and if
@@ -2460,9 +2384,10 @@
  
  
  
+.. _Kernel_debugging:
  
-
-  11.  Kernel debugging
+11.  Kernel debugging
+=====================
  
  
    Note: The interface that makes debugging, as described here, possible
@@ -2477,15 +2402,16 @@
  
  
    In order to debug the kernel, you need build it from source.  See
-  ``Compiling the kernel and modules''  for information on doing that.
+  :ref:`Compiling_the_kernel_and_modules`  for information on doing that.
    Make sure that you enable CONFIG_DEBUGSYM and CONFIG_PT_PROXY during
-  the config.  These will compile the kernel with -g, and enable the
+  the config.  These will compile the kernel with ``-g``, and enable the
    ptrace proxy so that gdb works with UML, respectively.
  
  
  
  
-  11.1.  Starting the kernel under gdb
+11.1.  Starting the kernel under gdb
+------------------------------------
  
    You can have the kernel running under the control of gdb from the
    beginning by putting 'debug' on the command line.  You will get an
@@ -2498,7 +2424,11 @@
    There is a transcript of a debugging session  here <debug-
    session.html> , with breakpoints being set in the scheduler and in an
    interrupt handler.
-  11.2.  Examining sleeping processes
+
+
+11.2.  Examining sleeping processes
+-----------------------------------
+
  
    Not every bug is evident in the currently running process.  Sometimes,
    processes hang in the kernel when they shouldn't because they've
@@ -2516,7 +2446,7 @@
  
    Now what you do is this:
  
-  o  detach from the current thread
+  -  detach from the current thread::
  
  
         (UML gdb)  det
@@ -2525,7 +2455,7 @@
  
  
  
-  o  attach to the thread you are interested in
+  -  attach to the thread you are interested in::
  
  
         (UML gdb)  att <host pid>
@@ -2534,7 +2464,7 @@
  
  
  
-  o  look at its stack and anything else of interest
+  -  look at its stack and anything else of interest::
  
  
         (UML gdb)  bt
@@ -2545,18 +2475,14 @@
    Note that you can't do anything at this point that requires that a
    process execute, e.g. calling a function
  
-  o  when you're done looking at that process, reattach to the current
-     thread and continue it
+  -  when you're done looking at that process, reattach to the current
+     thread and continue it::
  
  
         (UML gdb)
         att 1
  
  
-
-
-
-
         (UML gdb)
         c
  
@@ -2569,12 +2495,13 @@
  
  
  
-  11.3.  Running ddd on UML
+11.3.  Running ddd on UML
+-------------------------
  
    ddd works on UML, but requires a special kludge.  The process goes
    like this:
  
-  o  Start ddd
+  -  Start ddd::
  
  
         host% ddd linux
@@ -2583,14 +2510,14 @@
  
  
  
-  o  With ps, get the pid of the gdb that ddd started.  You can ask the
+  -  With ps, get the pid of the gdb that ddd started.  You can ask the
       gdb to tell you, but for some reason that confuses things and
       causes a hang.
  
-  o  run UML with 'debug=parent gdb-pid=<pid>' added to the command line
+  -  run UML with 'debug=parent gdb-pid=<pid>' added to the command line
       - it will just sit there after you hit return
  
-  o  type 'att 1' to the ddd gdb and you will see something like
+  -  type 'att 1' to the ddd gdb and you will see something like::
  
  
         0xa013dc51 in __kill ()
@@ -2602,12 +2529,14 @@
  
  
  
-  o  At this point, type 'c', UML will boot up, and you can use ddd just
+  -  At this point, type 'c', UML will boot up, and you can use ddd just
       as you do on any other process.
  
  
  
-  11.4.  Debugging modules
+11.4.  Debugging modules
+------------------------
+
  
    gdb has support for debugging code which is dynamically loaded into
    the process.  This support is what is needed to debug kernel modules
@@ -2629,7 +2558,8 @@
  
  
    First, you must tell it where your modules are.  There is a list in
-  the script that looks like this:
+  the script that looks like this::
+
         set MODULE_PATHS {
         "fat" "/usr/src/uml/linux-2.4.18/fs/fat/fat.o"
         "isofs" "/usr/src/uml/linux-2.4.18/fs/isofs/isofs.o"
@@ -2641,9 +2571,7 @@
  
    You change that to list the names and paths of the modules that you
    are going to debug.  Then you run it from the toplevel directory of
-  your UML pool and it basically tells you what to do:
-
-
+  your UML pool and it basically tells you what to do::
  
  
                     ******** GDB pid is 21903 ********
@@ -2666,7 +2594,7 @@
  
  
    After you run UML and it sits there doing nothing, you hit return at
-  the 'att 1' and continue it:
+  the 'att 1' and continue it::
  
  
         Attaching to program: /home/jdike/linux/2.4/um/./linux, process 1
@@ -2678,63 +2606,48 @@
  
  
    At this point, you debug normally.  When you insmod something, the
-  expect magic will kick in and you'll see something like:
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-   *** Module hostfs loaded ***
-  Breakpoint 1, sys_init_module (name_user=0x805abb0 "hostfs",
-      mod_user=0x8070e00) at module.c:349
-  349             char *name, *n_name, *name_tmp = NULL;
-  (UML gdb)  finish
-  Run till exit from #0  sys_init_module (name_user=0x805abb0 "hostfs",
-      mod_user=0x8070e00) at module.c:349
-  0xa00e2e23 in execute_syscall (r=0xa8140284) at syscall_kern.c:411
-  411             else res = EXECUTE_SYSCALL(syscall, regs);
-  Value returned is $1 = 0
-  (UML gdb)
-  p/x (int)module_list + module_list->size_of_struct
-
-  $2 = 0xa9021054
-  (UML gdb)  symbol-file ./linux
-  Load new symbol table from "./linux"? (y or n) y
-  Reading symbols from ./linux...
-  done.
-  (UML gdb)
-  add-symbol-file /home/jdike/linux/2.4/um/arch/um/fs/hostfs/hostfs.o 0xa9021054
-
-  add symbol table from file "/home/jdike/linux/2.4/um/arch/um/fs/hostfs/hostfs.o" at
-          .text_addr = 0xa9021054
-   (y or n) y
-
-  Reading symbols from /home/jdike/linux/2.4/um/arch/um/fs/hostfs/hostfs.o...
-  done.
-  (UML gdb)  p *module_list
-  $1 = {size_of_struct = 84, next = 0xa0178720, name = 0xa9022de0 "hostfs",
-    size = 9016, uc = {usecount = {counter = 0}, pad = 0}, flags = 1,
-    nsyms = 57, ndeps = 0, syms = 0xa9023170, deps = 0x0, refs = 0x0,
-    init = 0xa90221f0 <init_hostfs>, cleanup = 0xa902222c <exit_hostfs>,
-    ex_table_start = 0x0, ex_table_end = 0x0, persist_start = 0x0,
-    persist_end = 0x0, can_unload = 0, runsize = 0, kallsyms_start = 0x0,
-    kallsyms_end = 0x0,
-    archdata_start = 0x1b855 <Address 0x1b855 out of bounds>,
-    archdata_end = 0xe5890000 <Address 0xe5890000 out of bounds>,
-    kernel_data = 0xf689c35d <Address 0xf689c35d out of bounds>}
-  >> Finished loading symbols for hostfs ...
+  expect magic will kick in and you'll see something like::
+
+
+     *** Module hostfs loaded ***
+    Breakpoint 1, sys_init_module (name_user=0x805abb0 "hostfs",
+        mod_user=0x8070e00) at module.c:349
+    349             char *name, *n_name, *name_tmp = NULL;
+    (UML gdb)  finish
+    Run till exit from #0  sys_init_module (name_user=0x805abb0 "hostfs",
+        mod_user=0x8070e00) at module.c:349
+    0xa00e2e23 in execute_syscall (r=0xa8140284) at syscall_kern.c:411
+    411             else res = EXECUTE_SYSCALL(syscall, regs);
+    Value returned is $1 = 0
+    (UML gdb)
+    p/x (int)module_list + module_list->size_of_struct
+
+    $2 = 0xa9021054
+    (UML gdb)  symbol-file ./linux
+    Load new symbol table from "./linux"? (y or n) y
+    Reading symbols from ./linux...
+    done.
+    (UML gdb)
+    add-symbol-file /home/jdike/linux/2.4/um/arch/um/fs/hostfs/hostfs.o 0xa9021054
+
+    add symbol table from file "/home/jdike/linux/2.4/um/arch/um/fs/hostfs/hostfs.o" at
+            .text_addr = 0xa9021054
+     (y or n) y
+
+    Reading symbols from /home/jdike/linux/2.4/um/arch/um/fs/hostfs/hostfs.o...
+    done.
+    (UML gdb)  p *module_list
+    $1 = {size_of_struct = 84, next = 0xa0178720, name = 0xa9022de0 "hostfs",
+      size = 9016, uc = {usecount = {counter = 0}, pad = 0}, flags = 1,
+      nsyms = 57, ndeps = 0, syms = 0xa9023170, deps = 0x0, refs = 0x0,
+      init = 0xa90221f0 <init_hostfs>, cleanup = 0xa902222c <exit_hostfs>,
+      ex_table_start = 0x0, ex_table_end = 0x0, persist_start = 0x0,
+      persist_end = 0x0, can_unload = 0, runsize = 0, kallsyms_start = 0x0,
+      kallsyms_end = 0x0,
+      archdata_start = 0x1b855 <Address 0x1b855 out of bounds>,
+      archdata_end = 0xe5890000 <Address 0xe5890000 out of bounds>,
+      kernel_data = 0xf689c35d <Address 0xf689c35d out of bounds>}
+    >> Finished loading symbols for hostfs ...
  
  
  
@@ -2744,7 +2657,7 @@
  
  
    Boot the kernel under the debugger and load the module with insmod or
-  modprobe.  With gdb, do:
+  modprobe.  With gdb, do::
  
  
         (UML gdb)  p module_list
@@ -2758,12 +2671,12 @@
    the name fields until find the module you want to debug.  Take the
    address of that structure, and add module.size_of_struct (which in
    2.4.10 kernels is 96 (0x60)) to it.  Gdb can make this hard addition
-  for you :-):
+  for you :-)::
  
  
  
-  (UML gdb)
-  printf "%#x\n", (int)module_list module_list->size_of_struct
+       (UML gdb)
+       printf "%#x\n", (int)module_list + module_list->size_of_struct
  
  
  
@@ -2771,7 +2684,7 @@
    The offset from the module start occasionally changes (before 2.4.0,
    it was module.size_of_struct + 4), so it's a good idea to check the
    init and cleanup addresses once in a while, as describe below.  Now
-  do:
+  do::
  
  
         (UML gdb)
@@ -2786,7 +2699,7 @@
    If there's any doubt that you got the offset right, like breakpoints
    appear not to work, or they're appearing in the wrong place, you can
    check it by looking at the module structure.  The init and cleanup
-  fields should look like:
+  fields should look like::
  
  
         init = 0x588066b0 <init_hostfs>, cleanup = 0x588066c0 <exit_hostfs>
@@ -2801,7 +2714,7 @@
  
    When you want to load in a new version of the module, you need to get
    gdb to forget about the old one.  The only way I've found to do that
-  is to tell gdb to forget about all symbols that it knows about:
+  is to tell gdb to forget about all symbols that it knows about::
  
  
         (UML gdb)  symbol-file
@@ -2809,7 +2722,7 @@
  
  
  
-  Then reload the symbols from the kernel binary:
+  Then reload the symbols from the kernel binary::
  
  
         (UML gdb)  symbol-file /path/to/kernel
@@ -2823,17 +2736,19 @@
  
  
  
-  11.5.  Attaching gdb to the kernel
+11.5.  Attaching gdb to the kernel
+----------------------------------
  
    If you don't have the kernel running under gdb, you can attach gdb to
    it later by sending the tracing thread a SIGUSR1.  The first line of
-  the console output identifies its pid:
+  the console output identifies its pid::
+
         tracing thread pid = 20093
  
  
  
  
-  When you send it the signal:
+  When you send it the signal::
  
  
         host% kill -USR1 20093
@@ -2845,7 +2760,7 @@
  
  
    If you have the mconsole compiled into UML, then the mconsole client
-  can be used to start gdb:
+  can be used to start gdb::
  
  
         (mconsole)  (mconsole) config gdb=xterm
@@ -2857,7 +2772,8 @@
  
  
  
-  11.6.  Using alternate debuggers
+11.6.  Using alternate debuggers
+--------------------------------
  
    UML has support for attaching to an already running debugger rather
    than starting gdb itself.  This is present in CVS as of 17 Apr 2001.
@@ -2886,7 +2802,7 @@
    An example of an alternate debugger is strace.  You can strace the
    actual kernel as follows:
  
-  o  Run the following in a shell
+  -  Run the following in a shell::
  
  
         host%
@@ -2894,13 +2810,13 @@
  
  
  
-  o  Run UML with 'debug' and 'gdb-pid=<pid>' with the pid printed out
+  -  Run UML with 'debug' and 'gdb-pid=<pid>' with the pid printed out
       by the previous command
  
-  o  Hit return in the shell, and UML will start running, and strace
+  -  Hit return in the shell, and UML will start running, and strace
       output will start accumulating in the output file.
  
-     Note that this is different from running
+     Note that this is different from running::
  
  
         host% strace ./linux
@@ -2917,95 +2833,57 @@
  
  
  
-  12.  Kernel debugging examples
+12.  Kernel debugging examples
+==============================
  
-  12.1.  The case of the hung fsck
+12.1.  The case of the hung fsck
+--------------------------------
  
    When booting up the kernel, fsck failed, and dropped me into a shell
-  to fix things up.  I ran fsck -y, which hung:
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+  to fix things up.  I ran fsck -y, which hung::
  
  
+    Setting hostname uml                    [ OK ]
+    Checking root filesystem
+    /dev/fhd0 was not cleanly unmounted, check forced.
+    Error reading block 86894 (Attempt to read block from filesystem resulted in short read) while reading indirect blocks of inode 19780.
  
+    /dev/fhd0: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
+           (i.e., without -a or -p options)
+    [ FAILED ]
  
+    *** An error occurred during the file system check.
+    *** Dropping you to a shell; the system will reboot
+    *** when you leave the shell.
+    Give root password for maintenance
+    (or type Control-D for normal startup):
  
+    [root@uml /root]# fsck -y /dev/fhd0
+    fsck -y /dev/fhd0
+    Parallelizing fsck version 1.14 (9-Jan-1999)
+    e2fsck 1.14, 9-Jan-1999 for EXT2 FS 0.5b, 95/08/09
+    /dev/fhd0 contains a file system with errors, check forced.
+    Pass 1: Checking inodes, blocks, and sizes
+    Error reading block 86894 (Attempt to read block from filesystem resulted in short read) while reading indirect blocks of inode 19780.  Ignore error? yes
  
+    Inode 19780, i_blocks is 1548, should be 540.  Fix? yes
  
+    Pass 2: Checking directory structure
+    Error reading block 49405 (Attempt to read block from filesystem resulted in short read).  Ignore error? yes
  
+    Directory inode 11858, block 0, offset 0: directory corrupted
+    Salvage? yes
  
+    Missing '.' in directory inode 11858.
+    Fix? yes
  
-
-
-  Setting hostname uml                    [ OK ]
-  Checking root filesystem
-  /dev/fhd0 was not cleanly unmounted, check forced.
-  Error reading block 86894 (Attempt to read block from filesystem resulted in short read) while reading indirect blocks of inode 19780.
-
-  /dev/fhd0: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
-          (i.e., without -a or -p options)
-  [ FAILED ]
-
-  *** An error occurred during the file system check.
-  *** Dropping you to a shell; the system will reboot
-  *** when you leave the shell.
-  Give root password for maintenance
-  (or type Control-D for normal startup):
-
-  [root@uml /root]# fsck -y /dev/fhd0
-  fsck -y /dev/fhd0
-  Parallelizing fsck version 1.14 (9-Jan-1999)
-  e2fsck 1.14, 9-Jan-1999 for EXT2 FS 0.5b, 95/08/09
-  /dev/fhd0 contains a file system with errors, check forced.
-  Pass 1: Checking inodes, blocks, and sizes
-  Error reading block 86894 (Attempt to read block from filesystem resulted in short read) while reading indirect blocks of inode 19780.  Ignore error? yes
-
-  Inode 19780, i_blocks is 1548, should be 540.  Fix? yes
-
-  Pass 2: Checking directory structure
-  Error reading block 49405 (Attempt to read block from filesystem resulted in short read).  Ignore error? yes
-
-  Directory inode 11858, block 0, offset 0: directory corrupted
-  Salvage? yes
-
-  Missing '.' in directory inode 11858.
-  Fix? yes
-
-  Missing '..' in directory inode 11858.
-  Fix? yes
-
-
-
+    Missing '..' in directory inode 11858.
+    Fix? yes
  
  
    The standard drill in this sort of situation is to fire up gdb on the
    signal thread, which, in this case, was pid 1935.  In another window,
-  I run gdb and attach pid 1935.
-
-
+  I run gdb and attach pid 1935::
  
  
         ~/linux/2.3.26/um 1016: gdb linux
@@ -3022,11 +2900,7 @@
         0x100756d9 in __wait4 ()
  
  
-
-
-
-
-  Let's see what's currently running:
+  Let's see what's currently running::
  
  
  
@@ -3041,7 +2915,7 @@
    reason and never woke up.
  
  
-  Let's guess that the last process in the process list is fsck:
+  Let's guess that the last process in the process list is fsck::
  
  
  
@@ -3052,7 +2926,7 @@
  
  
  
-  It is, so let's see what it thinks it's up to:
+  It is, so let's see what it thinks it's up to::
  
  
  
@@ -3068,8 +2942,6 @@
  
  
  
-
-
    The interesting things here are the fact that its .thread.syscall.id
    is __NR_write (see the big switch in arch/um/kernel/syscall_kern.c or
    the defines in include/asm-um/arch/unistd.h), and that it never
@@ -3081,30 +2953,20 @@
    The fact that it never returned from write means that its stack should
    be fairly interesting.  Its pid is 1980 (.thread.extern_pid).  That
    process is being ptraced by the signal thread, so it must be detached
-  before gdb can attach it:
-
-
-
-
-
-
+  before gdb can attach it::
  
  
  
+    (gdb) call detach(1980)
  
-  (gdb) call detach(1980)
-
-  Program received signal SIGSEGV, Segmentation fault.
-  <function called from gdb>
-  The program being debugged stopped while in a function called from GDB.
-  When the function (detach) is done executing, GDB will silently
-  stop (instead of continuing to evaluate the expression containing
-  the function call).
-  (gdb) call detach(1980)
-  $15 = 0
-
-
-
+    Program received signal SIGSEGV, Segmentation fault.
+    <function called from gdb>
+    The program being debugged stopped while in a function called from GDB.
+    When the function (detach) is done executing, GDB will silently
+    stop (instead of continuing to evaluate the expression containing
+    the function call).
+    (gdb) call detach(1980)
+    $15 = 0
  
  
    The first detach segfaults for some reason, and the second one
@@ -3112,7 +2974,7 @@
  
  
    Now I detach from the signal thread, attach to the fsck thread, and
-  look at its stack:
+  look at its stack::
  
  
         (gdb) det
@@ -3152,14 +3014,14 @@
  
  
  
-  The interesting things here are :
+  The interesting things here are:
  
-  o  There are two segfaults on this stack (frames 9 and 14)
+  -  There are two segfaults on this stack (frames 9 and 14)
  
-  o  The first faulting address (frame 11) is 0x50000800
+  -  The first faulting address (frame 11) is 0x50000800::
  
-  (gdb) p (void *)1342179328
-  $16 = (void *) 0x50000800
+       (gdb) p (void *)1342179328
+       $16 = (void *) 0x50000800
  
  
  
@@ -3175,7 +3037,7 @@
  
    However, the more immediate problem is that second segfault and I'm
    going to concentrate on that.  First, I want to see where the fault
-  happened, so I have to go look at the sigcontent struct in frame 8:
+  happened, so I have to go look at the sigcontext struct in frame 8::
  
  
  
@@ -3211,7 +3073,7 @@
  
  
  
-  That's not very useful, so I'll try a more manual method:
+  That's not very useful, so I'll try a more manual method::
  
  
         (gdb) p *((struct sigcontext *) (&sig + 1))
@@ -3224,7 +3086,7 @@
  
  
  
-  The ip is in handle_mm_fault:
+  The ip is in handle_mm_fault::
  
  
         (gdb) p (void *)268480945
@@ -3236,7 +3098,7 @@
  
  
  
-  Specifically, it's in pte_alloc:
+  Specifically, it's in pte_alloc::
  
  
         (gdb) i line *$20
@@ -3249,7 +3111,7 @@
  
  
    To find where in handle_mm_fault this is, I'll jump forward in the
-  code until I see an address in that procedure:
+  code until I see an address in that procedure::
  
  
  
@@ -3286,21 +3148,21 @@
  
  
    Something is apparently wrong with the page tables or vma_structs, so
-  lets go back to frame 11 and have a look at them:
+  let's go back to frame 11 and have a look at them::
  
  
  
-  #11 0x1006c0aa in segv (address=1342179328, is_write=2) at trap_kern.c:50
-  50        handle_mm_fault(current, vma, address, is_write);
-  (gdb) call pgd_offset_proc(vma->vm_mm, address)
-  $22 = (pgd_t *) 0x80a548c
+    #11 0x1006c0aa in segv (address=1342179328, is_write=2) at trap_kern.c:50
+    50        handle_mm_fault(current, vma, address, is_write);
+    (gdb) call pgd_offset_proc(vma->vm_mm, address)
+    $22 = (pgd_t *) 0x80a548c
  
  
  
  
  
    That's pretty bogus.  Page tables aren't supposed to be in process
-  text or data areas.  Let's see what's in the vma:
+  text or data areas.  Let's see what's in the vma::
  
  
         (gdb) p *vma
@@ -3325,12 +3187,9 @@
  
  
  
-
-
    This also pretty bogus.  With all of the 0x80xxxxx and 0xaffffxxx
    addresses, this is looking like a stack was plonked down on top of
-  these structures.  Maybe it's a stack overflow from the next page:
-
+  these structures.  Maybe it's a stack overflow from the next page::
  
  
         (gdb) p vma
@@ -3338,52 +3197,36 @@
  
  
  
-
-
    That's towards the lower quarter of the page, so that would have to
-  have been pretty heavy stack overflow:
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-  (gdb) x/100x $25
-  0x507d2434:     0x507d2434      0x00000000      0x08048000      0x080a4f8c
-  0x507d2444:     0x00000000      0x080a79e0      0x080a8c94      0x080d1000
-  0x507d2454:     0xaffffdb0      0xaffffe63      0xaffffe7a      0xaffffe7a
-  0x507d2464:     0xafffffec      0x00000062      0x0000008a      0x00000000
-  0x507d2474:     0x00000000      0x00000000      0x00000000      0x00000000
-  0x507d2484:     0x00000000      0x00000000      0x00000000      0x00000000
-  0x507d2494:     0x00000000      0x00000000      0x507d2fe0      0x00000000
-  0x507d24a4:     0x00000000      0x00000000      0x00000000      0x00000000
-  0x507d24b4:     0x00000000      0x00000000      0x00000000      0x00000000
-  0x507d24c4:     0x00000000      0x00000000      0x00000000      0x00000000
-  0x507d24d4:     0x00000000      0x00000000      0x00000000      0x00000000
-  0x507d24e4:     0x00000000      0x00000000      0x00000000      0x00000000
-  0x507d24f4:     0x00000000      0x00000000      0x00000000      0x00000000
-  0x507d2504:     0x00000000      0x00000000      0x00000000      0x00000000
-  0x507d2514:     0x00000000      0x00000000      0x00000000      0x00000000
-  0x507d2524:     0x00000000      0x00000000      0x00000000      0x00000000
-  0x507d2534:     0x00000000      0x00000000      0x507d25dc      0x00000000
-  0x507d2544:     0x00000000      0x00000000      0x00000000      0x00000000
-  0x507d2554:     0x00000000      0x00000000      0x00000000      0x00000000
-  0x507d2564:     0x00000000      0x00000000      0x00000000      0x00000000
-  0x507d2574:     0x00000000      0x00000000      0x00000000      0x00000000
-  0x507d2584:     0x00000000      0x00000000      0x00000000      0x00000000
-  0x507d2594:     0x00000000      0x00000000      0x00000000      0x00000000
-  0x507d25a4:     0x00000000      0x00000000      0x00000000      0x00000000
-  0x507d25b4:     0x00000000      0x00000000      0x00000000      0x00000000
-
-
+  have been pretty heavy stack overflow::
+
+
+    (gdb) x/100x $25
+    0x507d2434:     0x507d2434      0x00000000      0x08048000      0x080a4f8c
+    0x507d2444:     0x00000000      0x080a79e0      0x080a8c94      0x080d1000
+    0x507d2454:     0xaffffdb0      0xaffffe63      0xaffffe7a      0xaffffe7a
+    0x507d2464:     0xafffffec      0x00000062      0x0000008a      0x00000000
+    0x507d2474:     0x00000000      0x00000000      0x00000000      0x00000000
+    0x507d2484:     0x00000000      0x00000000      0x00000000      0x00000000
+    0x507d2494:     0x00000000      0x00000000      0x507d2fe0      0x00000000
+    0x507d24a4:     0x00000000      0x00000000      0x00000000      0x00000000
+    0x507d24b4:     0x00000000      0x00000000      0x00000000      0x00000000
+    0x507d24c4:     0x00000000      0x00000000      0x00000000      0x00000000
+    0x507d24d4:     0x00000000      0x00000000      0x00000000      0x00000000
+    0x507d24e4:     0x00000000      0x00000000      0x00000000      0x00000000
+    0x507d24f4:     0x00000000      0x00000000      0x00000000      0x00000000
+    0x507d2504:     0x00000000      0x00000000      0x00000000      0x00000000
+    0x507d2514:     0x00000000      0x00000000      0x00000000      0x00000000
+    0x507d2524:     0x00000000      0x00000000      0x00000000      0x00000000
+    0x507d2534:     0x00000000      0x00000000      0x507d25dc      0x00000000
+    0x507d2544:     0x00000000      0x00000000      0x00000000      0x00000000
+    0x507d2554:     0x00000000      0x00000000      0x00000000      0x00000000
+    0x507d2564:     0x00000000      0x00000000      0x00000000      0x00000000
+    0x507d2574:     0x00000000      0x00000000      0x00000000      0x00000000
+    0x507d2584:     0x00000000      0x00000000      0x00000000      0x00000000
+    0x507d2594:     0x00000000      0x00000000      0x00000000      0x00000000
+    0x507d25a4:     0x00000000      0x00000000      0x00000000      0x00000000
+    0x507d25b4:     0x00000000      0x00000000      0x00000000      0x00000000
  
  
  
@@ -3399,65 +3242,53 @@
    on will be somewhat clearer.
  
  
-  12.2.  Episode 2: The case of the hung fsck
+12.2.  Episode 2: The case of the hung fsck
+-------------------------------------------
  
    After setting a trap in the SEGV handler for accesses to the signal
    thread's stack, I reran the kernel.
  
  
-  fsck hung again, this time by hitting the trap:
-
-
+  fsck hung again, this time by hitting the trap::
  
  
  
+    Setting hostname uml                            [ OK ]
+    Checking root filesystem
+    /dev/fhd0 contains a file system with errors, check forced.
+    Error reading block 86894 (Attempt to read block from filesystem resulted in short read) while reading indirect blocks of inode 19780.
  
+    /dev/fhd0: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
+           (i.e., without -a or -p options)
+    [ FAILED ]
  
+    *** An error occurred during the file system check.
+    *** Dropping you to a shell; the system will reboot
+    *** when you leave the shell.
+    Give root password for maintenance
+    (or type Control-D for normal startup):
  
+    [root@uml /root]# fsck -y /dev/fhd0
+    fsck -y /dev/fhd0
+    Parallelizing fsck version 1.14 (9-Jan-1999)
+    e2fsck 1.14, 9-Jan-1999 for EXT2 FS 0.5b, 95/08/09
+    /dev/fhd0 contains a file system with errors, check forced.
+    Pass 1: Checking inodes, blocks, and sizes
+    Error reading block 86894 (Attempt to read block from filesystem resulted in short read) while reading indirect blocks of inode 19780.  Ignore error? yes
  
+    Pass 2: Checking directory structure
+    Error reading block 49405 (Attempt to read block from filesystem resulted in short read).  Ignore error? yes
  
+    Directory inode 11858, block 0, offset 0: directory corrupted
+    Salvage? yes
  
+    Missing '.' in directory inode 11858.
+    Fix? yes
  
+    Missing '..' in directory inode 11858.
+    Fix? yes
  
-
-
-
-  Setting hostname uml                            [ OK ]
-  Checking root filesystem
-  /dev/fhd0 contains a file system with errors, check forced.
-  Error reading block 86894 (Attempt to read block from filesystem resulted in short read) while reading indirect blocks of inode 19780.
-
-  /dev/fhd0: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
-          (i.e., without -a or -p options)
-  [ FAILED ]
-
-  *** An error occurred during the file system check.
-  *** Dropping you to a shell; the system will reboot
-  *** when you leave the shell.
-  Give root password for maintenance
-  (or type Control-D for normal startup):
-
-  [root@uml /root]# fsck -y /dev/fhd0
-  fsck -y /dev/fhd0
-  Parallelizing fsck version 1.14 (9-Jan-1999)
-  e2fsck 1.14, 9-Jan-1999 for EXT2 FS 0.5b, 95/08/09
-  /dev/fhd0 contains a file system with errors, check forced.
-  Pass 1: Checking inodes, blocks, and sizes
-  Error reading block 86894 (Attempt to read block from filesystem resulted in short read) while reading indirect blocks of inode 19780.  Ignore error? yes
-
-  Pass 2: Checking directory structure
-  Error reading block 49405 (Attempt to read block from filesystem resulted in short read).  Ignore error? yes
-
-  Directory inode 11858, block 0, offset 0: directory corrupted
-  Salvage? yes
-
-  Missing '.' in directory inode 11858.
-  Fix? yes
-
-  Missing '..' in directory inode 11858.
-  Fix? yes
-
-  Untested (4127) [100fe44c]: trap_kern.c line 31
+    Untested (4127) [100fe44c]: trap_kern.c line 31
  
  
  
@@ -3465,7 +3296,7 @@
  
    I need to get the signal thread to detach from pid 4127 so that I can
    attach to it with gdb.  This is done by sending it a SIGUSR1, which is
-  caught by the signal thread, which detaches the process:
+  caught by the signal thread, which detaches the process::
  
  
         kill -USR1 4127
@@ -3474,31 +3305,20 @@
  
  
  
-  Now I can run gdb on it:
-
-
-
-
-
-
-
+  Now I can run gdb on it::
  
  
-
-
-
-
-  ~/linux/2.3.26/um 1034: gdb linux
-  GNU gdb 4.17.0.11 with Linux support
-  Copyright 1998 Free Software Foundation, Inc.
-  GDB is free software, covered by the GNU General Public License, and you are
-  welcome to change it and/or distribute copies of it under certain conditions.
-  Type "show copying" to see the conditions.
-  There is absolutely no warranty for GDB.  Type "show warranty" for details.
-  This GDB was configured as "i386-redhat-linux"...
-  (gdb) att 4127
-  Attaching to program `/home/dike/linux/2.3.26/um/linux', Pid 4127
-  0x10075891 in __libc_nanosleep ()
+    ~/linux/2.3.26/um 1034: gdb linux
+    GNU gdb 4.17.0.11 with Linux support
+    Copyright 1998 Free Software Foundation, Inc.
+    GDB is free software, covered by the GNU General Public License, and you are
+    welcome to change it and/or distribute copies of it under certain conditions.
+    Type "show copying" to see the conditions.
+    There is absolutely no warranty for GDB.  Type "show warranty" for details.
+    This GDB was configured as "i386-redhat-linux"...
+    (gdb) att 4127
+    Attaching to program `/home/dike/linux/2.3.26/um/linux', Pid 4127
+    0x10075891 in __libc_nanosleep ()
  
  
  
@@ -3506,7 +3326,7 @@
  
    The backtrace shows that it was in a write and that the fault address
    (address in frame 3) is 0x50000800, which is right in the middle of
-  the signal thread's stack page:
+  the signal thread's stack page::
  
  
         (gdb) bt
@@ -3540,58 +3360,48 @@
  
  
  
-
-
    Going up the stack to the segv_handler frame and looking at where in
    the code the access happened shows that it happened near line 110 of
-  block_dev.c:
-
-
-
-
-
-
-
-
-
-  (gdb) up
-  #1  0x1007584d in __sleep (seconds=1000000)
-      at ../sysdeps/unix/sysv/linux/sleep.c:78
-  ../sysdeps/unix/sysv/linux/sleep.c:78: No such file or directory.
-  (gdb)
-  #2  0x1006ce9a in stop () at user_util.c:191
-  191       while(1) sleep(1000000);
-  (gdb)
-  #3  0x1006bf88 in segv (address=1342179328, is_write=2) at trap_kern.c:31
-  31          KERN_UNTESTED();
-  (gdb)
-  #4  0x1006c628 in segv_handler (sc=0x5006eaf8) at trap_user.c:174
-  174       segv(sc->cr2, sc->err & 2);
-  (gdb) p *sc
-  $1 = {gs = 0, __gsh = 0, fs = 0, __fsh = 0, es = 43, __esh = 0, ds = 43,
-    __dsh = 0, edi = 1342179328, esi = 134973440, ebp = 1342631484,
-    esp = 1342630864, ebx = 256, edx = 0, ecx = 256, eax = 1024, trapno = 14,
-    err = 6, eip = 268550834, cs = 35, __csh = 0, eflags = 66070,
-    esp_at_signal = 1342630864, ss = 43, __ssh = 0, fpstate = 0x0, oldmask = 0,
-    cr2 = 1342179328}
-  (gdb) p (void *)268550834
-  $2 = (void *) 0x1001c2b2
-  (gdb) i sym $2
-  block_write + 1090 in section .text
-  (gdb) i line *$2
-  Line 209 of "/home/dike/linux/2.3.26/um/include/asm/arch/string.h"
-     starts at address 0x1001c2a1 <block_write+1073>
-     and ends at 0x1001c2bf <block_write+1103>.
-  (gdb) i line *0x1001c2c0
-  Line 110 of "block_dev.c" starts at address 0x1001c2bf <block_write+1103>
-     and ends at 0x1001c2e3 <block_write+1139>.
-
-
+  block_dev.c::
+
+
+
+    (gdb) up
+    #1  0x1007584d in __sleep (seconds=1000000)
+       at ../sysdeps/unix/sysv/linux/sleep.c:78
+    ../sysdeps/unix/sysv/linux/sleep.c:78: No such file or directory.
+    (gdb)
+    #2  0x1006ce9a in stop () at user_util.c:191
+    191       while(1) sleep(1000000);
+    (gdb)
+    #3  0x1006bf88 in segv (address=1342179328, is_write=2) at trap_kern.c:31
+    31          KERN_UNTESTED();
+    (gdb)
+    #4  0x1006c628 in segv_handler (sc=0x5006eaf8) at trap_user.c:174
+    174       segv(sc->cr2, sc->err & 2);
+    (gdb) p *sc
+    $1 = {gs = 0, __gsh = 0, fs = 0, __fsh = 0, es = 43, __esh = 0, ds = 43,
+       __dsh = 0, edi = 1342179328, esi = 134973440, ebp = 1342631484,
+       esp = 1342630864, ebx = 256, edx = 0, ecx = 256, eax = 1024, trapno = 14,
+       err = 6, eip = 268550834, cs = 35, __csh = 0, eflags = 66070,
+       esp_at_signal = 1342630864, ss = 43, __ssh = 0, fpstate = 0x0, oldmask = 0,
+       cr2 = 1342179328}
+    (gdb) p (void *)268550834
+    $2 = (void *) 0x1001c2b2
+    (gdb) i sym $2
+    block_write + 1090 in section .text
+    (gdb) i line *$2
+    Line 209 of "/home/dike/linux/2.3.26/um/include/asm/arch/string.h"
+       starts at address 0x1001c2a1 <block_write+1073>
+       and ends at 0x1001c2bf <block_write+1103>.
+    (gdb) i line *0x1001c2c0
+    Line 110 of "block_dev.c" starts at address 0x1001c2bf <block_write+1103>
+       and ends at 0x1001c2e3 <block_write+1139>.
  
  
  
    Looking at the source shows that the fault happened during a call to
-  copy_from_user to copy the data into the kernel:
+  copy_from_user to copy the data into the kernel::
  
  
         107             count -= chars;
@@ -3601,10 +3411,8 @@
  
  
  
-
-
    p is the pointer which must contain 0x50000800, since buf contains
-  0x80b8800 (frame 8 above).  It is defined as:
+  0x80b8800 (frame 8 above).  It is defined as::
  
  
                         p = offset + bh->b_data;
@@ -3615,24 +3423,22 @@
  
    I need to figure out what bh is, and it just so happens that bh is
    passed as an argument to mark_buffer_uptodate and mark_buffer_dirty a
-  few lines later, so I do a little disassembly:
-
-
+  few lines later, so I do a little disassembly::
  
  
-  (gdb) disas 0x1001c2bf 0x1001c2e0
-  Dump of assembler code from 0x1001c2bf to 0x1001c2d0:
-  0x1001c2bf <block_write+1103>:  addl   %eax,0xc(%ebp)
-  0x1001c2c2 <block_write+1106>:  movl   0xfffffdd4(%ebp),%edx
-  0x1001c2c8 <block_write+1112>:  btsl   $0x0,0x18(%edx)
-  0x1001c2cd <block_write+1117>:  btsl   $0x1,0x18(%edx)
-  0x1001c2d2 <block_write+1122>:  sbbl   %ecx,%ecx
-  0x1001c2d4 <block_write+1124>:  testl  %ecx,%ecx
-  0x1001c2d6 <block_write+1126>:  jne    0x1001c2e3 <block_write+1139>
-  0x1001c2d8 <block_write+1128>:  pushl  $0x0
-  0x1001c2da <block_write+1130>:  pushl  %edx
-  0x1001c2db <block_write+1131>:  call   0x1001819c <__mark_buffer_dirty>
-  End of assembler dump.
+    (gdb) disas 0x1001c2bf 0x1001c2e0
+    Dump of assembler code from 0x1001c2bf to 0x1001c2d0:
+    0x1001c2bf <block_write+1103>:  addl   %eax,0xc(%ebp)
+    0x1001c2c2 <block_write+1106>:  movl   0xfffffdd4(%ebp),%edx
+    0x1001c2c8 <block_write+1112>:  btsl   $0x0,0x18(%edx)
+    0x1001c2cd <block_write+1117>:  btsl   $0x1,0x18(%edx)
+    0x1001c2d2 <block_write+1122>:  sbbl   %ecx,%ecx
+    0x1001c2d4 <block_write+1124>:  testl  %ecx,%ecx
+    0x1001c2d6 <block_write+1126>:  jne    0x1001c2e3 <block_write+1139>
+    0x1001c2d8 <block_write+1128>:  pushl  $0x0
+    0x1001c2da <block_write+1130>:  pushl  %edx
+    0x1001c2db <block_write+1131>:  call   0x1001819c <__mark_buffer_dirty>
+    End of assembler dump.
  
  
  
@@ -3640,7 +3446,7 @@
  
    At that point, bh is in %edx (address 0x1001c2da), which is calculated
    at 0x1001c2c2 as %ebp + 0xfffffdd4, so I figure exactly what that is,
-  taking %ebp from the sigcontext_struct above:
+  taking %ebp from the sigcontext_struct above::
  
  
         (gdb) p (void *)1342631484
@@ -3657,7 +3463,7 @@
  
  
    Now, I look at the structure to see what's in it, and particularly,
-  what its b_data field contains:
+  what its b_data field contains::
  
  
         (gdb) p *((struct buffer_head *)0x50100200)
@@ -3682,18 +3488,18 @@
  
    The b_page field is a pointer to the page_struct representing the
    0x50000000 page.  Looking at it shows the kernel's idea of the state
-  of that page:
+  of that page::
  
  
  
-  (gdb) p *$13.b_page
-  $17 = {list = {next = 0x50004a5c, prev = 0x100c5174}, mapping = 0x0,
-    index = 0, next_hash = 0x0, count = {counter = 1}, flags = 132, lru = {
-      next = 0x50008460, prev = 0x50019350}, wait = {
-      lock = <optimized out or zero length>, task_list = {next = 0x50004024,
-        prev = 0x50004024}, __magic = 1342193708, __creator = 0},
-    pprev_hash = 0x0, buffers = 0x501002c0, virtual = 1342177280,
-    zone = 0x100c5160}
+    (gdb) p *$13.b_page
+    $17 = {list = {next = 0x50004a5c, prev = 0x100c5174}, mapping = 0x0,
+       index = 0, next_hash = 0x0, count = {counter = 1}, flags = 132, lru = {
+       next = 0x50008460, prev = 0x50019350}, wait = {
+       lock = <optimized out or zero length>, task_list = {next = 0x50004024,
+           prev = 0x50004024}, __magic = 1342193708, __creator = 0},
+       pprev_hash = 0x0, buffers = 0x501002c0, virtual = 1342177280,
+       zone = 0x100c5160}
  
  
  
@@ -3702,7 +3508,7 @@
    Some sanity-checking: the virtual field shows the "virtual" address of
    this page, which in this kernel is the same as its "physical" address,
    and the page_struct itself should be mem_map[0], since it represents
-  the first page of memory:
+  the first page of memory::
  
  
  
@@ -3719,7 +3525,7 @@
  
  
    Now to check out the page_struct itself.  In particular, the flags
-  field shows whether the page is considered free or not:
+  field shows whether the page is considered free or not::
  
  
         (gdb) p (void *)132
@@ -3739,7 +3545,7 @@
  
  
    In my setup_arch procedure, I have the following code which looks just
-  fine:
+  fine::
  
  
  
@@ -3762,7 +3568,7 @@
  
  
    Stepping into init_bootmem, and looking at bootmem_map before looking
-  at what it contains shows the following:
+  at what it contains shows the following::
  
  
  
@@ -3788,18 +3594,20 @@
  
  
  
-  13.  What to do when UML doesn't work
+13.  What to do when UML doesn't work
+=====================================
  
  
  
  
-  13.1.  Strange compilation errors when you build from source
+13.1.  Strange compilation errors when you build from source
+------------------------------------------------------------
  
    As of test11, it is necessary to have "ARCH=um" in the environment or
    on the make command line for all steps in building UML, including
    clean, distclean, or mrproper, config, menuconfig, or xconfig, dep,
    and linux.  If you forget for any of them, the i386 build seems to
-  contaminate the UML build.  If this happens, start from scratch with
+  contaminate the UML build.  If this happens, start from scratch with::
  
  
         host%
@@ -3811,7 +3619,7 @@
    and repeat the build process with ARCH=um on all the steps.
  
  
-  See ``Compiling the kernel and modules''  for more details.
+  See :ref:`Compiling_the_kernel_and_modules`  for more details.
  
  
    Another cause of strange compilation errors is building UML in
@@ -3824,11 +3632,11 @@
  
  
  
-  13.3.  A variety of panics and hangs with /tmp on a reiserfs  filesys-
-  tem
+13.3.  A variety of panics and hangs with /tmp on a reiserfs filesystem
+-----------------------------------------------------------------------
  
    I saw this on reiserfs 3.5.21 and it seems to be fixed in 3.5.27.
-  Panics preceded by
+  Panics preceded by::
  
  
         Detaching pid nnnn
@@ -3854,17 +3662,19 @@
  
  
  
-  13.5.  UML doesn't work when /tmp is an NFS filesystem
+13.5.  UML doesn't work when /tmp is an NFS filesystem
+------------------------------------------------------
  
    This seems to be a similar situation with the ReiserFS problem above.
    Some versions of NFS seems not to handle mmap correctly, which UML
    depends on.  The workaround is have /tmp be a non-NFS directory.
  
  
-  13.6.  UML hangs on boot when compiled with gprof support
+13.6.  UML hangs on boot when compiled with gprof support
+---------------------------------------------------------
  
    If you build UML with gprof support and, early in the boot, it does
-  this
+  this::
  
  
         kernel BUG at page_alloc.c:100!
@@ -3878,10 +3688,11 @@
  
  
  
-  13.7.  syslogd dies with a SIGTERM on startup
+13.7.  syslogd dies with a SIGTERM on startup
+---------------------------------------------
  
    The exact boot error depends on the distribution that you're booting,
-  but Debian produces this:
+  but Debian produces this::
  
  
         /etc/rc2.d/S10sysklogd: line 49:    93 Terminated
@@ -3891,23 +3702,21 @@
  
  
    This is a syslogd bug.  There's a race between a parent process
-  installing a signal handler and its child sending the signal.  See
-  this uml-devel post <http://www.geocrawler.com/lists/3/Source-
-  Forge/709/0/6612801>  for the details.
+  installing a signal handler and its child sending the signal.
  
  
  
-  13.8.  TUN/TAP networking doesn't work on a 2.4 host
+13.8.  TUN/TAP networking doesn't work on a 2.4 host
+----------------------------------------------------
  
-  There are a couple of problems which were
-  <http://www.geocrawler.com/lists/3/SourceForge/597/0/> name="pointed
-  out">  by Tim Robinson <timro at trkr dot net>
+  There are a couple of problems which were reported by
+  Tim Robinson <timro at trkr dot net>
  
-  o  It doesn't work on hosts running 2.4.7 (or thereabouts) or earlier.
+  -  It doesn't work on hosts running 2.4.7 (or thereabouts) or earlier.
       The fix is to upgrade to something more recent and then read the
       next item.
  
-  o  If you see
+  -  If you see::
  
  
         File descriptor in bad state
@@ -3921,8 +3730,8 @@
  
  
  
-  13.9.  You can network to the host but not to other machines on the
-  net
+13.9.  You can network to the host but not to other machines on the net
+-----------------------------------------------------------------------
  
    If you can connect to the host, and the host can connect to UML, but
    you cannot connect to any other machines, then you may need to enable
@@ -3930,7 +3739,7 @@
    using private IP addresses (192.168.x.x or 10.x.x.x) for host/UML
    networking, rather than the public address space that your host is
    connected to.  UML does not enable IP Masquerading, so you will need
-  to create a static rule to enable it:
+  to create a static rule to enable it::
  
  
         host%
@@ -3944,11 +3753,11 @@
  
  
    Documentation on IP Masquerading, and SNAT, can be found at
-  www.netfilter.org  <http://www.netfilter.org> .
+  http://www.netfilter.org.
  
  
    If you can reach the local net, but not the outside Internet, then
-  that is usually a routing problem.  The UML needs a default route:
+  that is usually a routing problem.  The UML needs a default route::
  
  
         UML#
@@ -3972,7 +3781,8 @@
  
  
  
-  13.10.  I have no root and I want to scream
+13.10.  I have no root and I want to scream
+-------------------------------------------
  
    Thanks to Birgit Wahlich for telling me about this strange one.  It
    turns out that there's a limit of six environment variables on the
@@ -3987,14 +3797,16 @@
  
  
  
-  13.11.  UML build conflict between ptrace.h and ucontext.h
+13.11.  UML build conflict between ptrace.h and ucontext.h
+----------------------------------------------------------
  
    On some older systems, /usr/include/asm/ptrace.h and
    /usr/include/sys/ucontext.h define the same names.  So, when they're
    included together, the defines from one completely mess up the parsing
-  of the other, producing errors like:
+  of the other, producing errors like::
+
         /usr/include/sys/ucontext.h:47: parse error before
-       `10'
+       `10`
  
  
  
@@ -4007,7 +3819,8 @@
  
  
  
-  13.12.  The UML BogoMips is exactly half the host's BogoMips
+13.12.  The UML BogoMips is exactly half the host's BogoMips
+------------------------------------------------------------
  
    On i386 kernels, there are two ways of running the loop that is used
    to calculate the BogoMips rating, using the TSC if it's there or using
@@ -4019,15 +3832,17 @@
  
  
  
-  13.13.  When you run UML, it immediately segfaults
+13.13.  When you run UML, it immediately segfaults
+--------------------------------------------------
  
    If the host is configured with the 2G/2G address space split, that's
-  why.  See ``UML on 2G/2G hosts''  for the details on getting UML to
+  why.  See :ref:`UML_on_2G/2G_hosts`  for the details on getting UML to
    run on your host.
  
  
  
-  13.14.  xterms appear, then immediately disappear
+13.14.  xterms appear, then immediately disappear
+-------------------------------------------------
  
    If you're running an up to date kernel with an old release of
    uml_utilities, the port-helper program will not work properly, so
@@ -4039,7 +3854,8 @@
  
  
  
-  13.15.  Any other panic, hang, or strange behavior
+13.15.  Any other panic, hang, or strange behavior
+--------------------------------------------------
  
    If you're seeing truly strange behavior, such as hangs or panics that
    happen in random places, or you try running the debugger to see what's
@@ -4057,9 +3873,13 @@
    it and that a fix is imminent.
  
  
-  If you want to be super-helpful, read ``Diagnosing Problems'' and
+  If you want to be super-helpful, read :ref:`Diagnosing_Problems` and
    follow the instructions contained therein.
-  14.  Diagnosing Problems
+
+.. _Diagnosing_Problems:
+
+14.  Diagnosing Problems
+========================
  
  
    If you get UML to crash, hang, or otherwise misbehave, you should
@@ -4074,21 +3894,22 @@
  
    For any diagnosis, you're going to need to build a debugging kernel.
    The binaries from this site aren't debuggable.  If you haven't done
-  this before, read about ``Compiling the kernel and modules''  and
-  ``Kernel debugging''  UML first.
+  this before, read about :ref:`Compiling_the_kernel_and_modules`  and
+  :ref:`Kernel_debugging` UML first.
  
  
-  14.1.  Case 1 : Normal kernel panics
+14.1.  Case 1 : Normal kernel panics
+------------------------------------
  
    The most common case is for a normal thread to panic.  To debug this,
    you will need to run it under the debugger (add 'debug' to the command
    line).  An xterm will start up with gdb running inside it.  Continue
-  it when it stops in start_kernel and make it crash.  Now ^C gdb and
+  it when it stops in start_kernel and make it crash.  Now ``^C`` gdb and
  
  
    If the panic was a "Kernel mode fault", then there will be a segv
    frame on the stack and I'm going to want some more information.  The
-  stack might look something like this:
+  stack might look something like this::
  
  
         (UML gdb)  backtrace
@@ -4107,7 +3928,7 @@
  
  
    I'm going to want to see the symbol and line information for the value
-  of ip in the segv frame.  In this case, you would do the following:
+  of ip in the segv frame.  In this case, you would do the following::
  
  
         (UML gdb)  i sym 268849158
@@ -4115,7 +3936,7 @@
  
  
  
-  and
+  and::
  
  
         (UML gdb)  i line *268849158
@@ -4128,7 +3949,8 @@
    to get that information from the faulting ip.
  
  
-  14.2.  Case 2 : Tracing thread panics
+14.2.  Case 2 : Tracing thread panics
+-------------------------------------
  
    The less common and more painful case is when the tracing thread
    panics.  In this case, the kernel debugger will be useless because it
@@ -4136,7 +3958,7 @@
    do is get a backtrace from the tracing thread.  This is done by
    figuring out what its pid is, firing up gdb, and attaching it to that
    pid.  You can figure out the tracing thread pid by looking at the
-  first line of the console output, which will look like this:
+  first line of the console output, which will look like this::
  
  
         tracing thread pid = 15851
@@ -4145,7 +3967,7 @@
  
  
    or by running ps on the host and finding the line that looks like
-  this:
+  this::
  
  
         jdike 15851 4.5 0.4 132568 1104 pts/0 S 21:34 0:05 ./linux [(tracing thread)]
@@ -4164,7 +3986,7 @@
    14.3.  Case 3 : Tracing thread panics caused by other threads
  
    However, there are cases where the misbehavior of another thread
-  caused the problem.  The most common panic of this type is:
+  caused the problem.  The most common panic of this type is::
  
  
         wait_for_stop failed to wait for  <pid>  to stop with  <signal number>
@@ -4177,7 +3999,7 @@
    debugger is defunct and without some fancy footwork, another gdb can't
    attach to it.  So, this is how the fancy footwork goes:
  
-  In a shell:
+  In a shell::
  
  
         host% kill -STOP pid
@@ -4185,7 +4007,7 @@
  
  
  
-  Run gdb on the tracing thread as described in case 2 and do:
+  Run gdb on the tracing thread as described in case 2 and do::
  
  
         (host gdb)  call detach(pid)
@@ -4193,7 +4015,7 @@
  
    If you get a segfault, do it again.  It always works the second time.
  
-  Detach from the tracing thread and attach to that other thread:
+  Detach from the tracing thread and attach to that other thread::
  
  
         (host gdb)  detach
@@ -4209,7 +4031,7 @@
  
  
    If gdb hangs when attaching to that process, go back to a shell and
-  do:
+  do::
  
  
         host%
@@ -4218,7 +4040,7 @@
  
  
  
-  And then get the backtrace:
+  And then get the backtrace::
  
  
         (host gdb)  backtrace
@@ -4227,13 +4049,14 @@
  
  
  
-  14.4.  Case 4 : Hangs
+14.4.  Case 4 : Hangs
+---------------------
  
    Hangs seem to be fairly rare, but they sometimes happen.  When a hang
    happens, we need a backtrace from the offending process.  Run the
    kernel debugger as described in case 1 and get a backtrace.  If the
    current process is not the idle thread, then send in the backtrace.
-  You can tell that it's the idle thread if the stack looks like this:
+  You can tell that it's the idle thread if the stack looks like this::
  
  
         #0  0x100b1401 in __libc_nanosleep ()
@@ -4257,7 +4080,8 @@
  
  
  
-  15.  Thanks
+15.  Thanks
+===========
  
  
    A number of people have helped this project in various ways, and this
@@ -4274,20 +4098,21 @@
    bookkeeping lapses and I forget about contributions.
  
  
-  15.1.  Code and Documentation
+15.1.  Code and Documentation
+-----------------------------
  
    Rusty Russell <rusty at linuxcare.com.au>  -
  
-  o  wrote the  HOWTO <http://user-mode-
-     linux.sourceforge.net/UserModeLinux-HOWTO.html>
+  -  wrote the  HOWTO
+     http://user-mode-linux.sourceforge.net/old/UserModeLinux-HOWTO.html
  
-  o  prodded me into making this project official and putting it on
+  -  prodded me into making this project official and putting it on
       SourceForge
  
-  o  came up with the way cool UML logo <http://user-mode-
-     linux.sourceforge.net/uml-small.png>
+  -  came up with the way cool UML logo
+     http://user-mode-linux.sourceforge.net/uml-small.png
  
-  o  redid the config process
+  -  redid the config process
  
  
    Peter Moulder <reiter at netspace.net.au>  - Fixed my config and build
@@ -4296,34 +4121,32 @@
  
    Bill Stearns <wstearns at pobox.com>  -
  
-  o  HOWTO updates
+  -  HOWTO updates
  
-  o  lots of bug reports
+  -  lots of bug reports
  
-  o  lots of testing
+  -  lots of testing
  
-  o  dedicated a box (uml.ists.dartmouth.edu) to support UML development
+  -  dedicated a box (uml.ists.dartmouth.edu) to support UML development
  
-  o  wrote the mkrootfs script, which allows bootable filesystems of
+  -  wrote the mkrootfs script, which allows bootable filesystems of
       RPM-based distributions to be cranked out
  
-  o  cranked out a large number of filesystems with said script
+  -  cranked out a large number of filesystems with said script
  
  
    Jim Leu <jleu at mindspring.com>  - Wrote the virtual ethernet driver
    and associated usermode tools
  
-  Lars Brinkhoff <http://lars.nocrew.org/>  - Contributed the ptrace
-  proxy from his own  project <http://a386.nocrew.org/> to allow easier
-  kernel debugging
+  Lars Brinkhoff http://lars.nocrew.org/  - Contributed the ptrace
+  proxy from his own  project to allow easier kernel debugging
  
  
    Andrea Arcangeli <andrea at suse.de>  - Redid some of the early boot
    code so that it would work on machines with Large File Support
  
  
-  Chris Emerson <http://www.chiark.greenend.org.uk/~cemerson/>  - Did
-  the first UML port to Linux/ppc
+  Chris Emerson - Did the first UML port to Linux/ppc
  
  
    Harald Welte <laforge at gnumonks.org>  - Wrote the multicast
@@ -4338,7 +4161,7 @@
    wrote the iomem emulation support
  
  
-  Henrik Nordstrom <http://hem.passagen.se/hno/>  - Provided a variety
+  Henrik Nordstrom http://hem.passagen.se/hno/  - Provided a variety
    of patches, fixes, and clues
  
  
@@ -4373,190 +4196,193 @@
    submitted patches for the slip transport and lots of other things.
  
  
-  David Coulson <http://davidcoulson.net>  -
+  David Coulson http://davidcoulson.net  -
  
-  o  Set up the usermodelinux.org <http://usermodelinux.org>  site,
+  -  Set up the http://usermodelinux.org  site,
       which is a great way of keeping the UML user community on top of
       UML goings-on.
  
-  o  Site documentation and updates
+  -  Site documentation and updates
  
-  o  Nifty little UML management daemon  UMLd
-     <http://uml.openconsultancy.com/umld/>
+  -  Nifty little UML management daemon  UMLd
  
-  o  Lots of testing and bug reports
+  -  Lots of testing and bug reports
  
  
  
  
-  15.2.  Flushing out bugs
+15.2.  Flushing out bugs
+------------------------
  
  
  
-  o  Yuri Pudgorodsky
+  -  Yuri Pudgorodsky
  
-  o  Gerald Britton
+  -  Gerald Britton
  
-  o  Ian Wehrman
+  -  Ian Wehrman
  
-  o  Gord Lamb
+  -  Gord Lamb
  
-  o  Eugene Koontz
+  -  Eugene Koontz
  
-  o  John H. Hartman
+  -  John H. Hartman
  
-  o  Anders Karlsson
+  -  Anders Karlsson
  
-  o  Daniel Phillips
+  -  Daniel Phillips
  
-  o  John Fremlin
+  -  John Fremlin
  
-  o  Rainer Burgstaller
+  -  Rainer Burgstaller
  
-  o  James Stevenson
+  -  James Stevenson
  
-  o  Matt Clay
+  -  Matt Clay
  
-  o  Cliff Jefferies
+  -  Cliff Jefferies
  
-  o  Geoff Hoff
+  -  Geoff Hoff
  
-  o  Lennert Buytenhek
+  -  Lennert Buytenhek
  
-  o  Al Viro
+  -  Al Viro
  
-  o  Frank Klingenhoefer
+  -  Frank Klingenhoefer
  
-  o  Livio Baldini Soares
+  -  Livio Baldini Soares
  
-  o  Jon Burgess
+  -  Jon Burgess
  
-  o  Petru Paler
+  -  Petru Paler
  
-  o  Paul
+  -  Paul
  
-  o  Chris Reahard
+  -  Chris Reahard
  
-  o  Sverker Nilsson
+  -  Sverker Nilsson
  
-  o  Gong Su
+  -  Gong Su
  
-  o  johan verrept
+  -  johan verrept
  
-  o  Bjorn Eriksson
+  -  Bjorn Eriksson
  
-  o  Lorenzo Allegrucci
+  -  Lorenzo Allegrucci
  
-  o  Muli Ben-Yehuda
+  -  Muli Ben-Yehuda
  
-  o  David Mansfield
+  -  David Mansfield
  
-  o  Howard Goff
+  -  Howard Goff
  
-  o  Mike Anderson
+  -  Mike Anderson
  
-  o  John Byrne
+  -  John Byrne
  
-  o  Sapan J. Batia
+  -  Sapan J. Batia
  
-  o  Iris Huang
+  -  Iris Huang
  
-  o  Jan Hudec
+  -  Jan Hudec
  
-  o  Voluspa
+  -  Voluspa
  
  
  
  
-  15.3.  Buglets and clean-ups
+15.3.  Buglets and clean-ups
+----------------------------
  
  
  
-  o  Dave Zarzycki
+  -  Dave Zarzycki
  
-  o  Adam Lazur
+  -  Adam Lazur
  
-  o  Boria Feigin
+  -  Boria Feigin
  
-  o  Brian J. Murrell
+  -  Brian J. Murrell
  
-  o  JS
+  -  JS
  
-  o  Roman Zippel
+  -  Roman Zippel
  
-  o  Wil Cooley
+  -  Wil Cooley
  
-  o  Ayelet Shemesh
+  -  Ayelet Shemesh
  
-  o  Will Dyson
+  -  Will Dyson
  
-  o  Sverker Nilsson
+  -  Sverker Nilsson
  
-  o  dvorak
+  -  dvorak
  
-  o  v.naga srinivas
+  -  v.naga srinivas
  
-  o  Shlomi Fish
+  -  Shlomi Fish
  
-  o  Roger Binns
+  -  Roger Binns
  
-  o  johan verrept
+  -  johan verrept
  
-  o  MrChuoi
+  -  MrChuoi
  
-  o  Peter Cleve
+  -  Peter Cleve
  
-  o  Vincent Guffens
+  -  Vincent Guffens
  
-  o  Nathan Scott
+  -  Nathan Scott
  
-  o  Patrick Caulfield
+  -  Patrick Caulfield
  
-  o  jbearce
+  -  jbearce
  
-  o  Catalin Marinas
+  -  Catalin Marinas
  
-  o  Shane Spencer
+  -  Shane Spencer
  
-  o  Zou Min
+  -  Zou Min
  
  
-  o  Ryan Boder
+  -  Ryan Boder
  
-  o  Lorenzo Colitti
+  -  Lorenzo Colitti
  
-  o  Gwendal Grignou
+  -  Gwendal Grignou
  
-  o  Andre' Breiler
+  -  Andre' Breiler
  
-  o  Tsutomu Yasuda
+  -  Tsutomu Yasuda
  
  
  
-  15.4.  Case Studies
+15.4.  Case Studies
+-------------------
  
  
-  o  Jon Wright
+  -  Jon Wright
  
-  o  William McEwan
+  -  William McEwan
  
-  o  Michael Richardson
+  -  Michael Richardson
  
  
  
-  15.5.  Other contributions
+15.5.  Other contributions
+--------------------------
  
  
    Bill Carr <Bill.Carr at compaq.com>  made the Red Hat mkrootfs script
    work with RH 6.2.
  
    Michael Jennings <mikejen at hevanet.com>  sent in some material which
-  is now gracing the top of the  index  page <http://user-mode-
-  linux.sourceforge.net/>  of this site.
+  is now gracing the top of the  index  page
+  http://user-mode-linux.sourceforge.net/  of this site.
  
-  SGI <http://www.sgi.com>  (and more specifically Ralf Baechle <ralf at
-  uni-koblenz.de> ) gave me an account on oss.sgi.com
-  <http://www.oss.sgi.com> .  The bandwidth there made it possible to
+  SGI (and more specifically Ralf Baechle <ralf at
+  uni-koblenz.de> ) gave me an account on oss.sgi.com.
+  The bandwidth there made it possible to
    produce most of the filesystems available on the project download
    page.
  
@@ -4573,17 +4399,5 @@
  
    Chris Reahard built a specialized root filesystem for running a DNS
    server jailed inside UML.  It's available from the download
-  <http://user-mode-linux.sourceforge.net/dl-sf.html>  page in the Jail
+  http://user-mode-linux.sourceforge.net/old/dl-sf.html  page in the Jail
    Filesystems section.
-
-
-
-
-
-
-
-
-
-
-
-
diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index a8de2fbc1caad73319ee40b2c16a2ec83642ea7e..265d9e9a093b8d3810b2148b3f794dfa58c0b473 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -19,7 +19,6 @@ x86-specific Documentation
     tlb
     mtrr
     pat
-   intel_mpx
     intel-iommu
     intel_txt
     amd-memory-encryption
diff --git a/MAINTAINERS b/MAINTAINERS
index 38fe2f3f7b6f290e67168db75a1881cfbcc5be8b..6158a143a13e075c62621ed2cf247de645f4d0d0 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2796,11 +2796,11 @@ F:      drivers/block/aoe/
  
  ATHEROS 71XX/9XXX GPIO DRIVER
  M:     Alban Bedel <albeu@free.fr>
+S:     Maintained
  W:     https://github.com/AlbanBedel/linux
  T:     git git://github.com/AlbanBedel/linux
-S:     Maintained
-F:     drivers/gpio/gpio-ath79.c
  F:     Documentation/devicetree/bindings/gpio/gpio-ath79.txt
+F:     drivers/gpio/gpio-ath79.c
  
  ATHEROS 71XX/9XXX USB PHY DRIVER
  M:     Alban Bedel <albeu@free.fr>
@@ -3422,8 +3422,8 @@ BROADCOM BRCMSTB GPIO DRIVER
  M:     Gregory Fong <gregory.0xf0@gmail.com>
  L:     bcm-kernel-feedback-list@broadcom.com
  S:     Supported
-F:     drivers/gpio/gpio-brcmstb.c
  F:     Documentation/devicetree/bindings/gpio/brcm,brcmstb-gpio.txt
+F:     drivers/gpio/gpio-brcmstb.c
  
  BROADCOM BRCMSTB I2C DRIVER
  M:     Kamal Dasu <kdasu.kdev@gmail.com>
@@ -3481,8 +3481,8 @@ BROADCOM KONA GPIO DRIVER
  M:     Ray Jui <rjui@broadcom.com>
  L:     bcm-kernel-feedback-list@broadcom.com
  S:     Supported
-F:     drivers/gpio/gpio-bcm-kona.c
  F:     Documentation/devicetree/bindings/gpio/brcm,kona-gpio.txt
+F:     drivers/gpio/gpio-bcm-kona.c
  
  BROADCOM NETXTREME-E ROCE DRIVER
  M:     Selvin Xavier <selvin.xavier@broadcom.com>
@@ -3597,8 +3597,8 @@ F:        sound/pci/bt87x.c
  
  BT8XXGPIO DRIVER
  M:     Michael Buesch <m@bues.ch>
-W:     http://bu3sch.de/btgpio.php
  S:     Maintained
+W:     http://bu3sch.de/btgpio.php
  F:     drivers/gpio/gpio-bt8xx.c
  
  BTRFS FILE SYSTEM
@@ -3649,6 +3649,7 @@ F:        sound/pci/oxygen/
  
  C-SKY ARCHITECTURE
  M:     Guo Ren <guoren@kernel.org>
+L:     linux-csky@vger.kernel.org
  T:     git https://github.com/c-sky/csky-linux.git
  S:     Supported
  F:     arch/csky/
@@ -3909,7 +3910,7 @@ S:        Supported
  F:     Documentation/filesystems/ceph.txt
  F:     fs/ceph/
  
-CERTIFICATE HANDLING:
+CERTIFICATE HANDLING
  M:     David Howells <dhowells@redhat.com>
  M:     David Woodhouse <dwmw2@infradead.org>
  L:     keyrings@vger.kernel.org
@@ -3919,7 +3920,7 @@ F:        certs/
  F:     scripts/sign-file.c
  F:     scripts/extract-cert.c
  
-CERTIFIED WIRELESS USB (WUSB) SUBSYSTEM:
+CERTIFIED WIRELESS USB (WUSB) SUBSYSTEM
  L:     devel@driverdev.osuosl.org
  S:     Obsolete
  F:     drivers/staging/wusbcore/
@@ -5932,12 +5933,12 @@ S:      Maintained
  F:     drivers/media/dvb-frontends/ec100*
  
  ECRYPT FILE SYSTEM
-M:     Tyler Hicks <tyhicks@canonical.com>
+M:     Tyler Hicks <code@tyhicks.com>
  L:     ecryptfs@vger.kernel.org
  W:     http://ecryptfs.org
  W:     https://launchpad.net/ecryptfs
  T:     git git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs.git
-S:     Supported
+S:     Odd Fixes
  F:     Documentation/filesystems/ecryptfs.txt
  F:     fs/ecryptfs/
  
@@ -7047,7 +7048,7 @@ L:        kvm@vger.kernel.org
  S:     Supported
  F:     drivers/uio/uio_pci_generic.c
  
-GENERIC VDSO LIBRARY:
+GENERIC VDSO LIBRARY
  M:     Andy Lutomirski <luto@kernel.org>
  M:     Thomas Gleixner <tglx@linutronix.de>
  M:     Vincenzo Frascino <vincenzo.frascino@arm.com>
@@ -7143,18 +7144,18 @@ GPIO SUBSYSTEM
  M:     Linus Walleij <linus.walleij@linaro.org>
  M:     Bartosz Golaszewski <bgolaszewski@baylibre.com>
  L:     linux-gpio@vger.kernel.org
-T:     git git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio.git
  S:     Maintained
+T:     git git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio.git
+F:     Documentation/ABI/obsolete/sysfs-gpio
+F:     Documentation/ABI/testing/gpio-cdev
+F:     Documentation/admin-guide/gpio/
  F:     Documentation/devicetree/bindings/gpio/
  F:     Documentation/driver-api/gpio/
-F:     Documentation/admin-guide/gpio/
-F:     Documentation/ABI/testing/gpio-cdev
-F:     Documentation/ABI/obsolete/sysfs-gpio
  F:     drivers/gpio/
+F:     include/asm-generic/gpio.h
  F:     include/linux/gpio/
  F:     include/linux/gpio.h
  F:     include/linux/of_gpio.h
-F:     include/asm-generic/gpio.h
  F:     include/uapi/linux/gpio.h
  F:     tools/gpio/
  
@@ -8055,8 +8056,8 @@ F:        drivers/scsi/ips.*
  ICH LPC AND GPIO DRIVER
  M:     Peter Tyser <ptyser@xes-inc.com>
  S:     Maintained
-F:     drivers/mfd/lpc_ich.c
  F:     drivers/gpio/gpio-ich.c
+F:     drivers/mfd/lpc_ich.c
  
  ICY I2C DRIVER
  M:     Max Staudt <max@enpas.org>
@@ -8392,7 +8393,7 @@ M:        Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
  M:     Rodrigo Vivi <rodrigo.vivi@intel.com>
  L:     intel-gfx@lists.freedesktop.org
  W:     https://01.org/linuxgraphics/
-B:     https://01.org/linuxgraphics/documentation/how-report-bugs
+B:     https://gitlab.freedesktop.org/drm/intel/-/wikis/How-to-file-i915-bugs
  C:     irc://chat.freenode.net/intel-gfx
  Q:     http://patchwork.freedesktop.org/project/intel-gfx/
  T:     git git://anongit.freedesktop.org/drm-intel
@@ -9278,7 +9279,7 @@ F:        include/keys/trusted-type.h
  F:     security/keys/trusted.c
  F:     include/keys/trusted.h
  
-KEYS/KEYRINGS:
+KEYS/KEYRINGS
  M:     David Howells <dhowells@redhat.com>
  M:     Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
  L:     keyrings@vger.kernel.org
@@ -11114,14 +11115,12 @@ S:    Maintained
  F:     drivers/usb/image/microtek.*
  
  MIPS
-M:     Ralf Baechle <ralf@linux-mips.org>
-M:     Paul Burton <paulburton@kernel.org>
+M:     Thomas Bogendoerfer <tsbogend@alpha.franken.de>
  L:     linux-mips@vger.kernel.org
  W:     http://www.linux-mips.org/
-T:     git git://git.linux-mips.org/pub/scm/ralf/linux.git
  T:     git git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux.git
  Q:     http://patchwork.linux-mips.org/project/linux-mips/list/
-S:     Supported
+S:     Maintained
  F:     Documentation/devicetree/bindings/mips/
  F:     Documentation/mips/
  F:     arch/mips/
@@ -11484,7 +11483,7 @@ F:      drivers/scsi/mac_scsi.*
  F:     drivers/scsi/sun3_scsi.*
  F:     drivers/scsi/sun3_scsi_vme.c
  
-NCSI LIBRARY:
+NCSI LIBRARY
  M:     Samuel Mendoza-Jonas <sam@mendozajonas.com>
  S:     Maintained
  F:     net/ncsi/
@@ -12740,7 +12739,7 @@ M:      Tom Joseph <tjoseph@cadence.com>
  L:     linux-pci@vger.kernel.org
  S:     Maintained
  F:     Documentation/devicetree/bindings/pci/cdns,*.txt
-F:     drivers/pci/controller/pcie-cadence*
+F:     drivers/pci/controller/cadence/
  
  PCI DRIVER FOR FREESCALE LAYERSCAPE
  M:     Minghuan Lian <minghuan.Lian@nxp.com>
@@ -13512,7 +13511,7 @@ L:      linuxppc-dev@lists.ozlabs.org
  S:     Maintained
  F:     drivers/block/ps3vram.c
  
-PSAMPLE PACKET SAMPLING SUPPORT:
+PSAMPLE PACKET SAMPLING SUPPORT
  M:     Yotam Gigi <yotam.gi@gmail.com>
  S:     Maintained
  F:     net/psample
@@ -14582,10 +14581,10 @@ F:    drivers/media/pci/saa7146/
  F:     include/media/drv-intf/saa7146*
  
  SAFESETID SECURITY MODULE
-M:     Micah Morton <mortonm@chromium.org>
-S:     Supported
-F:     security/safesetid/
-F:     Documentation/admin-guide/LSM/SafeSetID.rst
+M:     Micah Morton <mortonm@chromium.org>
+S:     Supported
+F:     security/safesetid/
+F:     Documentation/admin-guide/LSM/SafeSetID.rst
  
  SAMSUNG AUDIO (ASoC) DRIVERS
  M:     Krzysztof Kozlowski <krzk@kernel.org>
@@ -16075,8 +16074,8 @@ F:      Documentation/devicetree/bindings/reset/snps,axs10x-reset.txt
  SYNOPSYS CREG GPIO DRIVER
  M:     Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
  S:     Maintained
-F:     drivers/gpio/gpio-creg-snps.c
  F:     Documentation/devicetree/bindings/gpio/snps,creg-gpio.txt
+F:     drivers/gpio/gpio-creg-snps.c
  
  SYNOPSYS DESIGNWARE 8250 UART DRIVER
  R:     Andy Shevchenko <andriy.shevchenko@linux.intel.com>
@@ -16087,8 +16086,8 @@ SYNOPSYS DESIGNWARE APB GPIO DRIVER
  M:     Hoan Tran <hoan@os.amperecomputing.com>
  L:     linux-gpio@vger.kernel.org
  S:     Maintained
-F:     drivers/gpio/gpio-dwapb.c
  F:     Documentation/devicetree/bindings/gpio/snps-dwapb-gpio.txt
+F:     drivers/gpio/gpio-dwapb.c
  
  SYNOPSYS DESIGNWARE AXI DMAC DRIVER
  M:     Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
@@ -16552,8 +16551,8 @@ M:      Michael Jamet <michael.jamet@intel.com>
  M:     Mika Westerberg <mika.westerberg@linux.intel.com>
  M:     Yehezkel Bernat <YehezkelShB@gmail.com>
  L:     linux-usb@vger.kernel.org
-T:     git git://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt.git
  S:     Maintained
+T:     git git://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt.git
  F:     Documentation/admin-guide/thunderbolt.rst
  F:     drivers/thunderbolt/
  F:     include/linux/thunderbolt.h
@@ -17080,7 +17079,7 @@ S:      Maintained
  F:     Documentation/admin-guide/ufs.rst
  F:     fs/ufs/
  
-UHID USERSPACE HID IO DRIVER:
+UHID USERSPACE HID IO DRIVER
  M:     David Herrmann <dh.herrmann@googlemail.com>
  L:     linux-input@vger.kernel.org
  S:     Maintained
@@ -17094,18 +17093,18 @@ S:    Maintained
  F:     drivers/usb/common/ulpi.c
  F:     include/linux/ulpi/
  
-ULTRA-WIDEBAND (UWB) SUBSYSTEM:
+ULTRA-WIDEBAND (UWB) SUBSYSTEM
  L:     devel@driverdev.osuosl.org
  S:     Obsolete
  F:     drivers/staging/uwb/
  
-UNICODE SUBSYSTEM:
+UNICODE SUBSYSTEM
  M:     Gabriel Krisman Bertazi <krisman@collabora.com>
  L:     linux-fsdevel@vger.kernel.org
  S:     Supported
  F:     fs/unicode/
  
-UNICORE32 ARCHITECTURE:
+UNICORE32 ARCHITECTURE
  M:     Guan Xuetao <gxt@pku.edu.cn>
  W:     http://mprc.pku.edu.cn/~guanxuetao/linux
  S:     Maintained
@@ -17392,11 +17391,14 @@ F:    drivers/usb/
  F:     include/linux/usb.h
  F:     include/linux/usb/
  
-USB TYPEC PI3USB30532 MUX DRIVER
-M:     Hans de Goede <hdegoede@redhat.com>
+USB TYPEC BUS FOR ALTERNATE MODES
+M:     Heikki Krogerus <heikki.krogerus@linux.intel.com>
  L:     linux-usb@vger.kernel.org
  S:     Maintained
-F:     drivers/usb/typec/mux/pi3usb30532.c
+F:     Documentation/ABI/testing/sysfs-bus-typec
+F:     Documentation/driver-api/usb/typec_bus.rst
+F:     drivers/usb/typec/altmodes/
+F:     include/linux/usb/typec_altmode.h
  
  USB TYPEC CLASS
  M:     Heikki Krogerus <heikki.krogerus@linux.intel.com>
@@ -17407,14 +17409,11 @@ F:    Documentation/driver-api/usb/typec.rst
  F:     drivers/usb/typec/
  F:     include/linux/usb/typec.h
  
-USB TYPEC BUS FOR ALTERNATE MODES
-M:     Heikki Krogerus <heikki.krogerus@linux.intel.com>
+USB TYPEC PI3USB30532 MUX DRIVER
+M:     Hans de Goede <hdegoede@redhat.com>
  L:     linux-usb@vger.kernel.org
  S:     Maintained
-F:     Documentation/ABI/testing/sysfs-bus-typec
-F:     Documentation/driver-api/usb/typec_bus.rst
-F:     drivers/usb/typec/altmodes/
-F:     include/linux/usb/typec_altmode.h
+F:     drivers/usb/typec/mux/pi3usb30532.c
  
  USB TYPEC PORT CONTROLLER DRIVERS
  M:     Guenter Roeck <linux@roeck-us.net>
@@ -17791,7 +17790,7 @@ F:      include/linux/vbox_utils.h
  F:     include/uapi/linux/vbox*.h
  F:     drivers/virt/vboxguest/
  
-VIRTUAL BOX SHARED FOLDER VFS DRIVER:
+VIRTUAL BOX SHARED FOLDER VFS DRIVER
  M:     Hans de Goede <hdegoede@redhat.com>
  L:     linux-fsdevel@vger.kernel.org
  S:     Maintained
@@ -18414,8 +18413,8 @@ M:      Nandor Han <nandor.han@ge.com>
  M:     Semi Malinen <semi.malinen@ge.com>
  L:     linux-gpio@vger.kernel.org
  S:     Maintained
-F:     drivers/gpio/gpio-xra1403.c
  F:     Documentation/devicetree/bindings/gpio/gpio-xra1403.txt
+F:     drivers/gpio/gpio-xra1403.c
  
  XTENSA XTFPGA PLATFORM SUPPORT
  M:     Max Filippov <jcmvbkbc@gmail.com>
diff --git a/Makefile b/Makefile

index 84b71845c43f328ca56e7cfa5d6880220068abc2..86035d866f2c0e73cf326980f0d9295173a73581 100644 (file)
--- a/Makefile
+++ b/Makefile
@@ -2,7 +2,7 @@
  VERSION = 5
  PATCHLEVEL = 6
  SUBLEVEL = 0
-EXTRAVERSION = -rc1
+EXTRAVERSION = -rc4
  NAME = Kleptomaniac Octopus
  
  # *DOCUMENTATION*
@@ -68,6 +68,7 @@ unexport GREP_OPTIONS
  #
  # If KBUILD_VERBOSE equals 0 then the above command will be hidden.
  # If KBUILD_VERBOSE equals 1 then the above command is displayed.
+# If KBUILD_VERBOSE equals 2 then give the reason why each target is rebuilt.
  #
  # To put more focus on warnings, be less verbose as default
  # Use 'make V=1' to see the full commands
@@ -1238,7 +1239,7 @@ ifneq ($(dtstree),)
  %.dtb: include/config/kernel.release scripts_dtc
         $(Q)$(MAKE) $(build)=$(dtstree) $(dtstree)/$@
  
-PHONY += dtbs dtbs_install dt_binding_check
+PHONY += dtbs dtbs_install dtbs_check
  dtbs dtbs_check: include/config/kernel.release scripts_dtc
         $(Q)$(MAKE) $(build)=$(dtstree)
  
@@ -1258,6 +1259,7 @@ PHONY += scripts_dtc
  scripts_dtc: scripts_basic
         $(Q)$(MAKE) $(build)=scripts/dtc
  
+PHONY += dt_binding_check
  dt_binding_check: scripts_dtc
         $(Q)$(MAKE) $(build)=Documentation/devicetree/bindings
  
diff --git a/arch/arm/boot/dts/stih410-b2260.dts b/arch/arm/boot/dts/stih410-b2260.dts

index 4fbd8e9eb5b76d102dce79c44666130f1f7c9431..e2bb5978314620452c5ad2fd80cf1786bf1b5543 100644 (file)
--- a/arch/arm/boot/dts/stih410-b2260.dts
+++ b/arch/arm/boot/dts/stih410-b2260.dts
@@ -178,9 +178,6 @@
                         phy-mode = "rgmii";
                         pinctrl-0 = <&pinctrl_rgmii1 &pinctrl_rgmii1_mdio_1>;
  
-                       snps,phy-bus-name = "stmmac";
-                       snps,phy-bus-id = <0>;
-                       snps,phy-addr = <0>;
                         snps,reset-gpio = <&pio0 7 0>;
                         snps,reset-active-low;
                         snps,reset-delays-us = <0 10000 1000000>;
diff --git a/arch/arm/boot/dts/stihxxx-b2120.dtsi b/arch/arm/boot/dts/stihxxx-b2120.dtsi

index 60e11045ad762af74e47a70335abac94b162b403..d051f080e52ec36a55f62e0d6f642a08ec21b6d9 100644 (file)
--- a/arch/arm/boot/dts/stihxxx-b2120.dtsi
+++ b/arch/arm/boot/dts/stihxxx-b2120.dtsi
@@ -46,7 +46,7 @@
                         /* DAC */
                         format = "i2s";
                         mclk-fs = <256>;
-                       frame-inversion = <1>;
+                       frame-inversion;
                         cpu {
                                 sound-dai = <&sti_uni_player2>;
                         };
diff --git a/arch/arm/configs/am200epdkit_defconfig b/arch/arm/configs/am200epdkit_defconfig

index 622436f44783461ebd145d2879d2b5a45ffeb343..f56ac394caf10b4d6218ea24ae895d9b1795be10 100644 (file)
--- a/arch/arm/configs/am200epdkit_defconfig
+++ b/arch/arm/configs/am200epdkit_defconfig
@@ -11,8 +11,6 @@ CONFIG_SLAB=y
  CONFIG_MODULES=y
  CONFIG_MODULE_UNLOAD=y
  # CONFIG_BLK_DEV_BSG is not set
-# CONFIG_IOSCHED_DEADLINE is not set
-# CONFIG_IOSCHED_CFQ is not set
  CONFIG_ARCH_PXA=y
  CONFIG_ARCH_GUMSTIX=y
  CONFIG_PCCARD=y
diff --git a/arch/arm/configs/axm55xx_defconfig b/arch/arm/configs/axm55xx_defconfig

index f53634af014ba5407fafe49c0b9269f8af341be7..6ea7dafa4c9ea561826d2cd391d05ae823e451ee 100644 (file)
--- a/arch/arm/configs/axm55xx_defconfig
+++ b/arch/arm/configs/axm55xx_defconfig
@@ -25,7 +25,6 @@ CONFIG_EMBEDDED=y
  CONFIG_PROFILING=y
  CONFIG_MODULES=y
  CONFIG_MODULE_UNLOAD=y
-# CONFIG_IOSCHED_DEADLINE is not set
  CONFIG_ARCH_AXXIA=y
  CONFIG_GPIO_PCA953X=y
  CONFIG_ARM_LPAE=y
diff --git a/arch/arm/configs/clps711x_defconfig b/arch/arm/configs/clps711x_defconfig

index c255dab36bdec8d31747751fcf24850d4be2ff46..63a153f5cf683efe1b80e38d904284585a1b67ac 100644 (file)
--- a/arch/arm/configs/clps711x_defconfig
+++ b/arch/arm/configs/clps711x_defconfig
@@ -7,7 +7,6 @@ CONFIG_EMBEDDED=y
  CONFIG_SLOB=y
  CONFIG_JUMP_LABEL=y
  CONFIG_PARTITION_ADVANCED=y
-# CONFIG_IOSCHED_CFQ is not set
  CONFIG_ARCH_CLPS711X=y
  CONFIG_ARCH_AUTCPU12=y
  CONFIG_ARCH_CDB89712=y
diff --git a/arch/arm/configs/cns3420vb_defconfig b/arch/arm/configs/cns3420vb_defconfig

index 89df0a55a0655924a5b099906c6a67d1b3eee1b8..66a80b46038d1e318920ea3fae213dd6c2bdbb13 100644 (file)
--- a/arch/arm/configs/cns3420vb_defconfig
+++ b/arch/arm/configs/cns3420vb_defconfig
@@ -17,7 +17,7 @@ CONFIG_MODULE_UNLOAD=y
  CONFIG_MODULE_FORCE_UNLOAD=y
  CONFIG_MODVERSIONS=y
  # CONFIG_BLK_DEV_BSG is not set
-CONFIG_IOSCHED_CFQ=m
+CONFIG_IOSCHED_BFQ=m
  CONFIG_ARCH_MULTI_V6=y
  #CONFIG_ARCH_MULTI_V7 is not set
  CONFIG_ARCH_CNS3XXX=y
diff --git a/arch/arm/configs/colibri_pxa300_defconfig b/arch/arm/configs/colibri_pxa300_defconfig

index 446134c70a335759ef5399c845f6b4621273f0bf..0dae3b18528400905deb609fa67acedd9f7f6729 100644 (file)
--- a/arch/arm/configs/colibri_pxa300_defconfig
+++ b/arch/arm/configs/colibri_pxa300_defconfig
@@ -43,7 +43,6 @@ CONFIG_USB_ANNOUNCE_NEW_DEVICES=y
  CONFIG_USB_MON=y
  CONFIG_USB_STORAGE=y
  CONFIG_MMC=y
-# CONFIG_MMC_BLOCK_BOUNCE is not set
  CONFIG_MMC_PXA=y
  CONFIG_EXT3_FS=y
  CONFIG_NFS_FS=y
diff --git a/arch/arm/configs/collie_defconfig b/arch/arm/configs/collie_defconfig

index e6df11e906bade8bf8335d035bc74d66231c5a35..36384fd575f8ed8b7a71cd49e0d4e0ddbb1ef03a 100644 (file)
--- a/arch/arm/configs/collie_defconfig
+++ b/arch/arm/configs/collie_defconfig
@@ -7,8 +7,6 @@ CONFIG_EXPERT=y
  # CONFIG_BASE_FULL is not set
  # CONFIG_EPOLL is not set
  CONFIG_SLOB=y
-# CONFIG_IOSCHED_DEADLINE is not set
-# CONFIG_IOSCHED_CFQ is not set
  CONFIG_ARCH_SA1100=y
  CONFIG_SA1100_COLLIE=y
  CONFIG_PCCARD=y
diff --git a/arch/arm/configs/davinci_all_defconfig b/arch/arm/configs/davinci_all_defconfig

index 231f8973bbb2d8ca242eef675b2c4fbed6ba554b..b5ba8d731a25e82964267cb43d159ec314a15f59 100644 (file)
--- a/arch/arm/configs/davinci_all_defconfig
+++ b/arch/arm/configs/davinci_all_defconfig
@@ -15,8 +15,6 @@ CONFIG_MODULE_UNLOAD=y
  CONFIG_MODULE_FORCE_UNLOAD=y
  CONFIG_MODVERSIONS=y
  CONFIG_PARTITION_ADVANCED=y
-# CONFIG_IOSCHED_DEADLINE is not set
-# CONFIG_IOSCHED_CFQ is not set
  CONFIG_ARCH_MULTIPLATFORM=y
  CONFIG_ARCH_MULTI_V7=n
  CONFIG_ARCH_MULTI_V5=y
diff --git a/arch/arm/configs/efm32_defconfig b/arch/arm/configs/efm32_defconfig

index 10ea92513a69d07843ea38a598c0bb6409fcd97e..46213f0530c4af6138b5c8905c5af6d513766dfc 100644 (file)
--- a/arch/arm/configs/efm32_defconfig
+++ b/arch/arm/configs/efm32_defconfig
@@ -12,8 +12,6 @@ CONFIG_EMBEDDED=y
  # CONFIG_VM_EVENT_COUNTERS is not set
  # CONFIG_SLUB_DEBUG is not set
  # CONFIG_BLK_DEV_BSG is not set
-# CONFIG_IOSCHED_DEADLINE is not set
-# CONFIG_IOSCHED_CFQ is not set
  # CONFIG_MMU is not set
  CONFIG_ARM_SINGLE_ARMV7M=y
  CONFIG_ARCH_EFM32=y
diff --git a/arch/arm/configs/ep93xx_defconfig b/arch/arm/configs/ep93xx_defconfig

index ef2d2a820c30b20a76285d9301666542656fb71d..cd16fb6eb8e63e7c1c85eba78919db25f4dbf702 100644 (file)
--- a/arch/arm/configs/ep93xx_defconfig
+++ b/arch/arm/configs/ep93xx_defconfig
@@ -11,7 +11,6 @@ CONFIG_MODULE_UNLOAD=y
  CONFIG_MODULE_FORCE_UNLOAD=y
  # CONFIG_BLK_DEV_BSG is not set
  CONFIG_PARTITION_ADVANCED=y
-# CONFIG_IOSCHED_CFQ is not set
  CONFIG_ARCH_EP93XX=y
  CONFIG_CRUNCH=y
  CONFIG_MACH_ADSSPHERE=y
diff --git a/arch/arm/configs/eseries_pxa_defconfig b/arch/arm/configs/eseries_pxa_defconfig

index 56452fa03d56782329cd4d7fb257b643b9cf3279..046f4dc2e18e34de0de2dcdeb2b53fd2b68e8474 100644 (file)
--- a/arch/arm/configs/eseries_pxa_defconfig
+++ b/arch/arm/configs/eseries_pxa_defconfig
@@ -9,8 +9,6 @@ CONFIG_MODULES=y
  CONFIG_MODULE_UNLOAD=y
  CONFIG_MODULE_FORCE_UNLOAD=y
  # CONFIG_BLK_DEV_BSG is not set
-# CONFIG_IOSCHED_DEADLINE is not set
-# CONFIG_IOSCHED_CFQ is not set
  CONFIG_ARCH_PXA=y
  CONFIG_ARCH_PXA_ESERIES=y
  # CONFIG_ARM_THUMB is not set
diff --git a/arch/arm/configs/ezx_defconfig b/arch/arm/configs/ezx_defconfig

index 4e28771beecdb7d53aaf74953b03202ed955b503..bd7b7f945e0188734ca54e156fcf848bf1295933 100644 (file)
--- a/arch/arm/configs/ezx_defconfig
+++ b/arch/arm/configs/ezx_defconfig
@@ -14,7 +14,6 @@ CONFIG_MODULE_UNLOAD=y
  CONFIG_MODULE_FORCE_UNLOAD=y
  CONFIG_MODVERSIONS=y
  # CONFIG_BLK_DEV_BSG is not set
-# CONFIG_IOSCHED_CFQ is not set
  CONFIG_ARCH_PXA=y
  CONFIG_PXA_EZX=y
  CONFIG_NO_HZ=y
diff --git a/arch/arm/configs/h3600_defconfig b/arch/arm/configs/h3600_defconfig

index 4d91e41cb628bf0982f9245bdba2a74dff1b053a..c02b3e4096101a4096da59fbb575e428ac7db656 100644 (file)
--- a/arch/arm/configs/h3600_defconfig
+++ b/arch/arm/configs/h3600_defconfig
@@ -5,8 +5,6 @@ CONFIG_LOG_BUF_SHIFT=14
  CONFIG_BLK_DEV_INITRD=y
  CONFIG_MODULES=y
  # CONFIG_BLK_DEV_BSG is not set
-# CONFIG_IOSCHED_DEADLINE is not set
-# CONFIG_IOSCHED_CFQ is not set
  CONFIG_ARCH_SA1100=y
  CONFIG_SA1100_H3600=y
  CONFIG_PCCARD=y
diff --git a/arch/arm/configs/h5000_defconfig b/arch/arm/configs/h5000_defconfig

index 3946c608732724b01d79d2dd97a41e5fbb8b27df..f5a338fefda8ed34842f9ba3245ded25b2539ea8 100644 (file)
--- a/arch/arm/configs/h5000_defconfig
+++ b/arch/arm/configs/h5000_defconfig
@@ -10,7 +10,6 @@ CONFIG_MODULES=y
  CONFIG_MODULE_UNLOAD=y
  CONFIG_MODULE_FORCE_UNLOAD=y
  # CONFIG_BLK_DEV_BSG is not set
-# CONFIG_IOSCHED_CFQ is not set
  CONFIG_ARCH_PXA=y
  CONFIG_MACH_H5000=y
  CONFIG_AEABI=y
diff --git a/arch/arm/configs/imote2_defconfig b/arch/arm/configs/imote2_defconfig

index 770469f61c3e44bc92de074577ba20e3521c4be8..05c5515fa8710fd428ab8e8384e517472c7b6b47 100644 (file)
--- a/arch/arm/configs/imote2_defconfig
+++ b/arch/arm/configs/imote2_defconfig
@@ -13,7 +13,6 @@ CONFIG_MODULE_UNLOAD=y
  CONFIG_MODULE_FORCE_UNLOAD=y
  CONFIG_MODVERSIONS=y
  # CONFIG_BLK_DEV_BSG is not set
-# CONFIG_IOSCHED_CFQ is not set
  CONFIG_ARCH_PXA=y
  CONFIG_MACH_INTELMOTE2=y
  CONFIG_NO_HZ=y
diff --git a/arch/arm/configs/imx_v4_v5_defconfig b/arch/arm/configs/imx_v4_v5_defconfig

index 2b2d617e279d4232793e2bf0bb235813ba0e6061..3df90fc383983d2b9e902690d2b3285e8a8308f7 100644 (file)
--- a/arch/arm/configs/imx_v4_v5_defconfig
+++ b/arch/arm/configs/imx_v4_v5_defconfig
@@ -32,8 +32,6 @@ CONFIG_KPROBES=y
  CONFIG_MODULES=y
  CONFIG_MODULE_UNLOAD=y
  # CONFIG_BLK_DEV_BSG is not set
-# CONFIG_IOSCHED_DEADLINE is not set
-# CONFIG_IOSCHED_CFQ is not set
  CONFIG_NET=y
  CONFIG_PACKET=y
  CONFIG_UNIX=y
diff --git a/arch/arm/configs/lpc18xx_defconfig b/arch/arm/configs/lpc18xx_defconfig

index e518168a06276363988558da66884a5440d3a613..be882ea0eee4662ecb631648edb2466e06de3dfc 100644 (file)
--- a/arch/arm/configs/lpc18xx_defconfig
+++ b/arch/arm/configs/lpc18xx_defconfig
@@ -1,4 +1,3 @@
-CONFIG_CROSS_COMPILE="arm-linux-gnueabihf-"
  CONFIG_HIGH_RES_TIMERS=y
  CONFIG_PREEMPT=y
  CONFIG_BLK_DEV_INITRD=y
@@ -28,10 +27,7 @@ CONFIG_FLASH_SIZE=0x00080000
  CONFIG_ZBOOT_ROM_TEXT=0x0
  CONFIG_ZBOOT_ROM_BSS=0x0
  CONFIG_ARM_APPENDED_DTB=y
-# CONFIG_LBDAF is not set
  # CONFIG_BLK_DEV_BSG is not set
-# CONFIG_IOSCHED_DEADLINE is not set
-# CONFIG_IOSCHED_CFQ is not set
  CONFIG_BINFMT_FLAT=y
  CONFIG_BINFMT_ZFLAT=y
  CONFIG_BINFMT_SHARED_FLAT=y
diff --git a/arch/arm/configs/magician_defconfig b/arch/arm/configs/magician_defconfig

index e6486c95922065e353889585e69fc6d16595899c..d2e684f6565a161f2b277eaab76de7484a67de78 100644 (file)
--- a/arch/arm/configs/magician_defconfig
+++ b/arch/arm/configs/magician_defconfig
@@ -9,8 +9,6 @@ CONFIG_SLAB=y
  CONFIG_MODULES=y
  CONFIG_MODULE_UNLOAD=y
  # CONFIG_BLK_DEV_BSG is not set
-# CONFIG_IOSCHED_DEADLINE is not set
-# CONFIG_IOSCHED_CFQ is not set
  CONFIG_ARCH_PXA=y
  CONFIG_MACH_H4700=y
  CONFIG_MACH_MAGICIAN=y
diff --git a/arch/arm/configs/moxart_defconfig b/arch/arm/configs/moxart_defconfig

index 45d27190c9c96e56009dd1116bd7bc4cedf6a946..6834e97af34861b4df3ab8c12dd7ace7b0d4efc2 100644 (file)
--- a/arch/arm/configs/moxart_defconfig
+++ b/arch/arm/configs/moxart_defconfig
@@ -15,7 +15,6 @@ CONFIG_EMBEDDED=y
  # CONFIG_SLUB_DEBUG is not set
  # CONFIG_COMPAT_BRK is not set
  # CONFIG_BLK_DEV_BSG is not set
-# CONFIG_IOSCHED_DEADLINE is not set
  CONFIG_ARCH_MULTI_V4=y
  # CONFIG_ARCH_MULTI_V7 is not set
  CONFIG_ARCH_MOXART=y
diff --git a/arch/arm/configs/mxs_defconfig b/arch/arm/configs/mxs_defconfig

index 2773899c21b384ff4e0672c942a61cee90840350..a9c6f32a9b1c9d04ddca4ea9d7690aadcb143a40 100644 (file)
--- a/arch/arm/configs/mxs_defconfig
+++ b/arch/arm/configs/mxs_defconfig
@@ -25,8 +25,6 @@ CONFIG_MODULE_UNLOAD=y
  CONFIG_MODULE_FORCE_UNLOAD=y
  CONFIG_MODVERSIONS=y
  CONFIG_BLK_DEV_INTEGRITY=y
-# CONFIG_IOSCHED_DEADLINE is not set
-# CONFIG_IOSCHED_CFQ is not set
  CONFIG_NET=y
  CONFIG_PACKET=y
  CONFIG_UNIX=y
diff --git a/arch/arm/configs/omap1_defconfig b/arch/arm/configs/omap1_defconfig

index 0c43c589f191c991b0bf017935776a1eba98c18d..3b6e7452609ba9010934f7ea8a6bcab9dbdbfa3a 100644 (file)
--- a/arch/arm/configs/omap1_defconfig
+++ b/arch/arm/configs/omap1_defconfig
@@ -18,8 +18,6 @@ CONFIG_MODULES=y
  CONFIG_MODULE_UNLOAD=y
  CONFIG_MODULE_FORCE_UNLOAD=y
  # CONFIG_BLK_DEV_BSG is not set
-# CONFIG_IOSCHED_DEADLINE is not set
-# CONFIG_IOSCHED_CFQ is not set
  CONFIG_ARCH_OMAP=y
  CONFIG_ARCH_OMAP1=y
  CONFIG_OMAP_RESET_CLOCKS=y
diff --git a/arch/arm/configs/palmz72_defconfig b/arch/arm/configs/palmz72_defconfig

index 4a3fd82c2a0c4a99cec375277c1cb25685efadc3..b47c8abe85bcc657759aaed0f28fb6e9869d825a 100644 (file)
--- a/arch/arm/configs/palmz72_defconfig
+++ b/arch/arm/configs/palmz72_defconfig
@@ -7,8 +7,6 @@ CONFIG_SLAB=y
  CONFIG_MODULES=y
  CONFIG_MODULE_UNLOAD=y
  # CONFIG_BLK_DEV_BSG is not set
-# CONFIG_IOSCHED_DEADLINE is not set
-# CONFIG_IOSCHED_CFQ is not set
  CONFIG_ARCH_PXA=y
  CONFIG_ARCH_PXA_PALM=y
  # CONFIG_MACH_PALMTX is not set
diff --git a/arch/arm/configs/pcm027_defconfig b/arch/arm/configs/pcm027_defconfig

index a8c53228b0c180cc90f2f6493b0331986e19ddb3..e97a158081fc755abe281c75841581ad66002245 100644 (file)
--- a/arch/arm/configs/pcm027_defconfig
+++ b/arch/arm/configs/pcm027_defconfig
@@ -13,8 +13,6 @@ CONFIG_MODULES=y
  CONFIG_MODULE_UNLOAD=y
  CONFIG_MODULE_FORCE_UNLOAD=y
  # CONFIG_BLK_DEV_BSG is not set
-# CONFIG_IOSCHED_DEADLINE is not set
-# CONFIG_IOSCHED_CFQ is not set
  CONFIG_ARCH_PXA=y
  CONFIG_MACH_PCM027=y
  CONFIG_MACH_PCM990_BASEBOARD=y
diff --git a/arch/arm/configs/pleb_defconfig b/arch/arm/configs/pleb_defconfig

index f0541b060cfaf64f4efb9ee7a16b78a9d9040e9c..2170148b975cedd7462623183241aac3620d62b3 100644 (file)
--- a/arch/arm/configs/pleb_defconfig
+++ b/arch/arm/configs/pleb_defconfig
@@ -6,8 +6,6 @@ CONFIG_EXPERT=y
  # CONFIG_HOTPLUG is not set
  # CONFIG_SHMEM is not set
  CONFIG_MODULES=y
-# CONFIG_IOSCHED_DEADLINE is not set
-# CONFIG_IOSCHED_CFQ is not set
  CONFIG_ARCH_SA1100=y
  CONFIG_SA1100_PLEB=y
  CONFIG_ZBOOT_ROM_TEXT=0x0
diff --git a/arch/arm/configs/realview_defconfig b/arch/arm/configs/realview_defconfig

index 8a056cc0c1ec21285d891e2ac68311ff4bfcbc01..70e2c74a9f32d79d92e47a8d355e9cced1b4d700 100644 (file)
--- a/arch/arm/configs/realview_defconfig
+++ b/arch/arm/configs/realview_defconfig
@@ -8,7 +8,6 @@ CONFIG_SLAB=y
  CONFIG_MODULES=y
  CONFIG_MODULE_UNLOAD=y
  # CONFIG_BLK_DEV_BSG is not set
-# CONFIG_IOSCHED_CFQ is not set
  CONFIG_ARCH_MULTI_V6=y
  CONFIG_ARCH_REALVIEW=y
  CONFIG_MACH_REALVIEW_EB=y
diff --git a/arch/arm/configs/sama5_defconfig b/arch/arm/configs/sama5_defconfig

index 27f6135c4ee73dc23e3b213582e426e833d5aafe..bab7861443dcf5330d87386261037beccc418671 100644 (file)
--- a/arch/arm/configs/sama5_defconfig
+++ b/arch/arm/configs/sama5_defconfig
@@ -14,8 +14,6 @@ CONFIG_MODULE_FORCE_LOAD=y
  CONFIG_MODULE_UNLOAD=y
  CONFIG_MODULE_FORCE_UNLOAD=y
  # CONFIG_BLK_DEV_BSG is not set
-# CONFIG_IOSCHED_DEADLINE is not set
-# CONFIG_IOSCHED_CFQ is not set
  CONFIG_ARCH_AT91=y
  CONFIG_SOC_SAMA5D2=y
  CONFIG_SOC_SAMA5D3=y
@@ -182,7 +180,6 @@ CONFIG_USB_GADGET=y
  CONFIG_USB_ATMEL_USBA=y
  CONFIG_USB_G_SERIAL=y
  CONFIG_MMC=y
-# CONFIG_MMC_BLOCK_BOUNCE is not set
  CONFIG_MMC_SDHCI=y
  CONFIG_MMC_SDHCI_PLTFM=y
  CONFIG_MMC_SDHCI_OF_AT91=y
diff --git a/arch/arm/configs/stm32_defconfig b/arch/arm/configs/stm32_defconfig

index 152321d2893eb97d38cc40508e2e266e22c4ef9c..551db328009dd697194718dd349d6223febb0122 100644 (file)
--- a/arch/arm/configs/stm32_defconfig
+++ b/arch/arm/configs/stm32_defconfig
@@ -14,8 +14,6 @@ CONFIG_EMBEDDED=y
  # CONFIG_VM_EVENT_COUNTERS is not set
  # CONFIG_SLUB_DEBUG is not set
  # CONFIG_BLK_DEV_BSG is not set
-# CONFIG_IOSCHED_DEADLINE is not set
-# CONFIG_IOSCHED_CFQ is not set
  # CONFIG_MMU is not set
  CONFIG_ARCH_STM32=y
  CONFIG_CPU_V7M_NUM_IRQ=240
diff --git a/arch/arm/configs/sunxi_defconfig b/arch/arm/configs/sunxi_defconfig

index 3f5d727efc41138a4a4c511c6d2f35120f5b8f51..e9fb57374b9f3ef521a45caf4a8e93d994b6f701 100644 (file)
--- a/arch/arm/configs/sunxi_defconfig
+++ b/arch/arm/configs/sunxi_defconfig
@@ -85,6 +85,7 @@ CONFIG_BATTERY_AXP20X=y
  CONFIG_AXP20X_POWER=y
  CONFIG_THERMAL=y
  CONFIG_CPU_THERMAL=y
+CONFIG_SUN8I_THERMAL=y
  CONFIG_WATCHDOG=y
  CONFIG_SUNXI_WATCHDOG=y
  CONFIG_MFD_AC100=y
diff --git a/arch/arm/configs/u300_defconfig b/arch/arm/configs/u300_defconfig

index 8223397db047eb7939b03be3dceff1b7558d847b..543f07338100e0c064eb4c668324c3190072cf0c 100644 (file)
--- a/arch/arm/configs/u300_defconfig
+++ b/arch/arm/configs/u300_defconfig
@@ -11,7 +11,6 @@ CONFIG_MODULES=y
  CONFIG_MODULE_UNLOAD=y
  # CONFIG_BLK_DEV_BSG is not set
  CONFIG_PARTITION_ADVANCED=y
-# CONFIG_IOSCHED_CFQ is not set
  # CONFIG_ARCH_MULTI_V7 is not set
  CONFIG_ARCH_U300=y
  CONFIG_MACH_U300_SPIDUMMY=y
@@ -46,7 +45,6 @@ CONFIG_FB=y
  CONFIG_BACKLIGHT_CLASS_DEVICE=y
  # CONFIG_USB_SUPPORT is not set
  CONFIG_MMC=y
-# CONFIG_MMC_BLOCK_BOUNCE is not set
  CONFIG_MMC_ARMMMCI=y
  CONFIG_RTC_CLASS=y
  # CONFIG_RTC_HCTOSYS is not set
diff --git a/arch/arm/configs/vexpress_defconfig b/arch/arm/configs/vexpress_defconfig

index 25753552277ad18f973d7e0877dc2d741d014f6d..c01baf7d6e37c228bc8441747d1c8018de1213fc 100644 (file)
--- a/arch/arm/configs/vexpress_defconfig
+++ b/arch/arm/configs/vexpress_defconfig
@@ -15,8 +15,6 @@ CONFIG_OPROFILE=y
  CONFIG_MODULES=y
  CONFIG_MODULE_UNLOAD=y
  # CONFIG_BLK_DEV_BSG is not set
-# CONFIG_IOSCHED_DEADLINE is not set
-# CONFIG_IOSCHED_CFQ is not set
  CONFIG_ARCH_VEXPRESS=y
  CONFIG_ARCH_VEXPRESS_DCSCB=y
  CONFIG_ARCH_VEXPRESS_TC2_PM=y
diff --git a/arch/arm/configs/viper_defconfig b/arch/arm/configs/viper_defconfig

index 2ff16168d9c2892df5b5ba31f1cd8cbcdaa2ec99..989599ce53008213f9be251bc2f4f6f605abdbcb 100644 (file)
--- a/arch/arm/configs/viper_defconfig
+++ b/arch/arm/configs/viper_defconfig
@@ -9,7 +9,6 @@ CONFIG_SLAB=y
  CONFIG_MODULES=y
  CONFIG_MODULE_UNLOAD=y
  # CONFIG_BLK_DEV_BSG is not set
-# CONFIG_IOSCHED_CFQ is not set
  CONFIG_ARCH_PXA=y
  CONFIG_ARCH_VIPER=y
  CONFIG_IWMMXT=y
diff --git a/arch/arm/configs/zeus_defconfig b/arch/arm/configs/zeus_defconfig

index aa3023c9a01196b43ec816094ee8302d7d197dd8..d3b98c4d225bec5470a72fca0ab0e9f1982da139 100644 (file)
--- a/arch/arm/configs/zeus_defconfig
+++ b/arch/arm/configs/zeus_defconfig
@@ -4,7 +4,6 @@ CONFIG_LOG_BUF_SHIFT=13
  CONFIG_MODULES=y
  CONFIG_MODULE_UNLOAD=y
  # CONFIG_BLK_DEV_BSG is not set
-# CONFIG_IOSCHED_CFQ is not set
  CONFIG_ARCH_PXA=y
  CONFIG_MACH_ARCOM_ZEUS=y
  CONFIG_PCCARD=m
@@ -137,7 +136,6 @@ CONFIG_USB_MASS_STORAGE=m
  CONFIG_USB_G_SERIAL=m
  CONFIG_USB_G_PRINTER=m
  CONFIG_MMC=y
-# CONFIG_MMC_BLOCK_BOUNCE is not set
  CONFIG_MMC_PXA=y
  CONFIG_NEW_LEDS=y
  CONFIG_LEDS_CLASS=m
diff --git a/arch/arm/configs/zx_defconfig b/arch/arm/configs/zx_defconfig

index 4d2ef785ed344f9c4d77cbbf9130128da307c423..a046a492bfa73442cb17dee01746cce7f45ea3eb 100644 (file)
--- a/arch/arm/configs/zx_defconfig
+++ b/arch/arm/configs/zx_defconfig
@@ -16,7 +16,6 @@ CONFIG_EMBEDDED=y
  CONFIG_PERF_EVENTS=y
  CONFIG_SLAB=y
  # CONFIG_BLK_DEV_BSG is not set
-# CONFIG_IOSCHED_CFQ is not set
  CONFIG_ARCH_ZX=y
  CONFIG_SOC_ZX296702=y
  # CONFIG_SWP_EMULATE is not set
diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h

index c3314b286a61ea6feac86430fd576e834156dbe8..a827b4d60d389f3936a8185c4769f01fe9397226 100644 (file)
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -392,9 +392,6 @@ static inline void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu) {}
  static inline void kvm_vcpu_pmu_restore_guest(struct kvm_vcpu *vcpu) {}
  static inline void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu) {}
  
-static inline void kvm_arm_vhe_guest_enter(void) {}
-static inline void kvm_arm_vhe_guest_exit(void) {}
-
  #define KVM_BP_HARDEN_UNKNOWN          -1
  #define KVM_BP_HARDEN_WA_NEEDED                0
  #define KVM_BP_HARDEN_NOT_REQUIRED     1
diff --git a/arch/arm/mach-npcm/Kconfig b/arch/arm/mach-npcm/Kconfig

index 880bc2a5cadaa95a9f4551979c3f6d6b8ffab435..7f7002dc2b21fec89afd25529c87d7ca16630cf2 100644 (file)
--- a/arch/arm/mach-npcm/Kconfig
+++ b/arch/arm/mach-npcm/Kconfig
@@ -11,7 +11,7 @@ config ARCH_NPCM7XX
         depends on ARCH_MULTI_V7
         select PINCTRL_NPCM7XX
         select NPCM7XX_TIMER
-       select ARCH_REQUIRE_GPIOLIB
+       select GPIOLIB
         select CACHE_L2X0
         select ARM_GIC
         select HAVE_ARM_TWD if SMP
diff --git a/arch/arm64/boot/dts/arm/fvp-base-revc.dts b/arch/arm64/boot/dts/arm/fvp-base-revc.dts

index 62ab0d54ff71ed4f513837d24fdfae1a743f773f..335fff76245167467fbdf3932ec42e0a84c0c586 100644 (file)
--- a/arch/arm64/boot/dts/arm/fvp-base-revc.dts
+++ b/arch/arm64/boot/dts/arm/fvp-base-revc.dts
@@ -161,10 +161,10 @@
                 bus-range = <0x0 0x1>;
                 reg = <0x0 0x40000000 0x0 0x10000000>;
                 ranges = <0x2000000 0x0 0x50000000 0x0 0x50000000 0x0 0x10000000>;
-               interrupt-map = <0 0 0 1 &gic GIC_SPI 168 IRQ_TYPE_LEVEL_HIGH>,
-                               <0 0 0 2 &gic GIC_SPI 169 IRQ_TYPE_LEVEL_HIGH>,
-                               <0 0 0 3 &gic GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
-                               <0 0 0 4 &gic GIC_SPI 171 IRQ_TYPE_LEVEL_HIGH>;
+               interrupt-map = <0 0 0 1 &gic 0 0 GIC_SPI 168 IRQ_TYPE_LEVEL_HIGH>,
+                               <0 0 0 2 &gic 0 0 GIC_SPI 169 IRQ_TYPE_LEVEL_HIGH>,
+                               <0 0 0 3 &gic 0 0 GIC_SPI 170 IRQ_TYPE_LEVEL_HIGH>,
+                               <0 0 0 4 &gic 0 0 GIC_SPI 171 IRQ_TYPE_LEVEL_HIGH>;
                 interrupt-map-mask = <0x0 0x0 0x0 0x7>;
                 msi-map = <0x0 &its 0x0 0x10000>;
                 iommu-map = <0x0 &smmu 0x0 0x10000>;
diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig

index 0f212889c931a83db236f3ad8c7cdcac0235e7c6..905109f6814ff9e026c994d87dc65d6bf4370dac 100644 (file)
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -452,6 +452,7 @@ CONFIG_THERMAL_GOV_POWER_ALLOCATOR=y
  CONFIG_CPU_THERMAL=y
  CONFIG_THERMAL_EMULATION=y
  CONFIG_QORIQ_THERMAL=m
+CONFIG_SUN8I_THERMAL=y
  CONFIG_ROCKCHIP_THERMAL=m
  CONFIG_RCAR_THERMAL=y
  CONFIG_RCAR_GEN3_THERMAL=y
@@ -547,6 +548,7 @@ CONFIG_ROCKCHIP_DW_MIPI_DSI=y
  CONFIG_ROCKCHIP_INNO_HDMI=y
  CONFIG_DRM_RCAR_DU=m
  CONFIG_DRM_SUN4I=m
+CONFIG_DRM_SUN6I_DSI=m
  CONFIG_DRM_SUN8I_DW_HDMI=m
  CONFIG_DRM_SUN8I_MIXER=m
  CONFIG_DRM_MSM=m
@@ -681,7 +683,7 @@ CONFIG_RTC_DRV_SNVS=m
  CONFIG_RTC_DRV_IMX_SC=m
  CONFIG_RTC_DRV_XGENE=y
  CONFIG_DMADEVICES=y
-CONFIG_DMA_BCM2835=m
+CONFIG_DMA_BCM2835=y
  CONFIG_DMA_SUN6I=m
  CONFIG_FSL_EDMA=y
  CONFIG_IMX_SDMA=y
diff --git a/arch/arm64/include/asm/arch_gicv3.h b/arch/arm64/include/asm/arch_gicv3.h

index 25fec4bde43af6f66bbb0e170a888d154eb82535..a358e97572c14c58d55d732926cf3f211a71a410 100644 (file)
--- a/arch/arm64/include/asm/arch_gicv3.h
+++ b/arch/arm64/include/asm/arch_gicv3.h
@@ -32,7 +32,7 @@ static inline void gic_write_eoir(u32 irq)
         isb();
  }
  
-static inline void gic_write_dir(u32 irq)
+static __always_inline void gic_write_dir(u32 irq)
  {
         write_sysreg_s(irq, SYS_ICC_DIR_EL1);
         isb();
diff --git a/arch/arm64/include/asm/cache.h b/arch/arm64/include/asm/cache.h

index 806e9dc2a852a43d54f9eeb89122684553971b16..a4d1b5f771f6baebd64c3c291be937e84eddd397 100644 (file)
--- a/arch/arm64/include/asm/cache.h
+++ b/arch/arm64/include/asm/cache.h
@@ -69,7 +69,7 @@ static inline int icache_is_aliasing(void)
         return test_bit(ICACHEF_ALIASING, &__icache_flags);
  }
  
-static inline int icache_is_vpipt(void)
+static __always_inline int icache_is_vpipt(void)
  {
         return test_bit(ICACHEF_VPIPT, &__icache_flags);
  }
diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h

index 665c78e0665a65ea89f40442bef31b55bebf9147..e6cca3d4acf702f060bb35996ead0762566de7b6 100644 (file)
--- a/arch/arm64/include/asm/cacheflush.h
+++ b/arch/arm64/include/asm/cacheflush.h
@@ -145,7 +145,7 @@ extern void copy_to_user_page(struct vm_area_struct *, struct page *,
  #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
  extern void flush_dcache_page(struct page *);
  
-static inline void __flush_icache_all(void)
+static __always_inline void __flush_icache_all(void)
  {
         if (cpus_have_const_cap(ARM64_HAS_CACHE_DIC))
                 return;
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h

index 92ef9539874a663b3bd693c4e834a5cf134a21d6..2a746b99e937f489cac93452fe4a11dfbc1c17d8 100644 (file)
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -435,13 +435,13 @@ cpuid_feature_extract_signed_field(u64 features, int field)
         return cpuid_feature_extract_signed_field_width(features, field, 4);
  }
  
-static inline unsigned int __attribute_const__
+static __always_inline unsigned int __attribute_const__
  cpuid_feature_extract_unsigned_field_width(u64 features, int field, int width)
  {
         return (u64)(features << (64 - width - field)) >> (64 - width);
  }
  
-static inline unsigned int __attribute_const__
+static __always_inline unsigned int __attribute_const__
  cpuid_feature_extract_unsigned_field(u64 features, int field)
  {
         return cpuid_feature_extract_unsigned_field_width(features, field, 4);
@@ -564,7 +564,7 @@ static inline bool system_supports_mixed_endian(void)
         return val == 0x1;
  }
  
-static inline bool system_supports_fpsimd(void)
+static __always_inline bool system_supports_fpsimd(void)
  {
         return !cpus_have_const_cap(ARM64_HAS_NO_FPSIMD);
  }
@@ -575,13 +575,13 @@ static inline bool system_uses_ttbr0_pan(void)
                 !cpus_have_const_cap(ARM64_HAS_PAN);
  }
  
-static inline bool system_supports_sve(void)
+static __always_inline bool system_supports_sve(void)
  {
         return IS_ENABLED(CONFIG_ARM64_SVE) &&
                 cpus_have_const_cap(ARM64_SVE);
  }
  
-static inline bool system_supports_cnp(void)
+static __always_inline bool system_supports_cnp(void)
  {
         return IS_ENABLED(CONFIG_ARM64_CNP) &&
                 cpus_have_const_cap(ARM64_HAS_CNP);
diff --git a/arch/arm64/include/asm/exception.h b/arch/arm64/include/asm/exception.h

index b87c6e276ab194e4d80cf85150c844c94ee2d5b5..7a6e81ca23a8e0ed5a11013fc74f3ca82cefd1c5 100644 (file)
--- a/arch/arm64/include/asm/exception.h
+++ b/arch/arm64/include/asm/exception.h
@@ -33,7 +33,6 @@ static inline u32 disr_to_esr(u64 disr)
  
  asmlinkage void enter_from_user_mode(void);
  void do_mem_abort(unsigned long addr, unsigned int esr, struct pt_regs *regs);
-void do_sp_pc_abort(unsigned long addr, unsigned int esr, struct pt_regs *regs);
  void do_undefinstr(struct pt_regs *regs);
  asmlinkage void bad_mode(struct pt_regs *regs, int reason, unsigned int esr);
  void do_debug_exception(unsigned long addr_if_watchpoint, unsigned int esr,
@@ -47,7 +46,4 @@ void bad_el0_sync(struct pt_regs *regs, int reason, unsigned int esr);
  void do_cp15instr(unsigned int esr, struct pt_regs *regs);
  void do_el0_svc(struct pt_regs *regs);
  void do_el0_svc_compat(struct pt_regs *regs);
-void do_el0_ia_bp_hardening(unsigned long addr,  unsigned int esr,
-                           struct pt_regs *regs);
-
  #endif /* __ASM_EXCEPTION_H */
diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h

index 4e531f57147d122aec8c87a800a08e33400b3c51..6facd1308e7c28c54ef35e762652f7cdb364322e 100644 (file)
--- a/arch/arm64/include/asm/io.h
+++ b/arch/arm64/include/asm/io.h
@@ -34,7 +34,7 @@ static inline void __raw_writew(u16 val, volatile void __iomem *addr)
  }
  
  #define __raw_writel __raw_writel
-static inline void __raw_writel(u32 val, volatile void __iomem *addr)
+static __always_inline void __raw_writel(u32 val, volatile void __iomem *addr)
  {
         asm volatile("str %w0, [%1]" : : "rZ" (val), "r" (addr));
  }
@@ -69,7 +69,7 @@ static inline u16 __raw_readw(const volatile void __iomem *addr)
  }
  
  #define __raw_readl __raw_readl
-static inline u32 __raw_readl(const volatile void __iomem *addr)
+static __always_inline u32 __raw_readl(const volatile void __iomem *addr)
  {
         u32 val;
         asm volatile(ALTERNATIVE("ldr %w0, [%1]",
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h

index 688c63412cc27922aa2b5faba8d8fc808bef674b..f658dda123645f53462586cf9b4798e8349177cb 100644 (file)
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -36,7 +36,7 @@ void kvm_inject_undef32(struct kvm_vcpu *vcpu);
  void kvm_inject_dabt32(struct kvm_vcpu *vcpu, unsigned long addr);
  void kvm_inject_pabt32(struct kvm_vcpu *vcpu, unsigned long addr);
  
-static inline bool vcpu_el1_is_32bit(struct kvm_vcpu *vcpu)
+static __always_inline bool vcpu_el1_is_32bit(struct kvm_vcpu *vcpu)
  {
         return !(vcpu->arch.hcr_el2 & HCR_RW);
  }
@@ -127,7 +127,7 @@ static inline void vcpu_set_vsesr(struct kvm_vcpu *vcpu, u64 vsesr)
         vcpu->arch.vsesr_el2 = vsesr;
  }
  
-static inline unsigned long *vcpu_pc(const struct kvm_vcpu *vcpu)
+static __always_inline unsigned long *vcpu_pc(const struct kvm_vcpu *vcpu)
  {
         return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.pc;
  }
@@ -153,17 +153,17 @@ static inline void vcpu_write_elr_el1(const struct kvm_vcpu *vcpu, unsigned long
                 *__vcpu_elr_el1(vcpu) = v;
  }
  
-static inline unsigned long *vcpu_cpsr(const struct kvm_vcpu *vcpu)
+static __always_inline unsigned long *vcpu_cpsr(const struct kvm_vcpu *vcpu)
  {
         return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.pstate;
  }
  
-static inline bool vcpu_mode_is_32bit(const struct kvm_vcpu *vcpu)
+static __always_inline bool vcpu_mode_is_32bit(const struct kvm_vcpu *vcpu)
  {
         return !!(*vcpu_cpsr(vcpu) & PSR_MODE32_BIT);
  }
  
-static inline bool kvm_condition_valid(const struct kvm_vcpu *vcpu)
+static __always_inline bool kvm_condition_valid(const struct kvm_vcpu *vcpu)
  {
         if (vcpu_mode_is_32bit(vcpu))
                 return kvm_condition_valid32(vcpu);
@@ -181,13 +181,13 @@ static inline void vcpu_set_thumb(struct kvm_vcpu *vcpu)
   * coming from a read of ESR_EL2. Otherwise, it may give the wrong result on
   * AArch32 with banked registers.
   */
-static inline unsigned long vcpu_get_reg(const struct kvm_vcpu *vcpu,
+static __always_inline unsigned long vcpu_get_reg(const struct kvm_vcpu *vcpu,
                                          u8 reg_num)
  {
         return (reg_num == 31) ? 0 : vcpu_gp_regs(vcpu)->regs.regs[reg_num];
  }
  
-static inline void vcpu_set_reg(struct kvm_vcpu *vcpu, u8 reg_num,
+static __always_inline void vcpu_set_reg(struct kvm_vcpu *vcpu, u8 reg_num,
                                 unsigned long val)
  {
         if (reg_num != 31)
@@ -264,12 +264,12 @@ static inline bool vcpu_mode_priv(const struct kvm_vcpu *vcpu)
         return mode != PSR_MODE_EL0t;
  }
  
-static inline u32 kvm_vcpu_get_hsr(const struct kvm_vcpu *vcpu)
+static __always_inline u32 kvm_vcpu_get_hsr(const struct kvm_vcpu *vcpu)
  {
         return vcpu->arch.fault.esr_el2;
  }
  
-static inline int kvm_vcpu_get_condition(const struct kvm_vcpu *vcpu)
+static __always_inline int kvm_vcpu_get_condition(const struct kvm_vcpu *vcpu)
  {
         u32 esr = kvm_vcpu_get_hsr(vcpu);
  
@@ -279,12 +279,12 @@ static inline int kvm_vcpu_get_condition(const struct kvm_vcpu *vcpu)
         return -1;
  }
  
-static inline unsigned long kvm_vcpu_get_hfar(const struct kvm_vcpu *vcpu)
+static __always_inline unsigned long kvm_vcpu_get_hfar(const struct kvm_vcpu *vcpu)
  {
         return vcpu->arch.fault.far_el2;
  }
  
-static inline phys_addr_t kvm_vcpu_get_fault_ipa(const struct kvm_vcpu *vcpu)
+static __always_inline phys_addr_t kvm_vcpu_get_fault_ipa(const struct kvm_vcpu *vcpu)
  {
         return ((phys_addr_t)vcpu->arch.fault.hpfar_el2 & HPFAR_MASK) << 8;
  }
@@ -299,7 +299,7 @@ static inline u32 kvm_vcpu_hvc_get_imm(const struct kvm_vcpu *vcpu)
         return kvm_vcpu_get_hsr(vcpu) & ESR_ELx_xVC_IMM_MASK;
  }
  
-static inline bool kvm_vcpu_dabt_isvalid(const struct kvm_vcpu *vcpu)
+static __always_inline bool kvm_vcpu_dabt_isvalid(const struct kvm_vcpu *vcpu)
  {
         return !!(kvm_vcpu_get_hsr(vcpu) & ESR_ELx_ISV);
  }
@@ -319,17 +319,17 @@ static inline bool kvm_vcpu_dabt_issf(const struct kvm_vcpu *vcpu)
         return !!(kvm_vcpu_get_hsr(vcpu) & ESR_ELx_SF);
  }
  
-static inline int kvm_vcpu_dabt_get_rd(const struct kvm_vcpu *vcpu)
+static __always_inline int kvm_vcpu_dabt_get_rd(const struct kvm_vcpu *vcpu)
  {
         return (kvm_vcpu_get_hsr(vcpu) & ESR_ELx_SRT_MASK) >> ESR_ELx_SRT_SHIFT;
  }
  
-static inline bool kvm_vcpu_dabt_iss1tw(const struct kvm_vcpu *vcpu)
+static __always_inline bool kvm_vcpu_dabt_iss1tw(const struct kvm_vcpu *vcpu)
  {
         return !!(kvm_vcpu_get_hsr(vcpu) & ESR_ELx_S1PTW);
  }
  
-static inline bool kvm_vcpu_dabt_iswrite(const struct kvm_vcpu *vcpu)
+static __always_inline bool kvm_vcpu_dabt_iswrite(const struct kvm_vcpu *vcpu)
  {
         return !!(kvm_vcpu_get_hsr(vcpu) & ESR_ELx_WNR) ||
                 kvm_vcpu_dabt_iss1tw(vcpu); /* AF/DBM update */
@@ -340,18 +340,18 @@ static inline bool kvm_vcpu_dabt_is_cm(const struct kvm_vcpu *vcpu)
         return !!(kvm_vcpu_get_hsr(vcpu) & ESR_ELx_CM);
  }
  
-static inline unsigned int kvm_vcpu_dabt_get_as(const struct kvm_vcpu *vcpu)
+static __always_inline unsigned int kvm_vcpu_dabt_get_as(const struct kvm_vcpu *vcpu)
  {
         return 1 << ((kvm_vcpu_get_hsr(vcpu) & ESR_ELx_SAS) >> ESR_ELx_SAS_SHIFT);
  }
  
  /* This one is not specific to Data Abort */
-static inline bool kvm_vcpu_trap_il_is32bit(const struct kvm_vcpu *vcpu)
+static __always_inline bool kvm_vcpu_trap_il_is32bit(const struct kvm_vcpu *vcpu)
  {
         return !!(kvm_vcpu_get_hsr(vcpu) & ESR_ELx_IL);
  }
  
-static inline u8 kvm_vcpu_trap_get_class(const struct kvm_vcpu *vcpu)
+static __always_inline u8 kvm_vcpu_trap_get_class(const struct kvm_vcpu *vcpu)
  {
         return ESR_ELx_EC(kvm_vcpu_get_hsr(vcpu));
  }
@@ -361,17 +361,17 @@ static inline bool kvm_vcpu_trap_is_iabt(const struct kvm_vcpu *vcpu)
         return kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_IABT_LOW;
  }
  
-static inline u8 kvm_vcpu_trap_get_fault(const struct kvm_vcpu *vcpu)
+static __always_inline u8 kvm_vcpu_trap_get_fault(const struct kvm_vcpu *vcpu)
  {
         return kvm_vcpu_get_hsr(vcpu) & ESR_ELx_FSC;
  }
  
-static inline u8 kvm_vcpu_trap_get_fault_type(const struct kvm_vcpu *vcpu)
+static __always_inline u8 kvm_vcpu_trap_get_fault_type(const struct kvm_vcpu *vcpu)
  {
         return kvm_vcpu_get_hsr(vcpu) & ESR_ELx_FSC_TYPE;
  }
  
-static inline bool kvm_vcpu_dabt_isextabt(const struct kvm_vcpu *vcpu)
+static __always_inline bool kvm_vcpu_dabt_isextabt(const struct kvm_vcpu *vcpu)
  {
         switch (kvm_vcpu_trap_get_fault(vcpu)) {
         case FSC_SEA:
@@ -390,7 +390,7 @@ static inline bool kvm_vcpu_dabt_isextabt(const struct kvm_vcpu *vcpu)
         }
  }
  
-static inline int kvm_vcpu_sys_get_rt(struct kvm_vcpu *vcpu)
+static __always_inline int kvm_vcpu_sys_get_rt(struct kvm_vcpu *vcpu)
  {
         u32 esr = kvm_vcpu_get_hsr(vcpu);
         return ESR_ELx_SYS64_ISS_RT(esr);
@@ -504,7 +504,7 @@ static inline unsigned long vcpu_data_host_to_guest(struct kvm_vcpu *vcpu,
         return data;            /* Leave LE untouched */
  }
  
-static inline void kvm_skip_instr(struct kvm_vcpu *vcpu, bool is_wide_instr)
+static __always_inline void kvm_skip_instr(struct kvm_vcpu *vcpu, bool is_wide_instr)
  {
         if (vcpu_mode_is_32bit(vcpu))
                 kvm_skip_instr32(vcpu, is_wide_instr);
@@ -519,7 +519,7 @@ static inline void kvm_skip_instr(struct kvm_vcpu *vcpu, bool is_wide_instr)
   * Skip an instruction which has been emulated at hyp while most guest sysregs
   * are live.
   */
-static inline void __hyp_text __kvm_skip_instr(struct kvm_vcpu *vcpu)
+static __always_inline void __hyp_text __kvm_skip_instr(struct kvm_vcpu *vcpu)
  {
         *vcpu_pc(vcpu) = read_sysreg_el2(SYS_ELR);
         vcpu->arch.ctxt.gp_regs.regs.pstate = read_sysreg_el2(SYS_SPSR);
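Note on the kvm_emulate.h hunks above: they change each accessor from
`static inline` to `static __always_inline`. Plain `inline` is only a hint, so
the compiler may still emit an out-of-line copy and call it. The diff does not
state the motivation; a plausible reading (an assumption here) is that callers
placed in a special section such as `__hyp_text` must not branch to a copy that
lives elsewhere. A minimal userspace sketch of the attribute follows. The
`toy_*` names, the struct layout and the bit value are hypothetical and are not
the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's __always_inline (GCC/Clang attribute). */
#define toy_always_inline inline __attribute__((__always_inline__))

/* Hypothetical miniature of a vCPU register file; not the kernel's layout. */
struct toy_vcpu {
	uint64_t regs[31];
	uint64_t pstate;
};

#define TOY_MODE32_BIT 0x10u	/* assumed bit position, illustration only */

/* Forced inline: the body is expanded into every caller. */
static toy_always_inline bool toy_mode_is_32bit(const struct toy_vcpu *vcpu)
{
	return !!(vcpu->pstate & TOY_MODE32_BIT);
}

/* Same shape as vcpu_get_reg() in the diff: register 31 reads as zero. */
static toy_always_inline uint64_t toy_get_reg(const struct toy_vcpu *vcpu,
					      uint8_t reg_num)
{
	return (reg_num == 31) ? 0 : vcpu->regs[reg_num];
}
```

With GCC or Clang, a function marked this way is expanded at each call site
even at `-O0`, which is the property the converted helpers rely on.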
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index d87aa609d2b6f3e49a5b44ccd523b89753289846..57fd46acd05823ed650a34bf16d7412013f7607c 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -626,38 +626,6 @@ static inline void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr) {}
  static inline void kvm_clr_pmu_events(u32 clr) {}
  #endif
  
-static inline void kvm_arm_vhe_guest_enter(void)
-{
-       local_daif_mask();
-
-       /*
-        * Having IRQs masked via PMR when entering the guest means the GIC
-        * will not signal the CPU of interrupts of lower priority, and the
-        * only way to get out will be via guest exceptions.
-        * Naturally, we want to avoid this.
-        *
-        * local_daif_mask() already sets GIC_PRIO_PSR_I_SET, we just need a
-        * dsb to ensure the redistributor is forwards EL2 IRQs to the CPU.
-        */
-       pmr_sync();
-}
-
-static inline void kvm_arm_vhe_guest_exit(void)
-{
-       /*
-        * local_daif_restore() takes care to properly restore PSTATE.DAIF
-        * and the GIC PMR if the host is using IRQ priorities.
-        */
-       local_daif_restore(DAIF_PROCCTX_NOIRQ);
-
-       /*
-        * When we exit from the guest we change a number of CPU configuration
-        * parameters, such as traps.  Make sure these changes take effect
-        * before running the host or additional guests.
-        */
-       isb();
-}
-
  #define KVM_BP_HARDEN_UNKNOWN          -1
  #define KVM_BP_HARDEN_WA_NEEDED                0
  #define KVM_BP_HARDEN_NOT_REQUIRED     1
diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index a3a6a2ba9a635efd7feec4542f4c0b281052521c..fe57f60f06a8944fff74b2e3f1ed5de6d1790626 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -47,6 +47,13 @@
  #define read_sysreg_el2(r)     read_sysreg_elx(r, _EL2, _EL1)
  #define write_sysreg_el2(v,r)  write_sysreg_elx(v, r, _EL2, _EL1)
  
+/*
+ * Without an __arch_swab32(), we fall back to ___constant_swab32(), but the
+ * static inline can allow the compiler to out-of-line this. KVM always wants
+ * the macro version as its always inlined.
+ */
+#define __kvm_swab32(x)        ___constant_swab32(x)
+
  int __vgic_v2_perform_cpuif_access(struct kvm_vcpu *vcpu);
  
  void __vgic_v3_save_state(struct kvm_vcpu *vcpu);
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 53d846f1bfe70497b81d2457c9a6eefe543fc61f..785762860c63fd5c435f575cffdf22f2c69d8b27 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -93,7 +93,7 @@ void kvm_update_va_mask(struct alt_instr *alt,
                         __le32 *origptr, __le32 *updptr, int nr_inst);
  void kvm_compute_layout(void);
  
-static inline unsigned long __kern_hyp_va(unsigned long v)
+static __always_inline unsigned long __kern_hyp_va(unsigned long v)
  {
         asm volatile(ALTERNATIVE_CB("and %0, %0, #1\n"
                                     "ror %0, %0, #1\n"
@@ -473,6 +473,7 @@ static inline int kvm_write_guest_lock(struct kvm *kvm, gpa_t gpa,
  extern void *__kvm_bp_vect_base;
  extern int __kvm_harden_el2_vector_slot;
  
+/*  This is only called on a VHE system */
  static inline void *kvm_get_hyp_vector(void)
  {
         struct bp_hardening_data *data = arm64_get_bp_hardening_data();
diff --git a/arch/arm64/include/asm/lse.h b/arch/arm64/include/asm/lse.h
index d429f7701c3670101b62540f79c65ec711cba4d1..5d10051c3e62e839808d9ee0a56e0274d934589a 100644
--- a/arch/arm64/include/asm/lse.h
+++ b/arch/arm64/include/asm/lse.h
@@ -6,7 +6,7 @@
  
  #ifdef CONFIG_ARM64_LSE_ATOMICS
  
-#define __LSE_PREAMBLE ".arch armv8-a+lse\n"
+#define __LSE_PREAMBLE ".arch_extension lse\n"
  
  #include <linux/compiler_types.h>
  #include <linux/export.h>
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index a4f9ca5479b063a012ec400c5ee623594d44e693..4d94676e5a8b6b08e0043b09bf7728ec1eb15642 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -213,7 +213,7 @@ static inline unsigned long kaslr_offset(void)
         ((__force __typeof__(addr))sign_extend64((__force u64)(addr), 55))
  
  #define untagged_addr(addr)    ({                                      \
-       u64 __addr = (__force u64)addr;                                 \
+       u64 __addr = (__force u64)(addr);                                       \
         __addr &= __untagged_addr(__addr);                              \
         (__force __typeof__(addr))__addr;                               \
  })
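Note on the untagged_addr() hunk above: the only change is wrapping the macro
argument in parentheses before the cast. Without them, a cast binds to the
first operand of a compound argument rather than to the whole expression. The
sketch below shows the difference in plain userspace C; the `TOY_*` macros and
the `words` array are hypothetical and stand in for the kernel macro.

```c
#include <assert.h>
#include <stdint.h>

/* Unparenthesized argument: the cast applies to the first operand only. */
#define TOY_TO_INT_UNSAFE(addr)	((uintptr_t)addr)
/* Parenthesized argument, as in the fixed untagged_addr(). */
#define TOY_TO_INT_SAFE(addr)	((uintptr_t)(addr))

static uint32_t words[4];

/* Expands to (uintptr_t)words + 1: integer arithmetic, one byte past base. */
static uintptr_t unsafe_second_word(void)
{
	return TOY_TO_INT_UNSAFE(words + 1);
}

/* Expands to (uintptr_t)(words + 1): pointer arithmetic, one element past. */
static uintptr_t safe_second_word(void)
{
	return TOY_TO_INT_SAFE(words + 1);
}
```

The two helpers differ by `sizeof(uint32_t) - 1` bytes, which is the kind of
silent miscalculation the added parentheses rule out.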
diff --git a/arch/arm64/include/asm/spinlock.h b/arch/arm64/include/asm/spinlock.h
index 102404dc1e135d3e533cbf4549cb8db4a858b379..9083d6992603e6e26aa3ad120bf3fd6f9cc4720a 100644
--- a/arch/arm64/include/asm/spinlock.h
+++ b/arch/arm64/include/asm/spinlock.h
@@ -18,6 +18,10 @@
   * See:
   * https://lore.kernel.org/lkml/20200110100612.GC2827@hirez.programming.kicks-ass.net
   */
-#define vcpu_is_preempted(cpu) false
+#define vcpu_is_preempted vcpu_is_preempted
+static inline bool vcpu_is_preempted(int cpu)
+{
+       return false;
+}
  
  #endif /* __ASM_SPINLOCK_H */
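Note on the spinlock.h hunk above: the function-like macro is replaced by a
real inline function plus a self-referential `#define`. The define keeps the
name visible to the preprocessor, so generic code can still test for an
architecture override with `#ifndef`, while the function form evaluates and
type-checks its argument. The sketch below models that idiom with hypothetical
`toy_*` names; the "generic" fallback is written here only to show which
definition wins and is not copied from the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* "Arch" side: announce the override with a self-referential define. */
#define toy_is_preempted toy_is_preempted
static inline bool toy_is_preempted(int cpu)
{
	(void)cpu;	/* the argument is type-checked and counts as used */
	return false;
}

/* "Generic" side: supply a fallback only if no override was announced. */
#ifndef toy_is_preempted
static inline bool toy_is_preempted(int cpu)
{
	(void)cpu;
	return true;	/* deliberately different, to show which one won */
}
#endif

/*
 * A caller whose only use of `cpu` is the query. With a function-like macro
 * expanding to `false`, the preprocessor would drop `cpu` entirely.
 */
static bool toy_should_keep_spinning(int cpu)
{
	return !toy_is_preempted(cpu);
}
```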
diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
index 0958ed6191aa344dcc84c965cc1f1e410a53540f..61fd26752adcb3bf19f55a7d6e3f03b8581f686e 100644
--- a/arch/arm64/include/asm/virt.h
+++ b/arch/arm64/include/asm/virt.h
@@ -83,7 +83,7 @@ static inline bool is_kernel_in_hyp_mode(void)
         return read_sysreg(CurrentEL) == CurrentEL_EL2;
  }
  
-static inline bool has_vhe(void)
+static __always_inline bool has_vhe(void)
  {
         if (cpus_have_const_cap(ARM64_HAS_VIRT_HOST_EXTN))
                 return true;
diff --git a/arch/arm64/kernel/kaslr.c b/arch/arm64/kernel/kaslr.c
index 53b8a4ee64ff0cb68f959b4562d0ae0b64c14ac5..91a83104c6e8a37848d534872d6642812b40daba 100644
--- a/arch/arm64/kernel/kaslr.c
+++ b/arch/arm64/kernel/kaslr.c
@@ -11,6 +11,7 @@
  #include <linux/sched.h>
  #include <linux/types.h>
  
+#include <asm/archrandom.h>
  #include <asm/cacheflush.h>
  #include <asm/fixmap.h>
  #include <asm/kernel-pgtable.h>
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index bbb0f0c145f6f5e4201ec728064fe992b9ee801c..00626057a384a5c38330f20d0c5d11148c4f3fd2 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -466,6 +466,13 @@ static void ssbs_thread_switch(struct task_struct *next)
         if (unlikely(next->flags & PF_KTHREAD))
                 return;
  
+       /*
+        * If all CPUs implement the SSBS extension, then we just need to
+        * context-switch the PSTATE field.
+        */
+       if (cpu_have_feature(cpu_feature(SSBS)))
+               return;
+
         /* If the mitigation is enabled, then we leave SSBS clear. */
         if ((arm64_get_ssbd_state() == ARM64_SSBD_FORCE_ENABLE) ||
             test_tsk_thread_flag(next, TIF_SSBD))
@@ -608,8 +615,6 @@ long get_tagged_addr_ctrl(void)
   * only prevents the tagged address ABI enabling via prctl() and does not
   * disable it for tasks that already opted in to the relaxed ABI.
   */
-static int zero;
-static int one = 1;
  
  static struct ctl_table tagged_addr_sysctl_table[] = {
         {
@@ -618,8 +623,8 @@ static struct ctl_table tagged_addr_sysctl_table[] = {
                 .data           = &tagged_addr_disabled,
                 .maxlen         = sizeof(int),
                 .proc_handler   = proc_dointvec_minmax,
-               .extra1         = &zero,
-               .extra2         = &one,
+               .extra1         = SYSCTL_ZERO,
+               .extra2         = SYSCTL_ONE,
         },
         { }
  };
diff --git a/arch/arm64/kernel/time.c b/arch/arm64/kernel/time.c
index 73f06d4b3aae5317267ae372e1128d78f5b53fce..eebbc8d7123e08f1571b6b6dbfa9d74adc5c16c7 100644
--- a/arch/arm64/kernel/time.c
+++ b/arch/arm64/kernel/time.c
@@ -23,7 +23,7 @@
  #include <linux/irq.h>
  #include <linux/delay.h>
  #include <linux/clocksource.h>
-#include <linux/clk-provider.h>
+#include <linux/of_clk.h>
  #include <linux/acpi.h>
  
  #include <clocksource/arm_arch_timer.h>
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index dfe8dd1725128405a1661a946782d812695d2d15..925086b46136f3cefd629ce28a9995a21bf38b71 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -625,7 +625,7 @@ static void __hyp_text __pmu_switch_to_host(struct kvm_cpu_context *host_ctxt)
  }
  
  /* Switch to the guest for VHE systems running in EL2 */
-int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
+static int __kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
  {
         struct kvm_cpu_context *host_ctxt;
         struct kvm_cpu_context *guest_ctxt;
@@ -678,7 +678,42 @@ int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
  
         return exit_code;
  }
-NOKPROBE_SYMBOL(kvm_vcpu_run_vhe);
+NOKPROBE_SYMBOL(__kvm_vcpu_run_vhe);
+
+int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
+{
+       int ret;
+
+       local_daif_mask();
+
+       /*
+        * Having IRQs masked via PMR when entering the guest means the GIC
+        * will not signal the CPU of interrupts of lower priority, and the
+        * only way to get out will be via guest exceptions.
+        * Naturally, we want to avoid this.
+        *
+        * local_daif_mask() already sets GIC_PRIO_PSR_I_SET, we just need a
+        * dsb to ensure the redistributor is forwards EL2 IRQs to the CPU.
+        */
+       pmr_sync();
+
+       ret = __kvm_vcpu_run_vhe(vcpu);
+
+       /*
+        * local_daif_restore() takes care to properly restore PSTATE.DAIF
+        * and the GIC PMR if the host is using IRQ priorities.
+        */
+       local_daif_restore(DAIF_PROCCTX_NOIRQ);
+
+       /*
+        * When we exit from the guest we change a number of CPU configuration
+        * parameters, such as traps.  Make sure these changes take effect
+        * before running the host or additional guests.
+        */
+       isb();
+
+       return ret;
+}
  
  /* Switch to the guest for legacy non-VHE systems */
  int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
diff --git a/arch/arm64/kvm/hyp/vgic-v2-cpuif-proxy.c b/arch/arm64/kvm/hyp/vgic-v2-cpuif-proxy.c
index 29ee1feba4eb7de12d55d8c8a99f64e2577e73e5..4f3a087e36d51c6e53d2fedc9370b1edcdbca0c9 100644
--- a/arch/arm64/kvm/hyp/vgic-v2-cpuif-proxy.c
+++ b/arch/arm64/kvm/hyp/vgic-v2-cpuif-proxy.c
@@ -69,14 +69,14 @@ int __hyp_text __vgic_v2_perform_cpuif_access(struct kvm_vcpu *vcpu)
                 u32 data = vcpu_get_reg(vcpu, rd);
                 if (__is_be(vcpu)) {
                         /* guest pre-swabbed data, undo this for writel() */
-                       data = swab32(data);
+                       data = __kvm_swab32(data);
                 }
                 writel_relaxed(data, addr);
         } else {
                 u32 data = readl_relaxed(addr);
                 if (__is_be(vcpu)) {
                         /* guest expects swabbed data */
-                       data = swab32(data);
+                       data = __kvm_swab32(data);
                 }
                 vcpu_set_reg(vcpu, rd, data);
         }
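Note on the vgic-v2-cpuif-proxy.c hunk above: both `swab32()` calls become
`__kvm_swab32()`, which kvm_hyp.h defines as the macro `___constant_swab32()`
so that the byte swap is always expanded in place. The sketch below is a
userspace approximation of such a mask-and-shift macro. `TOY_CONSTANT_SWAB32`
and `toy_fixup_guest_data` are hypothetical names, and the macro is written
from the general technique rather than copied from the kernel headers.

```c
#include <assert.h>
#include <stdint.h>

/* Macro byte swap: expands in place, so no out-of-line call can result. */
#define TOY_CONSTANT_SWAB32(x) ((uint32_t)(			\
	(((uint32_t)(x) & (uint32_t)0x000000ffUL) << 24) |	\
	(((uint32_t)(x) & (uint32_t)0x0000ff00UL) <<  8) |	\
	(((uint32_t)(x) & (uint32_t)0x00ff0000UL) >>  8) |	\
	(((uint32_t)(x) & (uint32_t)0xff000000UL) >> 24)))

/* Hypothetical stand-in for the big-endian guest path in the proxy code. */
static inline uint32_t toy_fixup_guest_data(uint32_t data, int guest_is_be)
{
	if (guest_is_be)
		data = TOY_CONSTANT_SWAB32(data);
	return data;
}
```

One trade-off of the macro form is that the argument is evaluated four times,
so it should only be given expressions without side effects, as the diff does
with a plain local variable.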
diff --git a/arch/csky/Kconfig b/arch/csky/Kconfig
index da09c884cc305f20ba912c3b1ecd3b9ea54f8bbc..047427f71d835a6022023c7d69e7a42864d9caac 100644
--- a/arch/csky/Kconfig
+++ b/arch/csky/Kconfig
@@ -9,7 +9,6 @@ config CSKY
         select ARCH_USE_QUEUED_RWLOCKS if NR_CPUS>2
         select COMMON_CLK
         select CLKSRC_MMIO
-       select CLKSRC_OF
         select CSKY_MPINTC if CPU_CK860
         select CSKY_MP_TIMER if CPU_CK860
         select CSKY_APB_INTC
@@ -37,6 +36,7 @@ config CSKY
         select GX6605S_TIMER if CPU_CK610
         select HAVE_ARCH_TRACEHOOK
         select HAVE_ARCH_AUDITSYSCALL
+       select HAVE_COPY_THREAD_TLS
         select HAVE_DYNAMIC_FTRACE
         select HAVE_FUNCTION_TRACER
         select HAVE_FUNCTION_GRAPH_TRACER
@@ -47,8 +47,8 @@ config CSKY
         select HAVE_PERF_EVENTS
         select HAVE_PERF_REGS
         select HAVE_PERF_USER_STACK_DUMP
-       select HAVE_DMA_API_DEBUG
         select HAVE_DMA_CONTIGUOUS
+       select HAVE_STACKPROTECTOR
         select HAVE_SYSCALL_TRACEPOINTS
         select MAY_HAVE_SPARSE_IRQ
         select MODULES_USE_ELF_RELA if MODULES
@@ -59,6 +59,11 @@ config CSKY
         select TIMER_OF
         select USB_ARCH_HAS_EHCI
         select USB_ARCH_HAS_OHCI
+       select GENERIC_PCI_IOMAP
+       select HAVE_PCI
+       select PCI_DOMAINS_GENERIC if PCI
+       select PCI_SYSCALL if PCI
+       select PCI_MSI if PCI
  
  config CPU_HAS_CACHEV2
         bool
@@ -75,7 +80,7 @@ config CPU_HAS_TLBI
  config CPU_HAS_LDSTEX
         bool
         help
-         For SMP, CPU needs "ldex&stex" instrcutions to atomic operations.
+         For SMP, CPU needs "ldex&stex" instructions for atomic operations.
  
  config CPU_NEED_TLBSYNC
         bool
@@ -188,6 +193,40 @@ config CPU_PM_STOP
         bool "stop"
  endchoice
  
+menuconfig HAVE_TCM
+       bool "Tightly-Coupled/Sram Memory"
+       select GENERIC_ALLOCATOR
+       help
+         The implementation are not only used by TCM (Tightly-Coupled Meory)
+         but also used by sram on SOC bus. It follow existed linux tcm
+         software interface, so that old tcm application codes could be
+         re-used directly.
+
+if HAVE_TCM
+config ITCM_RAM_BASE
+       hex "ITCM ram base"
+       default 0xffffffff
+
+config ITCM_NR_PAGES
+       int "Page count of ITCM size: NR*4KB"
+       range 1 256
+       default 32
+
+config HAVE_DTCM
+       bool "DTCM Support"
+
+config DTCM_RAM_BASE
+       hex "DTCM ram base"
+       depends on HAVE_DTCM
+       default 0xffffffff
+
+config DTCM_NR_PAGES
+       int "Page count of DTCM size: NR*4KB"
+       depends on HAVE_DTCM
+       range 1 256
+       default 32
+endif
+
  config CPU_HAS_VDSP
         bool "CPU has VDSP coprocessor"
         depends on CPU_HAS_FPU && CPU_HAS_FPUV2
@@ -196,6 +235,10 @@ config CPU_HAS_FPU
         bool "CPU has FPU coprocessor"
         depends on CPU_CK807 || CPU_CK810 || CPU_CK860
  
+config CPU_HAS_ICACHE_INS
+       bool "CPU has Icache invalidate instructions"
+       depends on CPU_HAS_CACHEV2
+
  config CPU_HAS_TEE
         bool "CPU has Trusted Execution Environment"
         depends on CPU_CK810
@@ -235,4 +278,6 @@ config HOTPLUG_CPU
           Say N if you want to disable CPU hotplug.
  endmenu
  
+source "arch/csky/Kconfig.platforms"
+
  source "kernel/Kconfig.hz"
diff --git a/arch/csky/Kconfig.platforms b/arch/csky/Kconfig.platforms
new file mode 100644
index 0000000..639e17f
--- /dev/null
+++ b/arch/csky/Kconfig.platforms
@@ -0,0 +1,9 @@
+menu "Platform drivers selection"
+
+config ARCH_CSKY_DW_APB_ICTL
+       bool "Select dw-apb interrupt controller"
+       select DW_APB_ICTL
+       default y
+       help
+         This enables support for snps dw-apb-ictl
+endmenu
diff --git a/arch/csky/abiv1/inc/abi/cacheflush.h b/arch/csky/abiv1/inc/abi/cacheflush.h
index 79ef9e8c1afddc1a520fb56cd2817be38e4cae4c..d3e04208d53c23e36eee552e89e484ef81d8f2c5 100644
--- a/arch/csky/abiv1/inc/abi/cacheflush.h
+++ b/arch/csky/abiv1/inc/abi/cacheflush.h
@@ -48,9 +48,8 @@ extern void flush_cache_range(struct vm_area_struct *vma, unsigned long start, u
  
  #define flush_icache_page(vma, page)           do {} while (0);
  #define flush_icache_range(start, end)         cache_wbinv_range(start, end)
-
-#define flush_icache_user_range(vma,page,addr,len) \
-       flush_dcache_page(page)
+#define flush_icache_mm_range(mm, start, end)  cache_wbinv_range(start, end)
+#define flush_icache_deferred(mm)              do {} while (0);
  
  #define copy_from_user_page(vma, page, vaddr, dst, src, len) \
  do { \
diff --git a/arch/csky/abiv1/inc/abi/entry.h b/arch/csky/abiv1/inc/abi/entry.h
index 7ab78bd0f3b13f7e52be13445314ef956566fa73..f35a9f3315ee6f62b126eb54b46f72a90d22da41 100644
--- a/arch/csky/abiv1/inc/abi/entry.h
+++ b/arch/csky/abiv1/inc/abi/entry.h
@@ -16,14 +16,16 @@
  #define LSAVE_A4       40
  #define LSAVE_A5       44
  
+#define usp ss1
+
  .macro USPTOKSP
-       mtcr    sp, ss1
+       mtcr    sp, usp
         mfcr    sp, ss0
  .endm
  
  .macro KSPTOUSP
         mtcr    sp, ss0
-       mfcr    sp, ss1
+       mfcr    sp, usp
  .endm
  
  .macro SAVE_ALL epc_inc
@@ -45,7 +47,13 @@
         add     lr, r13
         stw     lr, (sp, 8)
  
+       mov     lr, sp
+       addi    lr, 32
+       addi    lr, 32
+       addi    lr, 16
+       bt      2f
         mfcr    lr, ss1
+2:
         stw     lr, (sp, 16)
  
         stw     a0, (sp, 20)
@@ -79,9 +87,10 @@
         ldw     a0, (sp, 12)
         mtcr    a0, epsr
         btsti   a0, 31
+       bt      1f
         ldw     a0, (sp, 16)
         mtcr    a0, ss1
-
+1:
         ldw     a0, (sp, 24)
         ldw     a1, (sp, 28)
         ldw     a2, (sp, 32)
@@ -102,9 +111,9 @@
         addi    sp, 32
         addi    sp, 8
  
-       bt      1f
+       bt      2f
         KSPTOUSP
-1:
+2:
         rte
  .endm
  
diff --git a/arch/csky/abiv2/cacheflush.c b/arch/csky/abiv2/cacheflush.c
index 5bb887b275e1213e4e6e9e77b894a7ef490d06c7..790f1ebfba44ba925ccd94f1927b61e03137dbc6 100644
--- a/arch/csky/abiv2/cacheflush.c
+++ b/arch/csky/abiv2/cacheflush.c
@@ -6,46 +6,80 @@
  #include <linux/mm.h>
  #include <asm/cache.h>
  
-void flush_icache_page(struct vm_area_struct *vma, struct page *page)
+void update_mmu_cache(struct vm_area_struct *vma, unsigned long address,
+                     pte_t *pte)
  {
-       unsigned long start;
+       unsigned long addr;
+       struct page *page;
  
-       start = (unsigned long) kmap_atomic(page);
+       page = pfn_to_page(pte_pfn(*pte));
+       if (page == ZERO_PAGE(0))
+               return;
  
-       cache_wbinv_range(start, start + PAGE_SIZE);
+       if (test_and_set_bit(PG_dcache_clean, &page->flags))
+               return;
  
-       kunmap_atomic((void *)start);
-}
+       addr = (unsigned long) kmap_atomic(page);
  
-void flush_icache_user_range(struct vm_area_struct *vma, struct page *page,
-                            unsigned long vaddr, int len)
-{
-       unsigned long kaddr;
+       dcache_wb_range(addr, addr + PAGE_SIZE);
  
-       kaddr = (unsigned long) kmap_atomic(page) + (vaddr & ~PAGE_MASK);
+       if (vma->vm_flags & VM_EXEC)
+               icache_inv_range(addr, addr + PAGE_SIZE);
+
+       kunmap_atomic((void *) addr);
+}
  
-       cache_wbinv_range(kaddr, kaddr + len);
+void flush_icache_deferred(struct mm_struct *mm)
+{
+       unsigned int cpu = smp_processor_id();
+       cpumask_t *mask = &mm->context.icache_stale_mask;
  
-       kunmap_atomic((void *)kaddr);
+       if (cpumask_test_cpu(cpu, mask)) {
+               cpumask_clear_cpu(cpu, mask);
+               /*
+                * Ensure the remote hart's writes are visible to this hart.
+                * This pairs with a barrier in flush_icache_mm.
+                */
+               smp_mb();
+               local_icache_inv_all(NULL);
+       }
  }
  
-void update_mmu_cache(struct vm_area_struct *vma, unsigned long address,
-                     pte_t *pte)
+void flush_icache_mm_range(struct mm_struct *mm,
+               unsigned long start, unsigned long end)
  {
-       unsigned long addr, pfn;
-       struct page *page;
+       unsigned int cpu;
+       cpumask_t others, *mask;
  
-       pfn = pte_pfn(*pte);
-       if (unlikely(!pfn_valid(pfn)))
-               return;
+       preempt_disable();
  
-       page = pfn_to_page(pfn);
-       if (page == ZERO_PAGE(0))
+#ifdef CONFIG_CPU_HAS_ICACHE_INS
+       if (mm == current->mm) {
+               icache_inv_range(start, end);
+               preempt_enable();
                 return;
+       }
+#endif
  
-       addr = (unsigned long) kmap_atomic(page);
+       /* Mark every hart's icache as needing a flush for this MM. */
+       mask = &mm->context.icache_stale_mask;
+       cpumask_setall(mask);
  
-       cache_wbinv_range(addr, addr + PAGE_SIZE);
+       /* Flush this hart's I$ now, and mark it as flushed. */
+       cpu = smp_processor_id();
+       cpumask_clear_cpu(cpu, mask);
+       local_icache_inv_all(NULL);
  
-       kunmap_atomic((void *) addr);
+       /*
+        * Flush the I$ of other harts concurrently executing, and mark them as
+        * flushed.
+        */
+       cpumask_andnot(&others, mm_cpumask(mm), cpumask_of(cpu));
+
+       if (mm != current->active_mm || !cpumask_empty(&others)) {
+               on_each_cpu_mask(&others, local_icache_inv_all, NULL, 1);
+               cpumask_clear(mask);
+       }
+
+       preempt_enable();
  }
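Note on the abiv2/cacheflush.c hunk above: the new flush_icache_mm_range()
and flush_icache_deferred() split instruction-cache invalidation into an
immediate part (this CPU, plus CPUs currently running the mm) and a deferred
part tracked in `icache_stale_mask`. The sketch below models that bookkeeping
with a plain bitmask so the state transitions can be followed. Everything
named `toy_*` is hypothetical: the CONFIG_CPU_HAS_ICACHE_INS fast path is left
out, IPIs are replaced by a loop, and the point at which the deferred hook
runs (assumed to be when a CPU switches to the mm) is not shown in the diff.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NR_CPUS	4
#define TOY_ALL_CPUS	((1u << TOY_NR_CPUS) - 1)

struct toy_mm {
	uint32_t running_mask;	/* CPUs currently running this mm */
	uint32_t stale_mask;	/* CPUs whose I-cache may hold stale lines */
};

static unsigned int toy_flush_count[TOY_NR_CPUS];

/* Stand-in for local_icache_inv_all(): just count the invalidation. */
static void toy_local_icache_inv_all(unsigned int cpu)
{
	toy_flush_count[cpu]++;
}

/* Model of flush_icache_mm_range(), following the diff as written. */
static void toy_flush_icache_mm(struct toy_mm *mm, unsigned int this_cpu,
				int mm_is_active)
{
	uint32_t others;
	unsigned int cpu;

	/* Mark every CPU stale, then flush and clear the current one. */
	mm->stale_mask = TOY_ALL_CPUS;
	mm->stale_mask &= ~(1u << this_cpu);
	toy_local_icache_inv_all(this_cpu);

	/* Flush the other CPUs running this mm now (an IPI in the kernel). */
	others = mm->running_mask & ~(1u << this_cpu);
	if (!mm_is_active || others) {
		for (cpu = 0; cpu < TOY_NR_CPUS; cpu++)
			if (others & (1u << cpu))
				toy_local_icache_inv_all(cpu);
		mm->stale_mask = 0;
	}
}

/* Model of flush_icache_deferred(): flush only if this CPU is still stale. */
static void toy_flush_icache_deferred(struct toy_mm *mm, unsigned int cpu)
{
	if (mm->stale_mask & (1u << cpu)) {
		mm->stale_mask &= ~(1u << cpu);
		toy_local_icache_inv_all(cpu);
	}
}
```

In this model a CPU that was not running the mm at flush time pays for the
invalidation later, at most once, instead of being interrupted immediately.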
diff --git a/arch/csky/abiv2/inc/abi/cacheflush.h b/arch/csky/abiv2/inc/abi/cacheflush.h
index b8db5e0b2fe3c6448907398cba030c7d9abb9223..a565e00c3f70b2e51420cf61650c59cf68c199dc 100644
--- a/arch/csky/abiv2/inc/abi/cacheflush.h
+++ b/arch/csky/abiv2/inc/abi/cacheflush.h
@@ -13,24 +13,27 @@
  #define flush_cache_all()                      do { } while (0)
  #define flush_cache_mm(mm)                     do { } while (0)
  #define flush_cache_dup_mm(mm)                 do { } while (0)
+#define flush_cache_range(vma, start, end)     do { } while (0)
+#define flush_cache_page(vma, vmaddr, pfn)     do { } while (0)
  
-#define flush_cache_range(vma, start, end) \
-       do { \
-               if (vma->vm_flags & VM_EXEC) \
-                       icache_inv_all(); \
-       } while (0)
+#define PG_dcache_clean                PG_arch_1
+
+#define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
+static inline void flush_dcache_page(struct page *page)
+{
+       if (test_bit(PG_dcache_clean, &page->flags))
+               clear_bit(PG_dcache_clean, &page->flags);
+}
  
-#define flush_cache_page(vma, vmaddr, pfn)     do { } while (0)
-#define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 0
-#define flush_dcache_page(page)                        do { } while (0)
  #define flush_dcache_mmap_lock(mapping)                do { } while (0)
  #define flush_dcache_mmap_unlock(mapping)      do { } while (0)
+#define flush_icache_page(vma, page)           do { } while (0)
  
  #define flush_icache_range(start, end)         cache_wbinv_range(start, end)
  
-void flush_icache_page(struct vm_area_struct *vma, struct page *page);
-void flush_icache_user_range(struct vm_area_struct *vma, struct page *page,
-                            unsigned long vaddr, int len);
+void flush_icache_mm_range(struct mm_struct *mm,
+                       unsigned long start, unsigned long end);
+void flush_icache_deferred(struct mm_struct *mm);
  
  #define flush_cache_vmap(start, end)           do { } while (0)
  #define flush_cache_vunmap(start, end)         do { } while (0)
@@ -38,7 +41,13 @@ void flush_icache_user_range(struct vm_area_struct *vma, struct page *page,
  #define copy_to_user_page(vma, page, vaddr, dst, src, len) \
  do { \
         memcpy(dst, src, len); \
-       cache_wbinv_range((unsigned long)dst, (unsigned long)dst + len); \
+       if (vma->vm_flags & VM_EXEC) { \
+               dcache_wb_range((unsigned long)dst, \
+                               (unsigned long)dst + len); \
+               flush_icache_mm_range(current->mm, \
+                               (unsigned long)dst, \
+                               (unsigned long)dst + len); \
+               } \
  } while (0)
  #define copy_from_user_page(vma, page, vaddr, dst, src, len) \
         memcpy(dst, src, len)
diff --git a/arch/csky/abiv2/inc/abi/entry.h b/arch/csky/abiv2/inc/abi/entry.h
index 9897a16b45e5dcc75b4e6638a8ecd1153be661e9..94a7a58765dffe6a0b28bb1acb36bf6c380fbbb6 100644
--- a/arch/csky/abiv2/inc/abi/entry.h
+++ b/arch/csky/abiv2/inc/abi/entry.h
@@ -31,7 +31,13 @@
  
         mfcr    lr, epsr
         stw     lr, (sp, 12)
+       btsti   lr, 31
+       bf      1f
+       addi    lr, sp, 152
+       br      2f
+1:
         mfcr    lr, usp
+2:
         stw     lr, (sp, 16)
  
         stw     a0, (sp, 20)
@@ -64,8 +70,10 @@
         mtcr    a0, epc
         ldw     a0, (sp, 12)
         mtcr    a0, epsr
+       btsti   a0, 31
         ldw     a0, (sp, 16)
         mtcr    a0, usp
+       mtcr    a0, ss0
  
  #ifdef CONFIG_CPU_HAS_HILO
         ldw     a0, (sp, 140)
@@ -86,6 +94,9 @@
         addi    sp, 40
         ldm     r16-r30, (sp)
         addi    sp, 72
+       bf      1f
+       mfcr    sp, ss0
+1:
         rte
  .endm
  
diff --git a/arch/csky/configs/defconfig b/arch/csky/configs/defconfig
index 7ef42895dfb03b294c77b0f9687054cfe9d94571..af722e4dfb47d8239c969b25834e9b231a9b4244 100644
--- a/arch/csky/configs/defconfig
+++ b/arch/csky/configs/defconfig
@@ -10,9 +10,6 @@ CONFIG_BSD_PROCESS_ACCT=y
  CONFIG_BSD_PROCESS_ACCT_V3=y
  CONFIG_MODULES=y
  CONFIG_MODULE_UNLOAD=y
-CONFIG_DEFAULT_DEADLINE=y
-CONFIG_CPU_CK807=y
-CONFIG_CPU_HAS_FPU=y
  CONFIG_NET=y
  CONFIG_PACKET=y
  CONFIG_UNIX=y
@@ -27,10 +24,7 @@ CONFIG_SERIAL_NONSTANDARD=y
  CONFIG_SERIAL_8250=y
  CONFIG_SERIAL_8250_CONSOLE=y
  CONFIG_SERIAL_OF_PLATFORM=y
-CONFIG_TTY_PRINTK=y
  # CONFIG_VGA_CONSOLE is not set
-CONFIG_CSKY_MPTIMER=y
-CONFIG_GX6605S_TIMER=y
  CONFIG_PM_DEVFREQ=y
  CONFIG_DEVFREQ_GOV_SIMPLE_ONDEMAND=y
  CONFIG_DEVFREQ_GOV_PERFORMANCE=y
@@ -56,6 +50,4 @@ CONFIG_CRAMFS=y
  CONFIG_ROMFS_FS=y
  CONFIG_NFS_FS=y
  CONFIG_PRINTK_TIME=y
-CONFIG_DEBUG_INFO=y
-CONFIG_DEBUG_FS=y
  CONFIG_MAGIC_SYSRQ=y
diff --git a/arch/csky/include/asm/Kbuild b/arch/csky/include/asm/Kbuild
index bc15a26c782f9268fc47d1617594881d8d9bec29..4130e3eaa766711685f776b8b22cb234edce5379 100644
--- a/arch/csky/include/asm/Kbuild
+++ b/arch/csky/include/asm/Kbuild
@@ -28,7 +28,6 @@ generic-y += local64.h
  generic-y += mm-arch-hooks.h
  generic-y += mmiowb.h
  generic-y += module.h
-generic-y += pci.h
  generic-y += percpu.h
  generic-y += preempt.h
  generic-y += qrwlock.h
diff --git a/arch/csky/include/asm/cache.h b/arch/csky/include/asm/cache.h
index 1d5fc2f78fd7e8c634b87e61eb9219fbc4f67ba3..4b5c09bf1d25e3a9b9720bd58f99effcdab09610 100644
--- a/arch/csky/include/asm/cache.h
+++ b/arch/csky/include/asm/cache.h
@@ -16,6 +16,7 @@ void dcache_wb_line(unsigned long start);
  
  void icache_inv_range(unsigned long start, unsigned long end);
  void icache_inv_all(void);
+void local_icache_inv_all(void *priv);
  
  void dcache_wb_range(unsigned long start, unsigned long end);
  void dcache_wbinv_all(void);
diff --git a/arch/csky/include/asm/cacheflush.h b/arch/csky/include/asm/cacheflush.h
index a96da67261ae51fd67f48291489ef709874ac148..f0b8f25429a27f723c191888f624b53c0d778529 100644
--- a/arch/csky/include/asm/cacheflush.h
+++ b/arch/csky/include/asm/cacheflush.h
@@ -4,6 +4,7 @@
  #ifndef __ASM_CSKY_CACHEFLUSH_H
  #define __ASM_CSKY_CACHEFLUSH_H
  
+#include <linux/mm.h>
  #include <abi/cacheflush.h>
  
  #endif /* __ASM_CSKY_CACHEFLUSH_H */
diff --git a/arch/csky/include/asm/fixmap.h b/arch/csky/include/asm/fixmap.h
index 380ff0a307df02095f1585d021984a7167007b2b..81f9477d5330c95e2c964294fb782a5aea3aeaf0 100644
--- a/arch/csky/include/asm/fixmap.h
+++ b/arch/csky/include/asm/fixmap.h
@@ -5,12 +5,16 @@
  #define __ASM_CSKY_FIXMAP_H
  
  #include <asm/page.h>
+#include <asm/memory.h>
  #ifdef CONFIG_HIGHMEM
  #include <linux/threads.h>
  #include <asm/kmap_types.h>
  #endif
  
  enum fixed_addresses {
+#ifdef CONFIG_HAVE_TCM
+       FIX_TCM = TCM_NR_PAGES,
+#endif
  #ifdef CONFIG_HIGHMEM
         FIX_KMAP_BEGIN,
         FIX_KMAP_END = FIX_KMAP_BEGIN + (KM_TYPE_NR * NR_CPUS) - 1,
@@ -18,10 +22,13 @@ enum fixed_addresses {
         __end_of_fixed_addresses
  };
  
-#define FIXADDR_TOP    0xffffc000
  #define FIXADDR_SIZE   (__end_of_fixed_addresses << PAGE_SHIFT)
  #define FIXADDR_START  (FIXADDR_TOP - FIXADDR_SIZE)
  
  #include <asm-generic/fixmap.h>
  
+extern void fixrange_init(unsigned long start, unsigned long end,
+       pgd_t *pgd_base);
+extern void __init fixaddr_init(void);
+
  #endif /* __ASM_CSKY_FIXMAP_H */
diff --git a/arch/csky/include/asm/memory.h b/arch/csky/include/asm/memory.h
new file mode 100644
index 0000000..a65c675
--- /dev/null
+++ b/arch/csky/include/asm/memory.h
@@ -0,0 +1,25 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __ASM_CSKY_MEMORY_H
+#define __ASM_CSKY_MEMORY_H
+
+#include <linux/compiler.h>
+#include <linux/const.h>
+#include <linux/types.h>
+#include <linux/sizes.h>
+
+#define FIXADDR_TOP    _AC(0xffffc000, UL)
+#define PKMAP_BASE     _AC(0xff800000, UL)
+#define VMALLOC_START  _AC(0xc0008000, UL)
+#define VMALLOC_END    (PKMAP_BASE - (PAGE_SIZE * 2))
+
+#ifdef CONFIG_HAVE_TCM
+#ifdef CONFIG_HAVE_DTCM
+#define TCM_NR_PAGES   (CONFIG_ITCM_NR_PAGES + CONFIG_DTCM_NR_PAGES)
+#else
+#define TCM_NR_PAGES   (CONFIG_ITCM_NR_PAGES)
+#endif
+#define FIXADDR_TCM    _AC(FIXADDR_TOP - (TCM_NR_PAGES * PAGE_SIZE), UL)
+#endif
+
+#endif
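The address arithmetic in the new memory.h can be checked with a small user-space sketch. It assumes 4 KiB pages (PAGE_SIZE is defined elsewhere in the kernel and is not shown in this header) and takes the ITCM and DTCM page counts as plain arguments, because CONFIG_ITCM_NR_PAGES and CONFIG_DTCM_NR_PAGES are Kconfig values that do not appear in this diff. All `sketch_` names are illustrative and do not exist in the kernel.

```c
#include <assert.h>

/* Values copied from the new asm/memory.h. */
#define FIXADDR_TOP   0xffffc000UL
#define PKMAP_BASE    0xff800000UL

/* Assumption: 4 KiB pages. */
#define SKETCH_PAGE_SIZE 4096UL

/* VMALLOC_END leaves a two-page gap below PKMAP_BASE. */
static unsigned long sketch_vmalloc_end(void)
{
    return PKMAP_BASE - (SKETCH_PAGE_SIZE * 2);
}

/* TCM_NR_PAGES: ITCM pages plus DTCM pages (pass 0 when there is no DTCM). */
static unsigned long sketch_tcm_nr_pages(unsigned long itcm_pages,
                                         unsigned long dtcm_pages)
{
    assert(itcm_pages > 0);
    return itcm_pages + dtcm_pages;
}

/* FIXADDR_TCM: lowest virtual address of the TCM window below FIXADDR_TOP. */
static unsigned long sketch_fixaddr_tcm(unsigned long itcm_pages,
                                        unsigned long dtcm_pages)
{
    return FIXADDR_TOP -
           sketch_tcm_nr_pages(itcm_pages, dtcm_pages) * SKETCH_PAGE_SIZE;
}
```

Under these assumptions, a configuration with two ITCM pages and two DTCM pages would place the TCM window at 0xffff8000, and VMALLOC_END would evaluate to 0xff7fe000.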
diff --git a/arch/csky/include/asm/mmu.h b/arch/csky/include/asm/mmu.h
index b382a14ea4ec72535393b4a6962b1f1020ab0b7e..26fbb1d15df08ea711f73efe26b39bc356312cc0 100644
--- a/arch/csky/include/asm/mmu.h
+++ b/arch/csky/include/asm/mmu.h
@@ -7,6 +7,7 @@
  typedef struct {
         atomic64_t      asid;
         void *vdso;
+       cpumask_t       icache_stale_mask;
  } mm_context_t;
  
  #endif /* __ASM_CSKY_MMU_H */
diff --git a/arch/csky/include/asm/mmu_context.h b/arch/csky/include/asm/mmu_context.h
index 0285b0ad18b6f2bce9276396461f9a1ffc2ed2f2..abdf1f1cb6ec991248ac17d7dc63881c479616ef 100644
--- a/arch/csky/include/asm/mmu_context.h
+++ b/arch/csky/include/asm/mmu_context.h
@@ -43,5 +43,7 @@ switch_mm(struct mm_struct *prev, struct mm_struct *next,
  
         TLBMISS_HANDLER_SETUP_PGD(next->pgd);
         write_mmu_entryhi(next->context.asid.counter);
+
+       flush_icache_deferred(next);
  }
  #endif /* __ASM_CSKY_MMU_CONTEXT_H */
diff --git a/arch/csky/include/asm/pci.h b/arch/csky/include/asm/pci.h
new file mode 100644
index 0000000..ebc765b
--- /dev/null
+++ b/arch/csky/include/asm/pci.h
@@ -0,0 +1,34 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef __ASM_CSKY_PCI_H
+#define __ASM_CSKY_PCI_H
+
+#include <linux/types.h>
+#include <linux/slab.h>
+#include <linux/dma-mapping.h>
+
+#include <asm/io.h>
+
+#define PCIBIOS_MIN_IO         0
+#define PCIBIOS_MIN_MEM                0
+
+/* C-SKY shim does not initialize PCI bus */
+#define pcibios_assign_all_busses() 1
+
+extern int isa_dma_bridge_buggy;
+
+#ifdef CONFIG_PCI
+static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
+{
+       /* no legacy IRQ on csky */
+       return -ENODEV;
+}
+
+static inline int pci_proc_domain(struct pci_bus *bus)
+{
+       /* always show the domain in /proc */
+       return 1;
+}
+#endif  /* CONFIG_PCI */
+
+#endif  /* __ASM_CSKY_PCI_H */
diff --git a/arch/csky/include/asm/pgtable.h b/arch/csky/include/asm/pgtable.h
index 4b2a41e15f2e4231b0c1905eb3ec38628359fe33..9b7764cb76450f8d205e0996047d2866ae55d0eb 100644
--- a/arch/csky/include/asm/pgtable.h
+++ b/arch/csky/include/asm/pgtable.h
@@ -5,6 +5,7 @@
  #define __ASM_CSKY_PGTABLE_H
  
  #include <asm/fixmap.h>
+#include <asm/memory.h>
  #include <asm/addrspace.h>
  #include <abi/pgtable-bits.h>
  #include <asm-generic/pgtable-nopmd.h>
@@ -16,11 +17,6 @@
  #define USER_PTRS_PER_PGD      (0x80000000UL/PGDIR_SIZE)
  #define FIRST_USER_ADDRESS     0UL
  
-#define PKMAP_BASE             (0xff800000)
-
-#define VMALLOC_START          (0xc0008000)
-#define VMALLOC_END            (PKMAP_BASE - 2*PAGE_SIZE)
-
  /*
   * C-SKY is two-level paging structure:
   */
diff --git a/arch/csky/include/asm/stackprotector.h b/arch/csky/include/asm/stackprotector.h
new file mode 100644
index 0000000..d7cd4e5
--- /dev/null
+++ b/arch/csky/include/asm/stackprotector.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_STACKPROTECTOR_H
+#define _ASM_STACKPROTECTOR_H 1
+
+#include <linux/random.h>
+#include <linux/version.h>
+
+extern unsigned long __stack_chk_guard;
+
+/*
+ * Initialize the stackprotector canary value.
+ *
+ * NOTE: this must only be called from functions that never return,
+ * and it must always be inlined.
+ */
+static __always_inline void boot_init_stack_canary(void)
+{
+       unsigned long canary;
+
+       /* Try to get a semi random initial value. */
+       get_random_bytes(&canary, sizeof(canary));
+       canary ^= LINUX_VERSION_CODE;
+       canary &= CANARY_MASK;
+
+       current->stack_canary = canary;
+       __stack_chk_guard = current->stack_canary;
+}
+
+#endif /* __ASM_SH_STACKPROTECTOR_H */
diff --git a/arch/csky/include/asm/tcm.h b/arch/csky/include/asm/tcm.h
new file mode 100644
index 0000000..2b135ce
--- /dev/null
+++ b/arch/csky/include/asm/tcm.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __ASM_CSKY_TCM_H
+#define __ASM_CSKY_TCM_H
+
+#ifndef CONFIG_HAVE_TCM
+#error "You should not be including tcm.h unless you have a TCM!"
+#endif
+
+#include <linux/compiler.h>
+
+/* Tag variables with this */
+#define __tcmdata __section(.tcm.data)
+/* Tag constants with this */
+#define __tcmconst __section(.tcm.rodata)
+/* Tag functions inside TCM called from outside TCM with this */
+#define __tcmfunc __section(.tcm.text) noinline
+/* Tag function inside TCM called from inside TCM  with this */
+#define __tcmlocalfunc __section(.tcm.text)
+
+void *tcm_alloc(size_t len);
+void tcm_free(void *addr, size_t len);
+
+#endif
diff --git a/arch/csky/include/uapi/asm/unistd.h b/arch/csky/include/uapi/asm/unistd.h
index 211c983c7282d13f3b75150e686c8e815fd7777b..ba40189297338d1de36b9cff54d76f0746b344c6 100644
--- a/arch/csky/include/uapi/asm/unistd.h
+++ b/arch/csky/include/uapi/asm/unistd.h
@@ -1,7 +1,10 @@
  /* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
  // Copyright (C) 2018 Hangzhou C-SKY Microsystems co.,ltd.
  
+#define __ARCH_WANT_STAT64
+#define __ARCH_WANT_NEW_STAT
  #define __ARCH_WANT_SYS_CLONE
+#define __ARCH_WANT_SYS_CLONE3
  #define __ARCH_WANT_SET_GET_RLIMIT
  #define __ARCH_WANT_TIME32_SYSCALLS
  #include <asm-generic/unistd.h>
diff --git a/arch/csky/kernel/atomic.S b/arch/csky/kernel/atomic.S
index 5b84f11485aeb8c6794699c54250cf8c54f5a91d..3821ef9b75672d8a5af90839ffa7f95cfdb4da50 100644
--- a/arch/csky/kernel/atomic.S
+++ b/arch/csky/kernel/atomic.S
@@ -17,10 +17,12 @@ ENTRY(csky_cmpxchg)
         mfcr    a3, epc
         addi    a3, TRAP0_SIZE
  
-       subi    sp, 8
+       subi    sp, 16
         stw     a3, (sp, 0)
         mfcr    a3, epsr
         stw     a3, (sp, 4)
+       mfcr    a3, usp
+       stw     a3, (sp, 8)
  
         psrset  ee
  #ifdef CONFIG_CPU_HAS_LDSTEX
@@ -47,7 +49,9 @@ ENTRY(csky_cmpxchg)
         mtcr    a3, epc
         ldw     a3, (sp, 4)
         mtcr    a3, epsr
-       addi    sp, 8
+       ldw     a3, (sp, 8)
+       mtcr    a3, usp
+       addi    sp, 16
         KSPTOUSP
         rte
  END(csky_cmpxchg)
diff --git a/arch/csky/kernel/process.c b/arch/csky/kernel/process.c
index f320d9248a225fe31941129ccd2b4225c47f36d1..f7b231ca269a0dbc2b5fd76dbffcb5429fb01c6d 100644
--- a/arch/csky/kernel/process.c
+++ b/arch/csky/kernel/process.c
@@ -16,6 +16,12 @@
  
  struct cpuinfo_csky cpu_data[NR_CPUS];
  
+#ifdef CONFIG_STACKPROTECTOR
+#include <linux/stackprotector.h>
+unsigned long __stack_chk_guard __read_mostly;
+EXPORT_SYMBOL(__stack_chk_guard);
+#endif
+
  asmlinkage void ret_from_fork(void);
  asmlinkage void ret_from_kernel_thread(void);
  
@@ -34,10 +40,11 @@ unsigned long thread_saved_pc(struct task_struct *tsk)
         return sw->r15;
  }
  
-int copy_thread(unsigned long clone_flags,
+int copy_thread_tls(unsigned long clone_flags,
                 unsigned long usp,
                 unsigned long kthread_arg,
-               struct task_struct *p)
+               struct task_struct *p,
+               unsigned long tls)
  {
         struct switch_stack *childstack;
         struct pt_regs *childregs = task_pt_regs(p);
@@ -64,7 +71,7 @@ int copy_thread(unsigned long clone_flags,
                         childregs->usp = usp;
                 if (clone_flags & CLONE_SETTLS)
                         task_thread_info(p)->tp_value = childregs->tls
-                                                     = childregs->regs[0];
+                                                     = tls;
  
                 childregs->a0 = 0;
                 childstack->r15 = (unsigned long) ret_from_fork;
diff --git a/arch/csky/kernel/setup.c b/arch/csky/kernel/setup.c
index 52eaf31ba27fc95a44f8e4ce377d776110ad1142..3821e55742f46f0070688f1e81161aefb8760f5c 100644
--- a/arch/csky/kernel/setup.c
+++ b/arch/csky/kernel/setup.c
@@ -47,9 +47,6 @@ static void __init csky_memblock_init(void)
         signed long size;
  
         memblock_reserve(__pa(_stext), _end - _stext);
-#ifdef CONFIG_BLK_DEV_INITRD
-       memblock_reserve(__pa(initrd_start), initrd_end - initrd_start);
-#endif
  
         early_init_fdt_reserve_self();
         early_init_fdt_scan_reserved_mem();
@@ -133,6 +130,8 @@ void __init setup_arch(char **cmdline_p)
  
         sparse_init();
  
+       fixaddr_init();
+
  #ifdef CONFIG_HIGHMEM
         kmap_init();
  #endif
diff --git a/arch/csky/kernel/smp.c b/arch/csky/kernel/smp.c
index b753d382e4cef53f45350e1108518096934d826a..0bb0954d557090c359b7e80bd30957ad0826b196 100644
--- a/arch/csky/kernel/smp.c
+++ b/arch/csky/kernel/smp.c
@@ -120,7 +120,7 @@ void __init setup_smp_ipi(void)
         int rc;
  
         if (ipi_irq == 0)
-               panic("%s IRQ mapping failed\n", __func__);
+               return;
  
         rc = request_percpu_irq(ipi_irq, handle_ipi, "IPI Interrupt",
                                 &ipi_dummy_dev);
diff --git a/arch/csky/kernel/time.c b/arch/csky/kernel/time.c
index b5fc9447d93f2307f892ec0afa9fbda3e59a0294..52379d866fe45f2839afdcbca2216224fd455d33 100644
--- a/arch/csky/kernel/time.c
+++ b/arch/csky/kernel/time.c
@@ -1,8 +1,8 @@
  // SPDX-License-Identifier: GPL-2.0
  // Copyright (C) 2018 Hangzhou C-SKY Microsystems co.,ltd.
  
-#include <linux/clk-provider.h>
  #include <linux/clocksource.h>
+#include <linux/of_clk.h>
  
  void __init time_init(void)
  {
diff --git a/arch/csky/kernel/vmlinux.lds.S b/arch/csky/kernel/vmlinux.lds.S
index 2ff37beaf2bf38af39db8c44f58554e832f9909e..f05b413df32849f6000834e2445c1dbb194cdfc7 100644
--- a/arch/csky/kernel/vmlinux.lds.S
+++ b/arch/csky/kernel/vmlinux.lds.S
@@ -2,6 +2,7 @@
  
  #include <asm/vmlinux.lds.h>
  #include <asm/page.h>
+#include <asm/memory.h>
  
  OUTPUT_ARCH(csky)
  ENTRY(_start)
@@ -53,6 +54,54 @@ SECTIONS
         RW_DATA(L1_CACHE_BYTES, PAGE_SIZE, THREAD_SIZE)
         _edata = .;
  
+#ifdef CONFIG_HAVE_TCM
+       .tcm_start : {
+               . = ALIGN(PAGE_SIZE);
+               __tcm_start = .;
+       }
+
+       .text_data_tcm FIXADDR_TCM : AT(__tcm_start)
+       {
+               . = ALIGN(4);
+               __stcm_text_data = .;
+               *(.tcm.text)
+               *(.tcm.rodata)
+#ifndef CONFIG_HAVE_DTCM
+               *(.tcm.data)
+#endif
+               . = ALIGN(4);
+               __etcm_text_data = .;
+       }
+
+       . = ADDR(.tcm_start) + SIZEOF(.tcm_start) + SIZEOF(.text_data_tcm);
+
+#ifdef CONFIG_HAVE_DTCM
+       #define ITCM_SIZE       CONFIG_ITCM_NR_PAGES * PAGE_SIZE
+
+       .dtcm_start : {
+               __dtcm_start = .;
+       }
+
+       .data_tcm FIXADDR_TCM + ITCM_SIZE : AT(__dtcm_start)
+       {
+               . = ALIGN(4);
+               __stcm_data = .;
+               *(.tcm.data)
+               . = ALIGN(4);
+               __etcm_data = .;
+       }
+
+       . = ADDR(.dtcm_start) + SIZEOF(.data_tcm);
+
+       .tcm_end : AT(ADDR(.dtcm_start) + SIZEOF(.data_tcm)) {
+#else
+       .tcm_end : AT(ADDR(.tcm_start) + SIZEOF(.text_data_tcm)) {
+#endif
+               . = ALIGN(PAGE_SIZE);
+               __tcm_end = .;
+       }
+#endif
+
         EXCEPTION_TABLE(L1_CACHE_BYTES)
         BSS_SECTION(L1_CACHE_BYTES, PAGE_SIZE, L1_CACHE_BYTES)
         VBR_BASE
diff --git a/arch/csky/mm/Makefile b/arch/csky/mm/Makefile
index c94ef648109865e5662edeb1a61c9685426340fa..6e7696e55f71131dd1fde113ecdf177bbe01844c 100644
--- a/arch/csky/mm/Makefile
+++ b/arch/csky/mm/Makefile
@@ -1,8 +1,10 @@
  # SPDX-License-Identifier: GPL-2.0-only
  ifeq ($(CONFIG_CPU_HAS_CACHEV2),y)
  obj-y +=                       cachev2.o
+CFLAGS_REMOVE_cachev2.o = $(CC_FLAGS_FTRACE)
  else
  obj-y +=                       cachev1.o
+CFLAGS_REMOVE_cachev1.o = $(CC_FLAGS_FTRACE)
  endif
  
  obj-y +=                       dma-mapping.o
@@ -14,3 +16,4 @@ obj-y +=                      syscache.o
  obj-y +=                       tlb.o
  obj-y +=                       asid.o
  obj-y +=                       context.o
+obj-$(CONFIG_HAVE_TCM) +=      tcm.o
diff --git a/arch/csky/mm/cachev1.c b/arch/csky/mm/cachev1.c
index 494ec912abff072ed0a85ba6d6c2b313a27001f2..5a5a9804a0e3d468baa2ff7444d91265d35f9783 100644
--- a/arch/csky/mm/cachev1.c
+++ b/arch/csky/mm/cachev1.c
@@ -94,6 +94,11 @@ void icache_inv_all(void)
         cache_op_all(INS_CACHE|CACHE_INV, 0);
  }
  
+void local_icache_inv_all(void *priv)
+{
+       cache_op_all(INS_CACHE|CACHE_INV, 0);
+}
+
  void dcache_wb_range(unsigned long start, unsigned long end)
  {
         cache_op_range(start, end, DATA_CACHE|CACHE_CLR, 0);
diff --git a/arch/csky/mm/cachev2.c b/arch/csky/mm/cachev2.c
index b61be6518e214bbed8d8704a6d36b76d1520f479..bc419f8039d3144ddaa14d7e9a1a6352869c9970 100644
--- a/arch/csky/mm/cachev2.c
+++ b/arch/csky/mm/cachev2.c
@@ -3,15 +3,25 @@
  
  #include <linux/spinlock.h>
  #include <linux/smp.h>
+#include <linux/mm.h>
  #include <asm/cache.h>
  #include <asm/barrier.h>
  
-inline void dcache_wb_line(unsigned long start)
+#define INS_CACHE              (1 << 0)
+#define CACHE_INV              (1 << 4)
+
+void local_icache_inv_all(void *priv)
  {
-       asm volatile("dcache.cval1 %0\n"::"r"(start):"memory");
+       mtcr("cr17", INS_CACHE|CACHE_INV);
         sync_is();
  }
  
+void icache_inv_all(void)
+{
+       on_each_cpu(local_icache_inv_all, NULL, 1);
+}
+
+#ifdef CONFIG_CPU_HAS_ICACHE_INS
  void icache_inv_range(unsigned long start, unsigned long end)
  {
         unsigned long i = start & ~(L1_CACHE_BYTES - 1);
@@ -20,43 +30,32 @@ void icache_inv_range(unsigned long start, unsigned long end)
                 asm volatile("icache.iva %0\n"::"r"(i):"memory");
         sync_is();
  }
-
-void icache_inv_all(void)
+#else
+void icache_inv_range(unsigned long start, unsigned long end)
  {
-       asm volatile("icache.ialls\n":::"memory");
-       sync_is();
+       icache_inv_all();
  }
+#endif
  
-void dcache_wb_range(unsigned long start, unsigned long end)
+inline void dcache_wb_line(unsigned long start)
  {
-       unsigned long i = start & ~(L1_CACHE_BYTES - 1);
-
-       for (; i < end; i += L1_CACHE_BYTES)
-               asm volatile("dcache.cval1 %0\n"::"r"(i):"memory");
+       asm volatile("dcache.cval1 %0\n"::"r"(start):"memory");
         sync_is();
  }
  
-void dcache_inv_range(unsigned long start, unsigned long end)
+void dcache_wb_range(unsigned long start, unsigned long end)
  {
         unsigned long i = start & ~(L1_CACHE_BYTES - 1);
  
         for (; i < end; i += L1_CACHE_BYTES)
-               asm volatile("dcache.civa %0\n"::"r"(i):"memory");
+               asm volatile("dcache.cval1 %0\n"::"r"(i):"memory");
         sync_is();
  }
  
  void cache_wbinv_range(unsigned long start, unsigned long end)
  {
-       unsigned long i = start & ~(L1_CACHE_BYTES - 1);
-
-       for (; i < end; i += L1_CACHE_BYTES)
-               asm volatile("dcache.cval1 %0\n"::"r"(i):"memory");
-       sync_is();
-
-       i = start & ~(L1_CACHE_BYTES - 1);
-       for (; i < end; i += L1_CACHE_BYTES)
-               asm volatile("icache.iva %0\n"::"r"(i):"memory");
-       sync_is();
+       dcache_wb_range(start, end);
+       icache_inv_range(start, end);
  }
  EXPORT_SYMBOL(cache_wbinv_range);
  
diff --git a/arch/csky/mm/highmem.c b/arch/csky/mm/highmem.c
index 3317b774f6dc145eae07b562689d975404279bb9..813129145f3da77c87a516f2f33bf48d8592ed18 100644
--- a/arch/csky/mm/highmem.c
+++ b/arch/csky/mm/highmem.c
@@ -117,85 +117,29 @@ struct page *kmap_atomic_to_page(void *ptr)
         return pte_page(*pte);
  }
  
-static void __init fixrange_init(unsigned long start, unsigned long end,
-                               pgd_t *pgd_base)
+static void __init kmap_pages_init(void)
  {
-#ifdef CONFIG_HIGHMEM
-       pgd_t *pgd;
-       pud_t *pud;
-       pmd_t *pmd;
-       pte_t *pte;
-       int i, j, k;
         unsigned long vaddr;
-
-       vaddr = start;
-       i = __pgd_offset(vaddr);
-       j = __pud_offset(vaddr);
-       k = __pmd_offset(vaddr);
-       pgd = pgd_base + i;
-
-       for ( ; (i < PTRS_PER_PGD) && (vaddr != end); pgd++, i++) {
-               pud = (pud_t *)pgd;
-               for ( ; (j < PTRS_PER_PUD) && (vaddr != end); pud++, j++) {
-                       pmd = (pmd_t *)pud;
-                       for (; (k < PTRS_PER_PMD) && (vaddr != end); pmd++, k++) {
-                               if (pmd_none(*pmd)) {
-                                       pte = (pte_t *) memblock_alloc_low(PAGE_SIZE, PAGE_SIZE);
-                                       if (!pte)
-                                               panic("%s: Failed to allocate %lu bytes align=%lx\n",
-                                                     __func__, PAGE_SIZE,
-                                                     PAGE_SIZE);
-
-                                       set_pmd(pmd, __pmd(__pa(pte)));
-                                       BUG_ON(pte != pte_offset_kernel(pmd, 0));
-                               }
-                               vaddr += PMD_SIZE;
-                       }
-                       k = 0;
-               }
-               j = 0;
-       }
-#endif
-}
-
-void __init fixaddr_kmap_pages_init(void)
-{
-       unsigned long vaddr;
-       pgd_t *pgd_base;
-#ifdef CONFIG_HIGHMEM
         pgd_t *pgd;
         pmd_t *pmd;
         pud_t *pud;
         pte_t *pte;
-#endif
-       pgd_base = swapper_pg_dir;
-
-       /*
-        * Fixed mappings:
-        */
-       vaddr = __fix_to_virt(__end_of_fixed_addresses - 1) & PMD_MASK;
-       fixrange_init(vaddr, 0, pgd_base);
-
-#ifdef CONFIG_HIGHMEM
-       /*
-        * Permanent kmaps:
-        */
+
         vaddr = PKMAP_BASE;
-       fixrange_init(vaddr, vaddr + PAGE_SIZE*LAST_PKMAP, pgd_base);
+       fixrange_init(vaddr, vaddr + PAGE_SIZE*LAST_PKMAP, swapper_pg_dir);
  
         pgd = swapper_pg_dir + __pgd_offset(vaddr);
         pud = (pud_t *)pgd;
         pmd = pmd_offset(pud, vaddr);
         pte = pte_offset_kernel(pmd, vaddr);
         pkmap_page_table = pte;
-#endif
  }
  
  void __init kmap_init(void)
  {
         unsigned long vaddr;
  
-       fixaddr_kmap_pages_init();
+       kmap_pages_init();
  
         vaddr = __fix_to_virt(FIX_KMAP_BEGIN);
  
diff --git a/arch/csky/mm/init.c b/arch/csky/mm/init.c
index d4c2292ea46bc6cf0825c6af3b9499338577d329..cb64d8647a78b3eb9529b962e6267361e3c15897 100644
--- a/arch/csky/mm/init.c
+++ b/arch/csky/mm/init.c
@@ -19,6 +19,7 @@
  #include <linux/swap.h>
  #include <linux/proc_fs.h>
  #include <linux/pfn.h>
+#include <linux/initrd.h>
  
  #include <asm/setup.h>
  #include <asm/cachectl.h>
@@ -31,10 +32,50 @@
  
  pgd_t swapper_pg_dir[PTRS_PER_PGD] __page_aligned_bss;
  pte_t invalid_pte_table[PTRS_PER_PTE] __page_aligned_bss;
+EXPORT_SYMBOL(invalid_pte_table);
  unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)]
                                                 __page_aligned_bss;
  EXPORT_SYMBOL(empty_zero_page);
  
+#ifdef CONFIG_BLK_DEV_INITRD
+static void __init setup_initrd(void)
+{
+       unsigned long size;
+
+       if (initrd_start >= initrd_end) {
+               pr_err("initrd not found or empty");
+               goto disable;
+       }
+
+       if (__pa(initrd_end) > PFN_PHYS(max_low_pfn)) {
+               pr_err("initrd extends beyond end of memory");
+               goto disable;
+       }
+
+       size = initrd_end - initrd_start;
+
+       if (memblock_is_region_reserved(__pa(initrd_start), size)) {
+               pr_err("INITRD: 0x%08lx+0x%08lx overlaps in-use memory region",
+                      __pa(initrd_start), size);
+               goto disable;
+       }
+
+       memblock_reserve(__pa(initrd_start), size);
+
+       pr_info("Initial ramdisk at: 0x%p (%lu bytes)\n",
+               (void *)(initrd_start), size);
+
+       initrd_below_start_ok = 1;
+
+       return;
+
+disable:
+       initrd_start = initrd_end = 0;
+
+       pr_err(" - disabling initrd\n");
+}
+#endif
+
  void __init mem_init(void)
  {
  #ifdef CONFIG_HIGHMEM
@@ -46,6 +87,10 @@ void __init mem_init(void)
  #endif
         high_memory = (void *) __va(max_low_pfn << PAGE_SHIFT);
  
+#ifdef CONFIG_BLK_DEV_INITRD
+       setup_initrd();
+#endif
+
         memblock_free_all();
  
  #ifdef CONFIG_HIGHMEM
@@ -101,3 +146,50 @@ void __init pre_mmu_init(void)
         /* Setup page mask to 4k */
         write_mmu_pagemask(0);
  }
+
+void __init fixrange_init(unsigned long start, unsigned long end,
+                       pgd_t *pgd_base)
+{
+       pgd_t *pgd;
+       pud_t *pud;
+       pmd_t *pmd;
+       pte_t *pte;
+       int i, j, k;
+       unsigned long vaddr;
+
+       vaddr = start;
+       i = __pgd_offset(vaddr);
+       j = __pud_offset(vaddr);
+       k = __pmd_offset(vaddr);
+       pgd = pgd_base + i;
+
+       for ( ; (i < PTRS_PER_PGD) && (vaddr != end); pgd++, i++) {
+               pud = (pud_t *)pgd;
+               for ( ; (j < PTRS_PER_PUD) && (vaddr != end); pud++, j++) {
+                       pmd = (pmd_t *)pud;
+                       for (; (k < PTRS_PER_PMD) && (vaddr != end); pmd++, k++) {
+                               if (pmd_none(*pmd)) {
+                                       pte = (pte_t *) memblock_alloc_low(PAGE_SIZE, PAGE_SIZE);
+                                       if (!pte)
+                                               panic("%s: Failed to allocate %lu bytes align=%lx\n",
+                                                     __func__, PAGE_SIZE,
+                                                     PAGE_SIZE);
+
+                                       set_pmd(pmd, __pmd(__pa(pte)));
+                                       BUG_ON(pte != pte_offset_kernel(pmd, 0));
+                               }
+                               vaddr += PMD_SIZE;
+                       }
+                       k = 0;
+               }
+               j = 0;
+       }
+}
+
+void __init fixaddr_init(void)
+{
+       unsigned long vaddr;
+
+       vaddr = __fix_to_virt(__end_of_fixed_addresses - 1) & PMD_MASK;
+       fixrange_init(vaddr, vaddr + PMD_SIZE, swapper_pg_dir);
+}
diff --git a/arch/csky/mm/syscache.c b/arch/csky/mm/syscache.c
index c4645e4e97f4b70d4cc23be29f20cd3aaaff5db0..ffade2f9a4c87914af5c8322fdfc1b9600a204c1 100644
--- a/arch/csky/mm/syscache.c
+++ b/arch/csky/mm/syscache.c
@@ -3,7 +3,7 @@
  
  #include <linux/syscalls.h>
  #include <asm/page.h>
-#include <asm/cache.h>
+#include <asm/cacheflush.h>
  #include <asm/cachectl.h>
  
  SYSCALL_DEFINE3(cacheflush,
@@ -13,17 +13,14 @@ SYSCALL_DEFINE3(cacheflush,
  {
         switch (cache) {
         case ICACHE:
-               icache_inv_range((unsigned long)addr,
-                                (unsigned long)addr + bytes);
-               break;
+       case BCACHE:
+               flush_icache_mm_range(current->mm,
+                               (unsigned long)addr,
+                               (unsigned long)addr + bytes);
         case DCACHE:
                 dcache_wb_range((unsigned long)addr,
                                 (unsigned long)addr + bytes);
                 break;
-       case BCACHE:
-               cache_wbinv_range((unsigned long)addr,
-                                 (unsigned long)addr + bytes);
-               break;
         default:
                 return -EINVAL;
         }
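As the syscache.c hunk reads, there is no `break` after the flush_icache_mm_range() call, so ICACHE and BCACHE requests continue into the DCACHE case and also write back the data cache. The diff does not say whether this is deliberate. The user-space sketch below only models that control flow. The selector values and the `sketch_` and `DID_` names are hypothetical stand-ins, since the real constants live in asm/cachectl.h and are not shown here.

```c
#include <assert.h>

/* Hypothetical stand-ins for the ICACHE/DCACHE/BCACHE selectors. */
enum sketch_cache {
    SKETCH_ICACHE = 1,
    SKETCH_DCACHE = 2,
    SKETCH_BCACHE = 3
};

#define DID_ICACHE_FLUSH 0x1
#define DID_DCACHE_WB    0x2

/* Returns a bitmask of the operations performed, or -1 for a bad selector. */
static int sketch_cacheflush(int cache)
{
    int done = 0;

    assert(DID_ICACHE_FLUSH != DID_DCACHE_WB);

    switch (cache) {
    case SKETCH_ICACHE:
    case SKETCH_BCACHE:
        done |= DID_ICACHE_FLUSH;   /* stands for flush_icache_mm_range() */
        /* fall through */
    case SKETCH_DCACHE:
        done |= DID_DCACHE_WB;      /* stands for dcache_wb_range() */
        break;
    default:
        return -1;                  /* the syscall returns -EINVAL here */
    }

    return done;
}
```

In this model a DCACHE request performs only the write-back, while ICACHE and BCACHE requests perform both operations.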
diff --git a/arch/csky/mm/tcm.c b/arch/csky/mm/tcm.c
new file mode 100644
index 0000000..ddeb363
--- /dev/null
+++ b/arch/csky/mm/tcm.c
@@ -0,0 +1,169 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/highmem.h>
+#include <linux/genalloc.h>
+#include <asm/tlbflush.h>
+#include <asm/fixmap.h>
+
+#if (CONFIG_ITCM_RAM_BASE == 0xffffffff)
+#error "You should define ITCM_RAM_BASE"
+#endif
+
+#ifdef CONFIG_HAVE_DTCM
+#if (CONFIG_DTCM_RAM_BASE == 0xffffffff)
+#error "You should define DTCM_RAM_BASE"
+#endif
+
+#if (CONFIG_DTCM_RAM_BASE == CONFIG_ITCM_RAM_BASE)
+#error "You should define correct DTCM_RAM_BASE"
+#endif
+#endif
+
+extern char __tcm_start, __tcm_end, __dtcm_start;
+
+static struct gen_pool *tcm_pool;
+
+static void __init tcm_mapping_init(void)
+{
+       pte_t *tcm_pte;
+       unsigned long vaddr, paddr;
+       int i;
+
+       paddr = CONFIG_ITCM_RAM_BASE;
+
+       if (pfn_valid(PFN_DOWN(CONFIG_ITCM_RAM_BASE)))
+               goto panic;
+
+#ifndef CONFIG_HAVE_DTCM
+       for (i = 0; i < TCM_NR_PAGES; i++) {
+#else
+       for (i = 0; i < CONFIG_ITCM_NR_PAGES; i++) {
+#endif
+               vaddr = __fix_to_virt(FIX_TCM - i);
+
+               tcm_pte =
+                       pte_offset_kernel((pmd_t *)pgd_offset_k(vaddr), vaddr);
+
+               set_pte(tcm_pte, pfn_pte(__phys_to_pfn(paddr), PAGE_KERNEL));
+
+               flush_tlb_one(vaddr);
+
+               paddr = paddr + PAGE_SIZE;
+       }
+
+#ifdef CONFIG_HAVE_DTCM
+       if (pfn_valid(PFN_DOWN(CONFIG_DTCM_RAM_BASE)))
+               goto panic;
+
+       paddr = CONFIG_DTCM_RAM_BASE;
+
+       for (i = 0; i < CONFIG_DTCM_NR_PAGES; i++) {
+               vaddr = __fix_to_virt(FIX_TCM - CONFIG_ITCM_NR_PAGES - i);
+
+               tcm_pte =
+                       pte_offset_kernel((pmd_t *) pgd_offset_k(vaddr), vaddr);
+
+               set_pte(tcm_pte, pfn_pte(__phys_to_pfn(paddr), PAGE_KERNEL));
+
+               flush_tlb_one(vaddr);
+
+               paddr = paddr + PAGE_SIZE;
+       }
+#endif
+
+#ifndef CONFIG_HAVE_DTCM
+       memcpy((void *)__fix_to_virt(FIX_TCM),
+                               &__tcm_start, &__tcm_end - &__tcm_start);
+
+       pr_info("%s: mapping tcm va:0x%08lx to pa:0x%08x\n",
+                       __func__, __fix_to_virt(FIX_TCM), CONFIG_ITCM_RAM_BASE);
+
+       pr_info("%s: __tcm_start va:0x%08lx size:%d\n",
+                       __func__, (unsigned long)&__tcm_start, &__tcm_end - &__tcm_start);
+#else
+       memcpy((void *)__fix_to_virt(FIX_TCM),
+                               &__tcm_start, &__dtcm_start - &__tcm_start);
+
+       pr_info("%s: mapping itcm va:0x%08lx to pa:0x%08x\n",
+                       __func__, __fix_to_virt(FIX_TCM), CONFIG_ITCM_RAM_BASE);
+
+       pr_info("%s: __itcm_start va:0x%08lx size:%d\n",
+                       __func__, (unsigned long)&__tcm_start, &__dtcm_start - &__tcm_start);
+
+       memcpy((void *)__fix_to_virt(FIX_TCM - CONFIG_ITCM_NR_PAGES),
+                               &__dtcm_start, &__tcm_end - &__dtcm_start);
+
+       pr_info("%s: mapping dtcm va:0x%08lx to pa:0x%08x\n",
+                       __func__, __fix_to_virt(FIX_TCM - CONFIG_ITCM_NR_PAGES),
+                                               CONFIG_DTCM_RAM_BASE);
+
+       pr_info("%s: __dtcm_start va:0x%08lx size:%d\n",
+                       __func__, (unsigned long)&__dtcm_start, &__tcm_end - &__dtcm_start);
+
+#endif
+       return;
+panic:
+       panic("TCM init error");
+}
+
+void *tcm_alloc(size_t len)
+{
+       unsigned long vaddr;
+
+       if (!tcm_pool)
+               return NULL;
+
+       vaddr = gen_pool_alloc(tcm_pool, len);
+       if (!vaddr)
+               return NULL;
+
+       return (void *) vaddr;
+}
+EXPORT_SYMBOL(tcm_alloc);
+
+void tcm_free(void *addr, size_t len)
+{
+       gen_pool_free(tcm_pool, (unsigned long) addr, len);
+}
+EXPORT_SYMBOL(tcm_free);
+
+static int __init tcm_setup_pool(void)
+{
+#ifndef CONFIG_HAVE_DTCM
+       u32 pool_size = (u32) (TCM_NR_PAGES * PAGE_SIZE)
+                               - (u32) (&__tcm_end - &__tcm_start);
+
+       u32 tcm_pool_start = __fix_to_virt(FIX_TCM)
+                               + (u32) (&__tcm_end - &__tcm_start);
+#else
+       u32 pool_size = (u32) (CONFIG_DTCM_NR_PAGES * PAGE_SIZE)
+                               - (u32) (&__tcm_end - &__dtcm_start);
+
+       u32 tcm_pool_start = __fix_to_virt(FIX_TCM - CONFIG_ITCM_NR_PAGES)
+                               + (u32) (&__tcm_end - &__dtcm_start);
+#endif
+       int ret;
+
+       tcm_pool = gen_pool_create(2, -1);
+
+       ret = gen_pool_add(tcm_pool, tcm_pool_start, pool_size, -1);
+       if (ret) {
+               pr_err("%s: gen_pool add failed!\n", __func__);
+               return ret;
+       }
+
+       pr_info("%s: Added %d bytes @ 0x%08x to memory pool\n",
+               __func__, pool_size, tcm_pool_start);
+
+       return 0;
+}
+
+static int __init tcm_init(void)
+{
+       tcm_mapping_init();
+
+       tcm_setup_pool();
+
+       return 0;
+}
+arch_initcall(tcm_init);
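The mapping loops in tcm_mapping_init() walk fixmap indices downward from FIX_TCM, which yields ascending virtual addresses starting at FIXADDR_TCM. The user-space sketch below reproduces that arithmetic. It assumes 4 KiB pages and the asm-generic definition `__fix_to_virt(x) = FIXADDR_TOP - (x << PAGE_SHIFT)`. Page counts and the image size are hypothetical parameters, and every `sketch_` name is illustrative only.

```c
#include <assert.h>

#define FIXADDR_TOP       0xffffc000UL  /* from asm/memory.h */
#define SKETCH_PAGE_SHIFT 12            /* assumption: 4 KiB pages */
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)

/* Same formula as __fix_to_virt() in asm-generic/fixmap.h. */
static unsigned long sketch_fix_to_virt(unsigned long idx)
{
    return FIXADDR_TOP - (idx << SKETCH_PAGE_SHIFT);
}

/*
 * Virtual address of the i-th ITCM page, mirroring
 * __fix_to_virt(FIX_TCM - i) with FIX_TCM == TCM_NR_PAGES.
 */
static unsigned long sketch_itcm_vaddr(unsigned long tcm_nr_pages,
                                       unsigned long i)
{
    assert(i < tcm_nr_pages);
    return sketch_fix_to_virt(tcm_nr_pages - i);
}

/*
 * Virtual address of the i-th DTCM page, mirroring
 * __fix_to_virt(FIX_TCM - CONFIG_ITCM_NR_PAGES - i).
 */
static unsigned long sketch_dtcm_vaddr(unsigned long itcm_pages,
                                       unsigned long dtcm_pages,
                                       unsigned long i)
{
    unsigned long fix_tcm = itcm_pages + dtcm_pages;

    assert(i < dtcm_pages);
    return sketch_fix_to_virt(fix_tcm - itcm_pages - i);
}

/*
 * Bytes left for the gen_pool after the linked TCM image, mirroring the
 * pool_size computation in tcm_setup_pool() for the no-DTCM case.
 */
static unsigned long sketch_pool_size(unsigned long tcm_nr_pages,
                                      unsigned long image_bytes)
{
    assert(image_bytes <= tcm_nr_pages * SKETCH_PAGE_SIZE);
    return tcm_nr_pages * SKETCH_PAGE_SIZE - image_bytes;
}
```

With two ITCM pages and two DTCM pages, the first DTCM page lands at FIXADDR_TCM plus the ITCM size, which matches the `.data_tcm FIXADDR_TCM + ITCM_SIZE` placement in vmlinux.lds.S.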
diff --git a/arch/mips/boot/dts/ingenic/jz4740.dtsi b/arch/mips/boot/dts/ingenic/jz4740.dtsi
index 5accda2767bea4d641116206dc470c8b34775a9b..a3301bab9231ae4a2f65859c5713f1b746b7239d 100644
--- a/arch/mips/boot/dts/ingenic/jz4740.dtsi
+++ b/arch/mips/boot/dts/ingenic/jz4740.dtsi
@@ -1,5 +1,6 @@
  // SPDX-License-Identifier: GPL-2.0
  #include <dt-bindings/clock/jz4740-cgu.h>
+#include <dt-bindings/clock/ingenic,tcu.h>
  
  / {
         #address-cells = <1>;
@@ -45,14 +46,6 @@
                 #clock-cells = <1>;
         };
  
-       watchdog: watchdog@10002000 {
-               compatible = "ingenic,jz4740-watchdog";
-               reg = <0x10002000 0x10>;
-
-               clocks = <&cgu JZ4740_CLK_RTC>;
-               clock-names = "rtc";
-       };
-
         tcu: timer@10002000 {
                 compatible = "ingenic,jz4740-tcu", "simple-mfd";
                 reg = <0x10002000 0x1000>;
@@ -73,6 +66,14 @@
  
                 interrupt-parent = <&intc>;
                 interrupts = <23 22 21>;
+
+               watchdog: watchdog@0 {
+                       compatible = "ingenic,jz4740-watchdog";
+                       reg = <0x0 0xc>;
+
+                       clocks = <&tcu TCU_CLK_WDT>;
+                       clock-names = "wdt";
+               };
         };
  
         rtc_dev: rtc@10003000 {
diff --git a/arch/mips/boot/dts/ingenic/jz4780.dtsi b/arch/mips/boot/dts/ingenic/jz4780.dtsi
index f928329b034b374de13a9d1e0570824a2e7d92c4..bb89653d16a321ab4bb9d136b96bff8eb8d0b2d0 100644
--- a/arch/mips/boot/dts/ingenic/jz4780.dtsi
+++ b/arch/mips/boot/dts/ingenic/jz4780.dtsi
@@ -1,5 +1,6 @@
  // SPDX-License-Identifier: GPL-2.0
  #include <dt-bindings/clock/jz4780-cgu.h>
+#include <dt-bindings/clock/ingenic,tcu.h>
  #include <dt-bindings/dma/jz4780-dma.h>
  
  / {
@@ -67,6 +68,14 @@
  
                 interrupt-parent = <&intc>;
                 interrupts = <27 26 25>;
+
+               watchdog: watchdog@0 {
+                       compatible = "ingenic,jz4780-watchdog";
+                       reg = <0x0 0xc>;
+
+                       clocks = <&tcu TCU_CLK_WDT>;
+                       clock-names = "wdt";
+               };
         };
  
         rtc_dev: rtc@10003000 {
@@ -348,14 +357,6 @@
                 status = "disabled";
         };
  
-       watchdog: watchdog@10002000 {
-               compatible = "ingenic,jz4780-watchdog";
-               reg = <0x10002000 0x10>;
-
-               clocks = <&cgu JZ4780_CLK_RTCLK>;
-               clock-names = "rtc";
-       };
-
         nemc: nemc@13410000 {
                 compatible = "ingenic,jz4780-nemc";
                 reg = <0x13410000 0x10000>;
diff --git a/arch/mips/boot/dts/ingenic/x1000.dtsi b/arch/mips/boot/dts/ingenic/x1000.dtsi

index 4994c695a1a73ba0bf0b4ca3dd5d3a4591ea75c7..147f7d5c243a2fe5c6cfc3a639f5e4c3299fa7ea 100644
--- a/arch/mips/boot/dts/ingenic/x1000.dtsi
+++ b/arch/mips/boot/dts/ingenic/x1000.dtsi
@@ -1,4 +1,5 @@
  // SPDX-License-Identifier: GPL-2.0
+#include <dt-bindings/clock/ingenic,tcu.h>
  #include <dt-bindings/clock/x1000-cgu.h>
  #include <dt-bindings/dma/x1000-dma.h>
  
@@ -72,7 +73,7 @@
                         compatible = "ingenic,x1000-watchdog", "ingenic,jz4780-watchdog";
                         reg = <0x0 0x10>;
  
-                       clocks = <&cgu X1000_CLK_RTCLK>;
+                       clocks = <&tcu TCU_CLK_WDT>;
                         clock-names = "wdt";
                 };
         };
@@ -158,7 +159,6 @@
         i2c0: i2c-controller@10050000 {
                 compatible = "ingenic,x1000-i2c";
                 reg = <0x10050000 0x1000>;
-
                 #address-cells = <1>;
                 #size-cells = <0>;
  
@@ -173,7 +173,6 @@
         i2c1: i2c-controller@10051000 {
                 compatible = "ingenic,x1000-i2c";
                 reg = <0x10051000 0x1000>;
-
                 #address-cells = <1>;
                 #size-cells = <0>;
  
@@ -188,7 +187,6 @@
         i2c2: i2c-controller@10052000 {
                 compatible = "ingenic,x1000-i2c";
                 reg = <0x10052000 0x1000>;
-
                 #address-cells = <1>;
                 #size-cells = <0>;
  
diff --git a/arch/mips/include/asm/sync.h b/arch/mips/include/asm/sync.h

index 7c6a1095f556268700909ba31b5b24c1b6d9dc6a..aabd097933fe97f599361dd002d6e9f2fcdd999b 100644
--- a/arch/mips/include/asm/sync.h
+++ b/arch/mips/include/asm/sync.h
@@ -155,9 +155,11 @@
   * effective barrier as noted by commit 6b07d38aaa52 ("MIPS: Octeon: Use
   * optimized memory barrier primitives."). Here we specify that the affected
   * sync instructions should be emitted twice.
+ * Note that this expression is evaluated by the assembler (not the compiler),
+ * and that the assembler evaluates '==' as 0 or -1, not 0 or 1.
   */
  #ifdef CONFIG_CPU_CAVIUM_OCTEON
-# define __SYNC_rpt(type)      (1 + (type == __SYNC_wmb))
+# define __SYNC_rpt(type)      (1 - (type == __SYNC_wmb))
  #else
  # define __SYNC_rpt(type)      1
  #endif
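
Reviewer note on the `__SYNC_rpt` hunk above: the sign flip follows from the comment added in the same hunk, which says the assembler evaluates `==` as 0 or -1. The sketch below models that arithmetic in plain C. `gas_eq`, `MODEL_SYNC_FULL`, `MODEL_SYNC_WMB` and both `sync_rpt_*` helpers are hypothetical names invented for illustration, and the enum values are placeholders rather than the real `__SYNC_*` constants.

```c
/*
 * Hypothetical model of the assembler's '==' operator as described in the
 * patch comment: a true comparison yields -1, a false one yields 0.
 */
int gas_eq(int a, int b)
{
	return (a == b) ? -1 : 0;
}

/* Placeholder type codes; the real values are defined in asm/sync.h. */
enum { MODEL_SYNC_FULL = 0, MODEL_SYNC_WMB = 1 };

/* Old expression: 1 + (type == wmb). With -1 for "true" this gives 0. */
int sync_rpt_old(int type)
{
	return 1 + gas_eq(type, MODEL_SYNC_WMB);
}

/* New expression: 1 - (type == wmb). With -1 for "true" this gives 2. */
int sync_rpt_new(int type)
{
	return 1 - gas_eq(type, MODEL_SYNC_WMB);
}
```

Under this model the old form yields a repeat count of zero for the wmb type, whereas the new form yields two, matching the stated intent that the affected sync instructions are emitted twice. Other types yield one in both forms.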
diff --git a/arch/mips/kernel/vpe.c b/arch/mips/kernel/vpe.c

index 6176b9acba950e7189c48752498a9bb79f3648ad..d0d832ab3d3b86192b57c30f2c8c9e90820965ea 100644
--- a/arch/mips/kernel/vpe.c
+++ b/arch/mips/kernel/vpe.c
@@ -134,7 +134,7 @@ void release_vpe(struct vpe *v)
  {
         list_del(&v->list);
         if (v->load_addr)
-               release_progmem(v);
+               release_progmem(v->load_addr);
         kfree(v);
  }
  
diff --git a/arch/mips/vdso/Makefile b/arch/mips/vdso/Makefile

index aa89a41dc5dddef97aa080366aa47eb743dfe0eb..d7fe8408603e85583e013337e5df1c87d9ada7cb 100644
--- a/arch/mips/vdso/Makefile
+++ b/arch/mips/vdso/Makefile
@@ -33,6 +33,7 @@ endif
  cflags-vdso := $(ccflags-vdso) \
         $(filter -W%,$(filter-out -Wa$(comma)%,$(KBUILD_CFLAGS))) \
         -O3 -g -fPIC -fno-strict-aliasing -fno-common -fno-builtin -G 0 \
+       -mrelax-pic-calls $(call cc-option, -mexplicit-relocs) \
         -fno-stack-protector -fno-jump-tables -DDISABLE_BRANCH_PROFILING \
         $(call cc-option, -fno-asynchronous-unwind-tables) \
         $(call cc-option, -fno-stack-protector)
@@ -51,6 +52,8 @@ endif
  
  CFLAGS_REMOVE_vgettimeofday.o = -pg
  
+DISABLE_VDSO := n
+
  #
  # For the pre-R6 code in arch/mips/vdso/vdso.h for locating
  # the base address of VDSO, the linker will emit a R_MIPS_PC32
@@ -64,11 +67,24 @@ CFLAGS_REMOVE_vgettimeofday.o = -pg
  ifndef CONFIG_CPU_MIPSR6
    ifeq ($(call ld-ifversion, -lt, 225000000, y),y)
      $(warning MIPS VDSO requires binutils >= 2.25)
-    obj-vdso-y := $(filter-out vgettimeofday.o, $(obj-vdso-y))
-    ccflags-vdso += -DDISABLE_MIPS_VDSO
+    DISABLE_VDSO := y
    endif
  endif
  
+#
+# GCC (at least up to version 9.2) appears to emit function calls that make use
+# of the GOT when targeting microMIPS, which we can't use in the VDSO due to
+# the lack of relocations. As such, we disable the VDSO for microMIPS builds.
+#
+ifdef CONFIG_CPU_MICROMIPS
+  DISABLE_VDSO := y
+endif
+
+ifeq ($(DISABLE_VDSO),y)
+  obj-vdso-y := $(filter-out vgettimeofday.o, $(obj-vdso-y))
+  ccflags-vdso += -DDISABLE_MIPS_VDSO
+endif
+
  # VDSO linker flags.
  VDSO_LDFLAGS := \
         -Wl,-Bsymbolic -Wl,--no-undefined -Wl,-soname=linux-vdso.so.1 \
@@ -81,12 +97,18 @@ GCOV_PROFILE := n
  UBSAN_SANITIZE := n
  KCOV_INSTRUMENT := n
  
+# Check that we don't have PIC 'jalr t9' calls left
+quiet_cmd_vdso_mips_check = VDSOCHK $@
+      cmd_vdso_mips_check = if $(OBJDUMP) --disassemble $@ | egrep -h "jalr.*t9" > /dev/null; \
+                      then (echo >&2 "$@: PIC 'jalr t9' calls are not supported"; \
+                            rm -f $@; /bin/false); fi
+
  #
  # Shared build commands.
  #
  
  quiet_cmd_vdsold_and_vdso_check = LD      $@
-      cmd_vdsold_and_vdso_check = $(cmd_vdsold); $(cmd_vdso_check)
+      cmd_vdsold_and_vdso_check = $(cmd_vdsold); $(cmd_vdso_check); $(cmd_vdso_mips_check)
  
  quiet_cmd_vdsold = VDSO    $@
        cmd_vdsold = $(CC) $(c_flags) $(VDSO_LDFLAGS) \
diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h

index 86332080399a57d461c3a0efbd1ac2b3debeb9ab..080a0bf8e54bb9cd6267e38adacb2dc15150b401 100644
--- a/arch/powerpc/include/asm/page.h
+++ b/arch/powerpc/include/asm/page.h
@@ -295,8 +295,13 @@ static inline bool pfn_valid(unsigned long pfn)
  /*
   * Some number of bits at the level of the page table that points to
   * a hugepte are used to encode the size.  This masks those bits.
+ * On 8xx, HW assistance requires 4k alignment for the hugepte.
   */
+#ifdef CONFIG_PPC_8xx
+#define HUGEPD_SHIFT_MASK     0xfff
+#else
  #define HUGEPD_SHIFT_MASK     0x3f
+#endif
  
  #ifndef __ASSEMBLY__
  
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h

index 8387698bd5b629692fc676143e39202ed7b47f66..eedcbfb9a6ff38d874b027c2d639519b83ebba24 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -168,6 +168,10 @@ struct thread_struct {
         unsigned long   srr1;
         unsigned long   dar;
         unsigned long   dsisr;
+#ifdef CONFIG_PPC_BOOK3S_32
+       unsigned long   r0, r3, r4, r5, r6, r8, r9, r11;
+       unsigned long   lr, ctr;
+#endif
  #endif
         /* Debug Registers */
         struct debug_reg debug;
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c

index c25e562f1cd9d3f1d4fdf93b70bc2f698c2c4427..fcf24a365fc014b0490c957a16c8a9bfd55fc2fd 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -132,6 +132,18 @@ int main(void)
         OFFSET(SRR1, thread_struct, srr1);
         OFFSET(DAR, thread_struct, dar);
         OFFSET(DSISR, thread_struct, dsisr);
+#ifdef CONFIG_PPC_BOOK3S_32
+       OFFSET(THR0, thread_struct, r0);
+       OFFSET(THR3, thread_struct, r3);
+       OFFSET(THR4, thread_struct, r4);
+       OFFSET(THR5, thread_struct, r5);
+       OFFSET(THR6, thread_struct, r6);
+       OFFSET(THR8, thread_struct, r8);
+       OFFSET(THR9, thread_struct, r9);
+       OFFSET(THR11, thread_struct, r11);
+       OFFSET(THLR, thread_struct, lr);
+       OFFSET(THCTR, thread_struct, ctr);
+#endif
  #endif
  #ifdef CONFIG_SPE
         OFFSET(THREAD_EVR0, thread_struct, evr[0]);
diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c

index a1eaffe868de4d76cbb8c401ddb175671e336ef4..7b048cee767c746fefe8a75cb991fb94fe757164 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -1184,6 +1184,17 @@ void eeh_handle_special_event(void)
                         eeh_pe_state_mark(pe, EEH_PE_RECOVERING);
                         eeh_handle_normal_event(pe);
                 } else {
+                       eeh_for_each_pe(pe, tmp_pe)
+                               eeh_pe_for_each_dev(tmp_pe, edev, tmp_edev)
+                                       edev->mode &= ~EEH_DEV_NO_HANDLER;
+
+                       /* Notify all devices to be down */
+                       eeh_pe_state_clear(pe, EEH_PE_PRI_BUS, true);
+                       eeh_set_channel_state(pe, pci_channel_io_perm_failure);
+                       eeh_pe_report(
+                               "error_detected(permanent failure)", pe,
+                               eeh_report_failure, NULL);
+
                         pci_lock_rescan_remove();
                         list_for_each_entry(hose, &hose_list, list_node) {
                                 phb_pe = eeh_phb_pe_get(hose);
@@ -1192,16 +1203,6 @@ void eeh_handle_special_event(void)
                                     (phb_pe->state & EEH_PE_RECOVERING))
                                         continue;
  
-                               eeh_for_each_pe(pe, tmp_pe)
-                                       eeh_pe_for_each_dev(tmp_pe, edev, tmp_edev)
-                                               edev->mode &= ~EEH_DEV_NO_HANDLER;
-
-                               /* Notify all devices to be down */
-                               eeh_pe_state_clear(pe, EEH_PE_PRI_BUS, true);
-                               eeh_set_channel_state(pe, pci_channel_io_perm_failure);
-                               eeh_pe_report(
-                                       "error_detected(permanent failure)", pe,
-                                       eeh_report_failure, NULL);
                                 bus = eeh_pe_bus_get(phb_pe);
                                 if (!bus) {
                                         pr_err("%s: Cannot find PCI bus for "
diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S

index 0713daa651d9e469b2acedfea00d15f12ce6be70..16af0d8d90a8641ae2d43b29cef9d691a08f06e2 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -783,7 +783,7 @@ fast_exception_return:
  1:     lis     r3,exc_exit_restart_end@ha
         addi    r3,r3,exc_exit_restart_end@l
         cmplw   r12,r3
-#if CONFIG_PPC_BOOK3S_601
+#ifdef CONFIG_PPC_BOOK3S_601
         bge     2b
  #else
         bge     3f
@@ -791,7 +791,7 @@ fast_exception_return:
         lis     r4,exc_exit_restart@ha
         addi    r4,r4,exc_exit_restart@l
         cmplw   r12,r4
-#if CONFIG_PPC_BOOK3S_601
+#ifdef CONFIG_PPC_BOOK3S_601
         blt     2b
  #else
         blt     3f
@@ -1354,12 +1354,17 @@ _GLOBAL(enter_rtas)
         mtspr   SPRN_SRR0,r8
         mtspr   SPRN_SRR1,r9
         RFI
-1:     tophys(r9,r1)
+1:     tophys_novmstack r9, r1
+#ifdef CONFIG_VMAP_STACK
+       li      r0, MSR_KERNEL & ~MSR_IR        /* can take DTLB miss */
+       mtmsr   r0
+       isync
+#endif
         lwz     r8,INT_FRAME_SIZE+4(r9) /* get return address */
         lwz     r9,8(r9)        /* original msr value */
         addi    r1,r1,INT_FRAME_SIZE
         li      r0,0
-       tophys(r7, r2)
+       tophys_novmstack r7, r2
         stw     r0, THREAD + RTAS_SP(r7)
         mtspr   SPRN_SRR0,r8
         mtspr   SPRN_SRR1,r9
diff --git a/arch/powerpc/kernel/head_32.S b/arch/powerpc/kernel/head_32.S

index 0493fcac6409508394932ea267e2f28c2f690f32..97c887950c3ca19d3156b8cc1e7aca6ceaef79f7 100644
--- a/arch/powerpc/kernel/head_32.S
+++ b/arch/powerpc/kernel/head_32.S
@@ -290,17 +290,55 @@ MachineCheck:
  7:     EXCEPTION_PROLOG_2
         addi    r3,r1,STACK_FRAME_OVERHEAD
  #ifdef CONFIG_PPC_CHRP
-       bne     cr1,1f
+#ifdef CONFIG_VMAP_STACK
+       mfspr   r4, SPRN_SPRG_THREAD
+       tovirt(r4, r4)
+       lwz     r4, RTAS_SP(r4)
+       cmpwi   cr1, r4, 0
  #endif
-       EXC_XFER_STD(0x200, machine_check_exception)
-#ifdef CONFIG_PPC_CHRP
-1:     b       machine_check_in_rtas
+       beq     cr1, machine_check_tramp
+       b       machine_check_in_rtas
+#else
+       b       machine_check_tramp
  #endif
  
  /* Data access exception. */
         . = 0x300
         DO_KVM  0x300
  DataAccess:
+#ifdef CONFIG_VMAP_STACK
+       mtspr   SPRN_SPRG_SCRATCH0,r10
+       mfspr   r10, SPRN_SPRG_THREAD
+BEGIN_MMU_FTR_SECTION
+       stw     r11, THR11(r10)
+       mfspr   r10, SPRN_DSISR
+       mfcr    r11
+#ifdef CONFIG_PPC_KUAP
+       andis.  r10, r10, (DSISR_BAD_FAULT_32S | DSISR_DABRMATCH | DSISR_PROTFAULT)@h
+#else
+       andis.  r10, r10, (DSISR_BAD_FAULT_32S | DSISR_DABRMATCH)@h
+#endif
+       mfspr   r10, SPRN_SPRG_THREAD
+       beq     hash_page_dsi
+.Lhash_page_dsi_cont:
+       mtcr    r11
+       lwz     r11, THR11(r10)
+END_MMU_FTR_SECTION_IFSET(MMU_FTR_HPTE_TABLE)
+       mtspr   SPRN_SPRG_SCRATCH1,r11
+       mfspr   r11, SPRN_DAR
+       stw     r11, DAR(r10)
+       mfspr   r11, SPRN_DSISR
+       stw     r11, DSISR(r10)
+       mfspr   r11, SPRN_SRR0
+       stw     r11, SRR0(r10)
+       mfspr   r11, SPRN_SRR1          /* check whether user or kernel */
+       stw     r11, SRR1(r10)
+       mfcr    r10
+       andi.   r11, r11, MSR_PR
+
+       EXCEPTION_PROLOG_1
+       b       handle_page_fault_tramp_1
+#else  /* CONFIG_VMAP_STACK */
         EXCEPTION_PROLOG handle_dar_dsisr=1
         get_and_save_dar_dsisr_on_stack r4, r5, r11
  BEGIN_MMU_FTR_SECTION
@@ -316,11 +354,32 @@ BEGIN_MMU_FTR_SECTION
  FTR_SECTION_ELSE
         b       handle_page_fault_tramp_2
  ALT_MMU_FTR_SECTION_END_IFSET(MMU_FTR_HPTE_TABLE)
+#endif /* CONFIG_VMAP_STACK */
  
  /* Instruction access exception. */
         . = 0x400
         DO_KVM  0x400
  InstructionAccess:
+#ifdef CONFIG_VMAP_STACK
+       mtspr   SPRN_SPRG_SCRATCH0,r10
+       mtspr   SPRN_SPRG_SCRATCH1,r11
+       mfspr   r10, SPRN_SPRG_THREAD
+       mfspr   r11, SPRN_SRR0
+       stw     r11, SRR0(r10)
+       mfspr   r11, SPRN_SRR1          /* check whether user or kernel */
+       stw     r11, SRR1(r10)
+       mfcr    r10
+BEGIN_MMU_FTR_SECTION
+       andis.  r11, r11, SRR1_ISI_NOPT@h       /* no pte found? */
+       bne     hash_page_isi
+.Lhash_page_isi_cont:
+       mfspr   r11, SPRN_SRR1          /* check whether user or kernel */
+END_MMU_FTR_SECTION_IFSET(MMU_FTR_HPTE_TABLE)
+       andi.   r11, r11, MSR_PR
+
+       EXCEPTION_PROLOG_1
+       EXCEPTION_PROLOG_2
+#else  /* CONFIG_VMAP_STACK */
         EXCEPTION_PROLOG
         andis.  r0,r9,SRR1_ISI_NOPT@h   /* no pte found? */
         beq     1f                      /* if so, try to put a PTE */
@@ -329,6 +388,7 @@ InstructionAccess:
  BEGIN_MMU_FTR_SECTION
         bl      hash_page
  END_MMU_FTR_SECTION_IFSET(MMU_FTR_HPTE_TABLE)
+#endif /* CONFIG_VMAP_STACK */
  1:     mr      r4,r12
         andis.  r5,r9,DSISR_SRR1_MATCH_32S@h /* Filter relevant SRR1 bits */
         stw     r4, _DAR(r11)
@@ -344,7 +404,7 @@ Alignment:
         EXCEPTION_PROLOG handle_dar_dsisr=1
         save_dar_dsisr_on_stack r4, r5, r11
         addi    r3,r1,STACK_FRAME_OVERHEAD
-       EXC_XFER_STD(0x600, alignment_exception)
+       b       alignment_exception_tramp
  
  /* Program check exception */
         EXCEPTION(0x700, ProgramCheck, program_check_exception, EXC_XFER_STD)
@@ -645,15 +705,100 @@ END_MMU_FTR_SECTION_IFSET(MMU_FTR_NEED_DTLB_SW_LRU)
  
         . = 0x3000
  
+machine_check_tramp:
+       EXC_XFER_STD(0x200, machine_check_exception)
+
+alignment_exception_tramp:
+       EXC_XFER_STD(0x600, alignment_exception)
+
  handle_page_fault_tramp_1:
+#ifdef CONFIG_VMAP_STACK
+       EXCEPTION_PROLOG_2 handle_dar_dsisr=1
+#endif
         lwz     r4, _DAR(r11)
         lwz     r5, _DSISR(r11)
         /* fall through */
  handle_page_fault_tramp_2:
         EXC_XFER_LITE(0x300, handle_page_fault)
  
+#ifdef CONFIG_VMAP_STACK
+.macro save_regs_thread                thread
+       stw     r0, THR0(\thread)
+       stw     r3, THR3(\thread)
+       stw     r4, THR4(\thread)
+       stw     r5, THR5(\thread)
+       stw     r6, THR6(\thread)
+       stw     r8, THR8(\thread)
+       stw     r9, THR9(\thread)
+       mflr    r0
+       stw     r0, THLR(\thread)
+       mfctr   r0
+       stw     r0, THCTR(\thread)
+.endm
+
+.macro restore_regs_thread     thread
+       lwz     r0, THLR(\thread)
+       mtlr    r0
+       lwz     r0, THCTR(\thread)
+       mtctr   r0
+       lwz     r0, THR0(\thread)
+       lwz     r3, THR3(\thread)
+       lwz     r4, THR4(\thread)
+       lwz     r5, THR5(\thread)
+       lwz     r6, THR6(\thread)
+       lwz     r8, THR8(\thread)
+       lwz     r9, THR9(\thread)
+.endm
+
+hash_page_dsi:
+       save_regs_thread        r10
+       mfdsisr r3
+       mfdar   r4
+       mfsrr0  r5
+       mfsrr1  r9
+       rlwinm  r3, r3, 32 - 15, _PAGE_RW       /* DSISR_STORE -> _PAGE_RW */
+       bl      hash_page
+       mfspr   r10, SPRN_SPRG_THREAD
+       restore_regs_thread r10
+       b       .Lhash_page_dsi_cont
+
+hash_page_isi:
+       mr      r11, r10
+       mfspr   r10, SPRN_SPRG_THREAD
+       save_regs_thread        r10
+       li      r3, 0
+       lwz     r4, SRR0(r10)
+       lwz     r9, SRR1(r10)
+       bl      hash_page
+       mfspr   r10, SPRN_SPRG_THREAD
+       restore_regs_thread r10
+       mr      r10, r11
+       b       .Lhash_page_isi_cont
+
+       .globl fast_hash_page_return
+fast_hash_page_return:
+       andis.  r10, r9, SRR1_ISI_NOPT@h        /* Set on ISI, cleared on DSI */
+       mfspr   r10, SPRN_SPRG_THREAD
+       restore_regs_thread r10
+       bne     1f
+
+       /* DSI */
+       mtcr    r11
+       lwz     r11, THR11(r10)
+       mfspr   r10, SPRN_SPRG_SCRATCH0
+       SYNC
+       RFI
+
+1:     /* ISI */
+       mtcr    r11
+       mfspr   r11, SPRN_SPRG_SCRATCH1
+       mfspr   r10, SPRN_SPRG_SCRATCH0
+       SYNC
+       RFI
+
  stack_overflow:
         vmap_stack_overflow_exception
+#endif
  
  AltiVecUnavailable:
         EXCEPTION_PROLOG
diff --git a/arch/powerpc/kernel/head_32.h b/arch/powerpc/kernel/head_32.h

index a6a5fbbf8504ae3cd5d5104076cec9358873dabb..9db162f79fe6e650fd61be32dd80bda9ea844f08 100644
--- a/arch/powerpc/kernel/head_32.h
+++ b/arch/powerpc/kernel/head_32.h
@@ -64,11 +64,25 @@
  .endm
  
  .macro EXCEPTION_PROLOG_2 handle_dar_dsisr=0
+#if defined(CONFIG_VMAP_STACK) && defined(CONFIG_PPC_BOOK3S)
+BEGIN_MMU_FTR_SECTION
+       mtcr    r10
+FTR_SECTION_ELSE
+       stw     r10, _CCR(r11)
+ALT_MMU_FTR_SECTION_END_IFSET(MMU_FTR_HPTE_TABLE)
+#else
         stw     r10,_CCR(r11)           /* save registers */
+#endif
+       mfspr   r10, SPRN_SPRG_SCRATCH0
         stw     r12,GPR12(r11)
         stw     r9,GPR9(r11)
-       mfspr   r10,SPRN_SPRG_SCRATCH0
         stw     r10,GPR10(r11)
+#if defined(CONFIG_VMAP_STACK) && defined(CONFIG_PPC_BOOK3S)
+BEGIN_MMU_FTR_SECTION
+       mfcr    r10
+       stw     r10, _CCR(r11)
+END_MMU_FTR_SECTION_IFSET(MMU_FTR_HPTE_TABLE)
+#endif
         mfspr   r12,SPRN_SPRG_SCRATCH1
         stw     r12,GPR11(r11)
         mflr    r10
@@ -83,6 +97,11 @@
         stw     r10, _DSISR(r11)
         .endif
         lwz     r9, SRR1(r12)
+#if defined(CONFIG_VMAP_STACK) && defined(CONFIG_PPC_BOOK3S)
+BEGIN_MMU_FTR_SECTION
+       andi.   r10, r9, MSR_PR
+END_MMU_FTR_SECTION_IFSET(MMU_FTR_HPTE_TABLE)
+#endif
         lwz     r12, SRR0(r12)
  #else
         mfspr   r12,SPRN_SRR0
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S

index 9922306ae51244f9dcd0ed1f2f3ffac4603aabee..073a651787df8ab65847aa428fa1d9216d57a231 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -256,7 +256,7 @@ InstructionTLBMiss:
          * set.  All other Linux PTE bits control the behavior
          * of the MMU.
          */
-       rlwimi  r10, r10, 0, 0x0f00     /* Clear bits 20-23 */
+       rlwinm  r10, r10, 0, ~0x0f00    /* Clear bits 20-23 */
         rlwimi  r10, r10, 4, 0x0400     /* Copy _PAGE_EXEC into bit 21 */
         ori     r10, r10, RPN_PATTERN | 0x200 /* Set 22 and 24-27 */
         mtspr   SPRN_MI_RPN, r10        /* Update TLB entry */
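
Reviewer note on the `rlwimi` to `rlwinm` change above: with the same register as source and target and a rotate count of 0, `rlwimi` inserts the masked bits back into themselves and leaves the value unchanged, so nothing is cleared. `rlwinm` with the inverted mask ANDs the bits away. The C sketch below models the two instructions in their mask-operand form; `rotl32`, `model_rlwinm` and `model_rlwimi` are hypothetical helper names, not kernel functions.

```c
#include <stdint.h>

/* Rotate a 32-bit value left by sh bits (sh is taken modulo 32). */
uint32_t rotl32(uint32_t v, unsigned int sh)
{
	sh &= 31;
	return sh ? (v << sh) | (v >> (32 - sh)) : v;
}

/* rlwinm rA,rS,SH,MASK: rotate rS left, then AND with the mask. */
uint32_t model_rlwinm(uint32_t rs, unsigned int sh, uint32_t mask)
{
	return rotl32(rs, sh) & mask;
}

/*
 * rlwimi rA,rS,SH,MASK: rotate rS left, then insert the masked bits into rA,
 * keeping every bit of rA that lies outside the mask.
 */
uint32_t model_rlwimi(uint32_t ra, uint32_t rs, unsigned int sh, uint32_t mask)
{
	return (rotl32(rs, sh) & mask) | (ra & ~mask);
}
```

In this model, `model_rlwimi(v, v, 0, 0x0f00)` returns `v` for any `v`, while `model_rlwinm(v, 0, ~0x0f00u)` returns `v` with mask bits 0x0f00 cleared (bits 20-23 in PowerPC numbering), which is what the instruction's comment describes.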
diff --git a/arch/powerpc/kernel/idle_6xx.S b/arch/powerpc/kernel/idle_6xx.S

index 0ffdd18b9f268b2b75f3c305505deb89fb6d09de..433d97bea1f3b8eb2e7554c317c9600a41d53b08 100644
--- a/arch/powerpc/kernel/idle_6xx.S
+++ b/arch/powerpc/kernel/idle_6xx.S
@@ -166,7 +166,11 @@ BEGIN_FTR_SECTION
         mfspr   r9,SPRN_HID0
         andis.  r9,r9,HID0_NAP@h
         beq     1f
+#ifdef CONFIG_VMAP_STACK
+       addis   r9, r11, nap_save_msscr0@ha
+#else
         addis   r9,r11,(nap_save_msscr0-KERNELBASE)@ha
+#endif
         lwz     r9,nap_save_msscr0@l(r9)
         mtspr   SPRN_MSSCR0, r9
         sync
@@ -174,7 +178,11 @@ BEGIN_FTR_SECTION
  1:
  END_FTR_SECTION_IFSET(CPU_FTR_NAP_DISABLE_L2_PR)
  BEGIN_FTR_SECTION
+#ifdef CONFIG_VMAP_STACK
+       addis   r9, r11, nap_save_hid1@ha
+#else
         addis   r9,r11,(nap_save_hid1-KERNELBASE)@ha
+#endif
         lwz     r9,nap_save_hid1@l(r9)
         mtspr   SPRN_HID1, r9
  END_FTR_SECTION_IFSET(CPU_FTR_DUAL_PLL_750FX)
diff --git a/arch/powerpc/kernel/signal.c b/arch/powerpc/kernel/signal.c

index e6c30cee6abf1748e52fe5a4ae9a0fbc01c3274a..d215f95545537ababac1eb849baab01df0e1642a 100644
--- a/arch/powerpc/kernel/signal.c
+++ b/arch/powerpc/kernel/signal.c
@@ -200,14 +200,27 @@ unsigned long get_tm_stackpointer(struct task_struct *tsk)
          * normal/non-checkpointed stack pointer.
          */
  
+       unsigned long ret = tsk->thread.regs->gpr[1];
+
  #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
         BUG_ON(tsk != current);
  
         if (MSR_TM_ACTIVE(tsk->thread.regs->msr)) {
+               preempt_disable();
                 tm_reclaim_current(TM_CAUSE_SIGNAL);
                 if (MSR_TM_TRANSACTIONAL(tsk->thread.regs->msr))
-                       return tsk->thread.ckpt_regs.gpr[1];
+                       ret = tsk->thread.ckpt_regs.gpr[1];
+
+               /*
+                * If we treclaim, we must clear the current thread's TM bits
+                * before re-enabling preemption. Otherwise we might be
+                * preempted and have the live MSR[TS] changed behind our back
+                * (tm_recheckpoint_new_task() would recheckpoint). Besides, we
+                * enter the signal handler in non-transactional state.
+                */
+               tsk->thread.regs->msr &= ~MSR_TS_MASK;
+               preempt_enable();
         }
  #endif
-       return tsk->thread.regs->gpr[1];
+       return ret;
  }
diff --git a/arch/powerpc/kernel/signal_32.c b/arch/powerpc/kernel/signal_32.c

index 98600b276f764d957fe5f8d91386c921a288cc90..1b090a76b4444729c4b8614b72277c1643539e92 100644
--- a/arch/powerpc/kernel/signal_32.c
+++ b/arch/powerpc/kernel/signal_32.c
@@ -489,19 +489,11 @@ static int save_user_regs(struct pt_regs *regs, struct mcontext __user *frame,
   */
  static int save_tm_user_regs(struct pt_regs *regs,
                              struct mcontext __user *frame,
-                            struct mcontext __user *tm_frame, int sigret)
+                            struct mcontext __user *tm_frame, int sigret,
+                            unsigned long msr)
  {
-       unsigned long msr = regs->msr;
-
         WARN_ON(tm_suspend_disabled);
  
-       /* Remove TM bits from thread's MSR.  The MSR in the sigcontext
-        * just indicates to userland that we were doing a transaction, but we
-        * don't want to return in transactional state.  This also ensures
-        * that flush_fp_to_thread won't set TIF_RESTORE_TM again.
-        */
-       regs->msr &= ~MSR_TS_MASK;
-
         /* Save both sets of general registers */
         if (save_general_regs(&current->thread.ckpt_regs, frame)
             || save_general_regs(regs, tm_frame))
@@ -912,6 +904,10 @@ int handle_rt_signal32(struct ksignal *ksig, sigset_t *oldset,
         int sigret;
         unsigned long tramp;
         struct pt_regs *regs = tsk->thread.regs;
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+       /* Save the thread's msr before get_tm_stackpointer() changes it */
+       unsigned long msr = regs->msr;
+#endif
  
         BUG_ON(tsk != current);
  
@@ -944,13 +940,13 @@ int handle_rt_signal32(struct ksignal *ksig, sigset_t *oldset,
  
  #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
         tm_frame = &rt_sf->uc_transact.uc_mcontext;
-       if (MSR_TM_ACTIVE(regs->msr)) {
+       if (MSR_TM_ACTIVE(msr)) {
                 if (__put_user((unsigned long)&rt_sf->uc_transact,
                                &rt_sf->uc.uc_link) ||
                     __put_user((unsigned long)tm_frame,
                                &rt_sf->uc_transact.uc_regs))
                         goto badframe;
-               if (save_tm_user_regs(regs, frame, tm_frame, sigret))
+               if (save_tm_user_regs(regs, frame, tm_frame, sigret, msr))
                         goto badframe;
         }
         else
@@ -1369,6 +1365,10 @@ int handle_signal32(struct ksignal *ksig, sigset_t *oldset,
         int sigret;
         unsigned long tramp;
         struct pt_regs *regs = tsk->thread.regs;
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+       /* Save the thread's msr before get_tm_stackpointer() changes it */
+       unsigned long msr = regs->msr;
+#endif
  
         BUG_ON(tsk != current);
  
@@ -1402,9 +1402,9 @@ int handle_signal32(struct ksignal *ksig, sigset_t *oldset,
  
  #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
         tm_mctx = &frame->mctx_transact;
-       if (MSR_TM_ACTIVE(regs->msr)) {
+       if (MSR_TM_ACTIVE(msr)) {
                 if (save_tm_user_regs(regs, &frame->mctx, &frame->mctx_transact,
-                                     sigret))
+                                     sigret, msr))
                         goto badframe;
         }
         else
diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c

index 117515564ec7a6e2d13ecdeba37e3973934fba10..84ed2e77ef9c3f5e039cf8f65921b5124e0bb226 100644
--- a/arch/powerpc/kernel/signal_64.c
+++ b/arch/powerpc/kernel/signal_64.c
@@ -192,7 +192,8 @@ static long setup_sigcontext(struct sigcontext __user *sc,
  static long setup_tm_sigcontexts(struct sigcontext __user *sc,
                                  struct sigcontext __user *tm_sc,
                                  struct task_struct *tsk,
-                                int signr, sigset_t *set, unsigned long handler)
+                                int signr, sigset_t *set, unsigned long handler,
+                                unsigned long msr)
  {
         /* When CONFIG_ALTIVEC is set, we _always_ setup v_regs even if the
          * process never used altivec yet (MSR_VEC is zero in pt_regs of
@@ -207,12 +208,11 @@ static long setup_tm_sigcontexts(struct sigcontext __user *sc,
         elf_vrreg_t __user *tm_v_regs = sigcontext_vmx_regs(tm_sc);
  #endif
         struct pt_regs *regs = tsk->thread.regs;
-       unsigned long msr = tsk->thread.regs->msr;
         long err = 0;
  
         BUG_ON(tsk != current);
  
-       BUG_ON(!MSR_TM_ACTIVE(regs->msr));
+       BUG_ON(!MSR_TM_ACTIVE(msr));
  
         WARN_ON(tm_suspend_disabled);
  
@@ -222,13 +222,6 @@ static long setup_tm_sigcontexts(struct sigcontext __user *sc,
          */
         msr |= tsk->thread.ckpt_regs.msr & (MSR_FP | MSR_VEC | MSR_VSX);
  
-       /* Remove TM bits from thread's MSR.  The MSR in the sigcontext
-        * just indicates to userland that we were doing a transaction, but we
-        * don't want to return in transactional state.  This also ensures
-        * that flush_fp_to_thread won't set TIF_RESTORE_TM again.
-        */
-       regs->msr &= ~MSR_TS_MASK;
-
  #ifdef CONFIG_ALTIVEC
         err |= __put_user(v_regs, &sc->v_regs);
         err |= __put_user(tm_v_regs, &tm_sc->v_regs);
@@ -824,6 +817,10 @@ int handle_rt_signal64(struct ksignal *ksig, sigset_t *set,
         unsigned long newsp = 0;
         long err = 0;
         struct pt_regs *regs = tsk->thread.regs;
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+       /* Save the thread's msr before get_tm_stackpointer() changes it */
+       unsigned long msr = regs->msr;
+#endif
  
         BUG_ON(tsk != current);
  
@@ -841,7 +838,7 @@ int handle_rt_signal64(struct ksignal *ksig, sigset_t *set,
         err |= __put_user(0, &frame->uc.uc_flags);
         err |= __save_altstack(&frame->uc.uc_stack, regs->gpr[1]);
  #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
-       if (MSR_TM_ACTIVE(regs->msr)) {
+       if (MSR_TM_ACTIVE(msr)) {
                 /* The ucontext_t passed to userland points to the second
                  * ucontext_t (for transactional state) with its uc_link ptr.
                  */
@@ -849,7 +846,8 @@ int handle_rt_signal64(struct ksignal *ksig, sigset_t *set,
                 err |= setup_tm_sigcontexts(&frame->uc.uc_mcontext,
                                             &frame->uc_transact.uc_mcontext,
                                             tsk, ksig->sig, NULL,
-                                           (unsigned long)ksig->ka.sa.sa_handler);
+                                           (unsigned long)ksig->ka.sa.sa_handler,
+                                           msr);
         } else
  #endif
         {
diff --git a/arch/powerpc/mm/book3s32/hash_low.S b/arch/powerpc/mm/book3s32/hash_low.S

index c11b0a005196675271839c20a9a0c89cf2253f13..2015c4f962380940700d84bcc32af9d4961b80ea 100644
--- a/arch/powerpc/mm/book3s32/hash_low.S
+++ b/arch/powerpc/mm/book3s32/hash_low.S
@@ -25,12 +25,6 @@
  #include <asm/feature-fixups.h>
  #include <asm/code-patching-asm.h>
  
-#ifdef CONFIG_VMAP_STACK
-#define ADDR_OFFSET    0
-#else
-#define ADDR_OFFSET    PAGE_OFFSET
-#endif
-
  #ifdef CONFIG_SMP
         .section .bss
         .align  2
@@ -53,8 +47,8 @@ mmu_hash_lock:
         .text
  _GLOBAL(hash_page)
  #ifdef CONFIG_SMP
-       lis     r8, (mmu_hash_lock - ADDR_OFFSET)@h
-       ori     r8, r8, (mmu_hash_lock - ADDR_OFFSET)@l
+       lis     r8, (mmu_hash_lock - PAGE_OFFSET)@h
+       ori     r8, r8, (mmu_hash_lock - PAGE_OFFSET)@l
         lis     r0,0x0fff
         b       10f
  11:    lwz     r6,0(r8)
@@ -72,12 +66,9 @@ _GLOBAL(hash_page)
         cmplw   0,r4,r0
         ori     r3,r3,_PAGE_USER|_PAGE_PRESENT /* test low addresses as user */
         mfspr   r5, SPRN_SPRG_PGDIR     /* phys page-table root */
-#ifdef CONFIG_VMAP_STACK
-       tovirt(r5, r5)
-#endif
         blt+    112f                    /* assume user more likely */
-       lis     r5, (swapper_pg_dir - ADDR_OFFSET)@ha   /* if kernel address, use */
-       addi    r5 ,r5 ,(swapper_pg_dir - ADDR_OFFSET)@l        /* kernel page table */
+       lis     r5, (swapper_pg_dir - PAGE_OFFSET)@ha   /* if kernel address, use */
+       addi    r5 ,r5 ,(swapper_pg_dir - PAGE_OFFSET)@l        /* kernel page table */
         rlwimi  r3,r9,32-12,29,29       /* MSR_PR -> _PAGE_USER */
  112:
  #ifndef CONFIG_PTE_64BIT
@@ -89,9 +80,6 @@ _GLOBAL(hash_page)
         lwzx    r8,r8,r5                /* Get L1 entry */
         rlwinm. r8,r8,0,0,20            /* extract pt base address */
  #endif
-#ifdef CONFIG_VMAP_STACK
-       tovirt(r8, r8)
-#endif
  #ifdef CONFIG_SMP
         beq-    hash_page_out           /* return if no mapping */
  #else
@@ -143,30 +131,36 @@ retry:
         bne-    retry                   /* retry if someone got there first */
  
         mfsrin  r3,r4                   /* get segment reg for segment */
+#ifndef CONFIG_VMAP_STACK
         mfctr   r0
         stw     r0,_CTR(r11)
+#endif
         bl      create_hpte             /* add the hash table entry */
  
  #ifdef CONFIG_SMP
         eieio
-       lis     r8, (mmu_hash_lock - ADDR_OFFSET)@ha
+       lis     r8, (mmu_hash_lock - PAGE_OFFSET)@ha
         li      r0,0
-       stw     r0, (mmu_hash_lock - ADDR_OFFSET)@l(r8)
+       stw     r0, (mmu_hash_lock - PAGE_OFFSET)@l(r8)
  #endif
  
+#ifdef CONFIG_VMAP_STACK
+       b       fast_hash_page_return
+#else
         /* Return from the exception */
         lwz     r5,_CTR(r11)
         mtctr   r5
         lwz     r0,GPR0(r11)
         lwz     r8,GPR8(r11)
         b       fast_exception_return
+#endif
  
  #ifdef CONFIG_SMP
  hash_page_out:
         eieio
-       lis     r8, (mmu_hash_lock - ADDR_OFFSET)@ha
+       lis     r8, (mmu_hash_lock - PAGE_OFFSET)@ha
         li      r0,0
-       stw     r0, (mmu_hash_lock - ADDR_OFFSET)@l(r8)
+       stw     r0, (mmu_hash_lock - PAGE_OFFSET)@l(r8)
         blr
  #endif /* CONFIG_SMP */
  
@@ -341,7 +335,7 @@ END_FTR_SECTION_IFCLR(CPU_FTR_NEED_COHERENT)
         patch_site      1f, patch__hash_page_A1
         patch_site      2f, patch__hash_page_A2
         /* Get the address of the primary PTE group in the hash table (r3) */
-0:     lis     r0, (Hash_base - ADDR_OFFSET)@h /* base address of hash table */
+0:     lis     r0, (Hash_base - PAGE_OFFSET)@h /* base address of hash table */
  1:     rlwimi  r0,r3,LG_PTEG_SIZE,HASH_LEFT,HASH_RIGHT    /* VSID -> hash */
  2:     rlwinm  r3,r4,20+LG_PTEG_SIZE,HASH_LEFT,HASH_RIGHT /* PI -> hash */
         xor     r3,r3,r0                /* make primary hash */
@@ -355,10 +349,10 @@ END_FTR_SECTION_IFCLR(CPU_FTR_NEED_COHERENT)
         beq+    10f                     /* no PTE: go look for an empty slot */
         tlbie   r4
  
-       lis     r4, (htab_hash_searches - ADDR_OFFSET)@ha
-       lwz     r6, (htab_hash_searches - ADDR_OFFSET)@l(r4)
+       lis     r4, (htab_hash_searches - PAGE_OFFSET)@ha
+       lwz     r6, (htab_hash_searches - PAGE_OFFSET)@l(r4)
         addi    r6,r6,1                 /* count how many searches we do */
-       stw     r6, (htab_hash_searches - ADDR_OFFSET)@l(r4)
+       stw     r6, (htab_hash_searches - PAGE_OFFSET)@l(r4)
  
         /* Search the primary PTEG for a PTE whose 1st (d)word matches r5 */
         mtctr   r0
@@ -390,10 +384,10 @@ END_FTR_SECTION_IFCLR(CPU_FTR_NEED_COHERENT)
         beq+    found_empty
  
         /* update counter of times that the primary PTEG is full */
-       lis     r4, (primary_pteg_full - ADDR_OFFSET)@ha
-       lwz     r6, (primary_pteg_full - ADDR_OFFSET)@l(r4)
+       lis     r4, (primary_pteg_full - PAGE_OFFSET)@ha
+       lwz     r6, (primary_pteg_full - PAGE_OFFSET)@l(r4)
         addi    r6,r6,1
-       stw     r6, (primary_pteg_full - ADDR_OFFSET)@l(r4)
+       stw     r6, (primary_pteg_full - PAGE_OFFSET)@l(r4)
  
         patch_site      0f, patch__hash_page_C
         /* Search the secondary PTEG for an empty slot */
@@ -427,8 +421,8 @@ END_FTR_SECTION_IFCLR(CPU_FTR_NEED_COHERENT)
          * lockup here but that shouldn't happen
          */
  
-1:     lis     r4, (next_slot - ADDR_OFFSET)@ha        /* get next evict slot */
-       lwz     r6, (next_slot - ADDR_OFFSET)@l(r4)
+1:     lis     r4, (next_slot - PAGE_OFFSET)@ha        /* get next evict slot */
+       lwz     r6, (next_slot - PAGE_OFFSET)@l(r4)
         addi    r6,r6,HPTE_SIZE                 /* search for candidate */
         andi.   r6,r6,7*HPTE_SIZE
         stw     r6,next_slot@l(r4)
diff --git a/arch/powerpc/mm/book3s32/mmu.c b/arch/powerpc/mm/book3s32/mmu.c
index 0a1c65a2c56553cf36836af6bc30ef2424489980..f888cbb109b9134daee3176b37988b250ef008f5 100644
--- a/arch/powerpc/mm/book3s32/mmu.c
+++ b/arch/powerpc/mm/book3s32/mmu.c
@@ -413,7 +413,7 @@ void __init MMU_init_hw(void)
  void __init MMU_init_hw_patch(void)
  {
         unsigned int hmask = Hash_mask >> (16 - LG_HPTEG_SIZE);
-       unsigned int hash;
+       unsigned int hash = (unsigned int)Hash - PAGE_OFFSET;
  
         if (ppc_md.progress)
                 ppc_md.progress("hash:patch", 0x345);
@@ -425,11 +425,6 @@ void __init MMU_init_hw_patch(void)
         /*
          * Patch up the instructions in hashtable.S:create_hpte
          */
-       if (IS_ENABLED(CONFIG_VMAP_STACK))
-               hash = (unsigned int)Hash;
-       else
-               hash = (unsigned int)Hash - PAGE_OFFSET;
-
         modify_instruction_site(&patch__hash_page_A0, 0xffff, hash >> 16);
         modify_instruction_site(&patch__hash_page_A1, 0x7c0, hash_mb << 6);
         modify_instruction_site(&patch__hash_page_A2, 0x7c0, hash_mb2 << 6);
@@ -439,8 +434,7 @@ void __init MMU_init_hw_patch(void)
         /*
          * Patch up the instructions in hashtable.S:flush_hash_page
          */
-       modify_instruction_site(&patch__flush_hash_A0, 0xffff,
-                               ((unsigned int)Hash - PAGE_OFFSET) >> 16);
+       modify_instruction_site(&patch__flush_hash_A0, 0xffff, hash >> 16);
         modify_instruction_site(&patch__flush_hash_A1, 0x7c0, hash_mb << 6);
         modify_instruction_site(&patch__flush_hash_A2, 0x7c0, hash_mb2 << 6);
         modify_instruction_site(&patch__flush_hash_B, 0xffff, hmask);
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 73d4873fc7f85442eafc08399446bc176fd82b9a..33b3461d91e8db0e7b19f569a5c20d06a9a827be 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -53,20 +53,24 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp,
         if (pshift >= pdshift) {
                 cachep = PGT_CACHE(PTE_T_ORDER);
                 num_hugepd = 1 << (pshift - pdshift);
+               new = NULL;
         } else if (IS_ENABLED(CONFIG_PPC_8xx)) {
-               cachep = PGT_CACHE(PTE_INDEX_SIZE);
+               cachep = NULL;
                 num_hugepd = 1;
+               new = pte_alloc_one(mm);
         } else {
                 cachep = PGT_CACHE(pdshift - pshift);
                 num_hugepd = 1;
+               new = NULL;
         }
  
-       if (!cachep) {
+       if (!cachep && !new) {
                 WARN_ONCE(1, "No page table cache created for hugetlb tables");
                 return -ENOMEM;
         }
  
-       new = kmem_cache_alloc(cachep, pgtable_gfp_flags(mm, GFP_KERNEL));
+       if (cachep)
+               new = kmem_cache_alloc(cachep, pgtable_gfp_flags(mm, GFP_KERNEL));
  
         BUG_ON(pshift > HUGEPD_SHIFT_MASK);
         BUG_ON((unsigned long)new & HUGEPD_SHIFT_MASK);
@@ -97,7 +101,10 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp,
         if (i < num_hugepd) {
                 for (i = i - 1 ; i >= 0; i--, hpdp--)
                         *hpdp = __hugepd(0);
-               kmem_cache_free(cachep, new);
+               if (cachep)
+                       kmem_cache_free(cachep, new);
+               else
+                       pte_free(mm, new);
         } else {
                 kmemleak_ignore(new);
         }
@@ -324,8 +331,7 @@ static void free_hugepd_range(struct mmu_gather *tlb, hugepd_t *hpdp, int pdshif
         if (shift >= pdshift)
                 hugepd_free(tlb, hugepte);
         else if (IS_ENABLED(CONFIG_PPC_8xx))
-               pgtable_free_tlb(tlb, hugepte,
-                                get_hugepd_cache_index(PTE_INDEX_SIZE));
+               pgtable_free_tlb(tlb, hugepte, 0);
         else
                 pgtable_free_tlb(tlb, hugepte,
                                  get_hugepd_cache_index(pdshift - shift));
@@ -639,12 +645,13 @@ static int __init hugetlbpage_init(void)
                  * if we have pdshift and shift value same, we don't
                  * use pgt cache for hugepd.
                  */
-               if (pdshift > shift && IS_ENABLED(CONFIG_PPC_8xx))
-                       pgtable_cache_add(PTE_INDEX_SIZE);
-               else if (pdshift > shift)
-                       pgtable_cache_add(pdshift - shift);
-               else if (IS_ENABLED(CONFIG_PPC_FSL_BOOK3E) || IS_ENABLED(CONFIG_PPC_8xx))
+               if (pdshift > shift) {
+                       if (!IS_ENABLED(CONFIG_PPC_8xx))
+                               pgtable_cache_add(pdshift - shift);
+               } else if (IS_ENABLED(CONFIG_PPC_FSL_BOOK3E) ||
+                          IS_ENABLED(CONFIG_PPC_8xx)) {
                         pgtable_cache_add(PTE_T_ORDER);
+               }
  
                 configured = true;
         }
diff --git a/arch/powerpc/mm/kasan/kasan_init_32.c b/arch/powerpc/mm/kasan/kasan_init_32.c
index 16dd95bd0749d851410fce5ae727a0e9edfdb563..db5664dde5ff9d9d6a461eb1ed25a2d0f11e3d8e 100644
--- a/arch/powerpc/mm/kasan/kasan_init_32.c
+++ b/arch/powerpc/mm/kasan/kasan_init_32.c
@@ -185,8 +185,7 @@ u8 __initdata early_hash[256 << 10] __aligned(256 << 10) = {0};
  
  static void __init kasan_early_hash_table(void)
  {
-       unsigned int hash = IS_ENABLED(CONFIG_VMAP_STACK) ? (unsigned int)early_hash :
-                                                           __pa(early_hash);
+       unsigned int hash = __pa(early_hash);
  
         modify_instruction_site(&patch__hash_page_A0, 0xffff, hash >> 16);
         modify_instruction_site(&patch__flush_hash_A0, 0xffff, hash >> 16);
diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index e8c84d265602bc57bf47b49243a2044d93f2336d..0ec9640335bb3544548bcf3c391b2dd946621f4b 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -3435,6 +3435,11 @@ getstring(char *s, int size)
         int c;
  
         c = skipbl();
+       if (c == '\n') {
+               *s = 0;
+               return;
+       }
+
         do {
                 if( size > 1 ){
                         *s++ = c;
diff --git a/arch/riscv/boot/.gitignore b/arch/riscv/boot/.gitignore
index 8dab0bb6ae667c5f89da8aca74a0a3bb61dcfaa9..8a45a37d2af4ccf129e4de4002529c853b224d8a 100644
--- a/arch/riscv/boot/.gitignore
+++ b/arch/riscv/boot/.gitignore
@@ -1,2 +1,4 @@
  Image
  Image.gz
+loader
+loader.lds
diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
index 435b65532e2945703cc17a9e5c4f7613a0d5b6b8..8e18d2c64399df91e0619852bfd5c9a4757cdc37 100644
--- a/arch/riscv/include/asm/csr.h
+++ b/arch/riscv/include/asm/csr.h
@@ -72,6 +72,16 @@
  #define EXC_LOAD_PAGE_FAULT    13
  #define EXC_STORE_PAGE_FAULT   15
  
+/* PMP configuration */
+#define PMP_R                  0x01
+#define PMP_W                  0x02
+#define PMP_X                  0x04
+#define PMP_A                  0x18
+#define PMP_A_TOR              0x08
+#define PMP_A_NA4              0x10
+#define PMP_A_NAPOT            0x18
+#define PMP_L                  0x80
+
  /* symbolic CSR names: */
  #define CSR_CYCLE              0xc00
  #define CSR_TIME               0xc01
@@ -100,6 +110,8 @@
  #define CSR_MCAUSE             0x342
  #define CSR_MTVAL              0x343
  #define CSR_MIP                        0x344
+#define CSR_PMPCFG0            0x3a0
+#define CSR_PMPADDR0           0x3b0
  #define CSR_MHARTID            0xf14
  
  #ifdef CONFIG_RISCV_M_MODE
diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
index 271860fc2c3f02a5d50552339a005004444ab456..85f2073e7fe4abee5dcb97f91cd2f07cf9d9f9e7 100644
--- a/arch/riscv/kernel/head.S
+++ b/arch/riscv/kernel/head.S
@@ -58,6 +58,12 @@ _start_kernel:
         /* Reset all registers except ra, a0, a1 */
         call reset_regs
  
+       /* Setup a PMP to permit access to all of memory. */
+       li a0, -1
+       csrw CSR_PMPADDR0, a0
+       li a0, (PMP_A_NAPOT | PMP_R | PMP_W | PMP_X)
+       csrw CSR_PMPCFG0, a0
+
         /*
          * The hartid in a0 is expected later on, and we have no firmware
          * to hand it to us.
diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
index f4cad5163bf2c0160f64e828ba07e2d3b84623d2..ffb3d94bf0cc2782f5777e1423f6a3fc4310099f 100644
--- a/arch/riscv/kernel/traps.c
+++ b/arch/riscv/kernel/traps.c
@@ -156,6 +156,6 @@ void __init trap_init(void)
         csr_write(CSR_SCRATCH, 0);
         /* Set the exception vector address */
         csr_write(CSR_TVEC, &handle_exception);
-       /* Enable all interrupts */
-       csr_write(CSR_IE, -1);
+       /* Enable interrupts */
+       csr_write(CSR_IE, IE_SIE | IE_EIE);
  }
diff --git a/arch/riscv/mm/kasan_init.c b/arch/riscv/mm/kasan_init.c
index f0cc860405871d0457d6dc0fdc83fb3a0350a2c5..ec0ca90dd9000f339a3d8a739d59e84df386d0dc 100644
--- a/arch/riscv/mm/kasan_init.c
+++ b/arch/riscv/mm/kasan_init.c
@@ -19,18 +19,20 @@ asmlinkage void __init kasan_early_init(void)
         for (i = 0; i < PTRS_PER_PTE; ++i)
                 set_pte(kasan_early_shadow_pte + i,
                         mk_pte(virt_to_page(kasan_early_shadow_page),
-                       PAGE_KERNEL));
+                              PAGE_KERNEL));
  
         for (i = 0; i < PTRS_PER_PMD; ++i)
                 set_pmd(kasan_early_shadow_pmd + i,
-                pfn_pmd(PFN_DOWN(__pa((uintptr_t)kasan_early_shadow_pte)),
-                       __pgprot(_PAGE_TABLE)));
+                       pfn_pmd(PFN_DOWN
+                               (__pa((uintptr_t) kasan_early_shadow_pte)),
+                               __pgprot(_PAGE_TABLE)));
  
         for (i = KASAN_SHADOW_START; i < KASAN_SHADOW_END;
              i += PGDIR_SIZE, ++pgd)
                 set_pgd(pgd,
-                pfn_pgd(PFN_DOWN(__pa(((uintptr_t)kasan_early_shadow_pmd))),
-                       __pgprot(_PAGE_TABLE)));
+                       pfn_pgd(PFN_DOWN
+                               (__pa(((uintptr_t) kasan_early_shadow_pmd))),
+                               __pgprot(_PAGE_TABLE)));
  
         /* init for swapper_pg_dir */
         pgd = pgd_offset_k(KASAN_SHADOW_START);
@@ -38,37 +40,43 @@ asmlinkage void __init kasan_early_init(void)
         for (i = KASAN_SHADOW_START; i < KASAN_SHADOW_END;
              i += PGDIR_SIZE, ++pgd)
                 set_pgd(pgd,
-                pfn_pgd(PFN_DOWN(__pa(((uintptr_t)kasan_early_shadow_pmd))),
-                       __pgprot(_PAGE_TABLE)));
+                       pfn_pgd(PFN_DOWN
+                               (__pa(((uintptr_t) kasan_early_shadow_pmd))),
+                               __pgprot(_PAGE_TABLE)));
  
         flush_tlb_all();
  }
  
  static void __init populate(void *start, void *end)
  {
-       unsigned long i;
+       unsigned long i, offset;
         unsigned long vaddr = (unsigned long)start & PAGE_MASK;
         unsigned long vend = PAGE_ALIGN((unsigned long)end);
         unsigned long n_pages = (vend - vaddr) / PAGE_SIZE;
+       unsigned long n_ptes =
+           ((n_pages + PTRS_PER_PTE) & -PTRS_PER_PTE) / PTRS_PER_PTE;
         unsigned long n_pmds =
-               (n_pages % PTRS_PER_PTE) ? n_pages / PTRS_PER_PTE + 1 :
-                                               n_pages / PTRS_PER_PTE;
+           ((n_ptes + PTRS_PER_PMD) & -PTRS_PER_PMD) / PTRS_PER_PMD;
+
+       pte_t *pte =
+           memblock_alloc(n_ptes * PTRS_PER_PTE * sizeof(pte_t), PAGE_SIZE);
+       pmd_t *pmd =
+           memblock_alloc(n_pmds * PTRS_PER_PMD * sizeof(pmd_t), PAGE_SIZE);
         pgd_t *pgd = pgd_offset_k(vaddr);
-       pmd_t *pmd = memblock_alloc(n_pmds * sizeof(pmd_t), PAGE_SIZE);
-       pte_t *pte = memblock_alloc(n_pages * sizeof(pte_t), PAGE_SIZE);
  
         for (i = 0; i < n_pages; i++) {
                 phys_addr_t phys = memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE);
-
-               set_pte(pte + i, pfn_pte(PHYS_PFN(phys), PAGE_KERNEL));
+               set_pte(&pte[i], pfn_pte(PHYS_PFN(phys), PAGE_KERNEL));
         }
  
-       for (i = 0; i < n_pmds; ++pgd, i += PTRS_PER_PMD)
-               set_pgd(pgd, pfn_pgd(PFN_DOWN(__pa(((uintptr_t)(pmd + i)))),
+       for (i = 0, offset = 0; i < n_ptes; i++, offset += PTRS_PER_PTE)
+               set_pmd(&pmd[i],
+                       pfn_pmd(PFN_DOWN(__pa(&pte[offset])),
                                 __pgprot(_PAGE_TABLE)));
  
-       for (i = 0; i < n_pages; ++pmd, i += PTRS_PER_PTE)
-               set_pmd(pmd, pfn_pmd(PFN_DOWN(__pa((uintptr_t)(pte + i))),
+       for (i = 0, offset = 0; i < n_pmds; i++, offset += PTRS_PER_PMD)
+               set_pgd(&pgd[i],
+                       pfn_pgd(PFN_DOWN(__pa(&pmd[offset])),
                                 __pgprot(_PAGE_TABLE)));
  
         flush_tlb_all();
@@ -81,7 +89,8 @@ void __init kasan_init(void)
         unsigned long i;
  
         kasan_populate_early_shadow((void *)KASAN_SHADOW_START,
-                       (void *)kasan_mem_to_shadow((void *)VMALLOC_END));
+                                   (void *)kasan_mem_to_shadow((void *)
+                                                               VMALLOC_END));
  
         for_each_memblock(memory, reg) {
                 void *start = (void *)__va(reg->base);
@@ -90,14 +99,14 @@ void __init kasan_init(void)
                 if (start >= end)
                         break;
  
-               populate(kasan_mem_to_shadow(start),
-                        kasan_mem_to_shadow(end));
+               populate(kasan_mem_to_shadow(start), kasan_mem_to_shadow(end));
         };
  
         for (i = 0; i < PTRS_PER_PTE; i++)
                 set_pte(&kasan_early_shadow_pte[i],
                         mk_pte(virt_to_page(kasan_early_shadow_page),
-                       __pgprot(_PAGE_PRESENT | _PAGE_READ | _PAGE_ACCESSED)));
+                              __pgprot(_PAGE_PRESENT | _PAGE_READ |
+                                       _PAGE_ACCESSED)));
  
         memset(kasan_early_shadow_page, 0, PAGE_SIZE);
         init_task.kasan_depth = 0;
diff --git a/arch/s390/Makefile b/arch/s390/Makefile
index e0e3a465bbfd6fd15eb44ab8675af10c171c2e01..8dfa2cf1f05c7a9028683aa1900e0975d27863e9 100644
--- a/arch/s390/Makefile
+++ b/arch/s390/Makefile
@@ -146,7 +146,7 @@ all: bzImage
  #KBUILD_IMAGE is necessary for packaging targets like rpm-pkg, deb-pkg...
  KBUILD_IMAGE   := $(boot)/bzImage
  
-install: vmlinux
+install:
         $(Q)$(MAKE) $(build)=$(boot) $@
  
  bzImage: vmlinux
diff --git a/arch/s390/boot/Makefile b/arch/s390/boot/Makefile
index e2c47d3a1c891b7bf5bb783f156ca902c69f1fdf..0ff9261c915e3fa64369b9e484c3baaed909b6aa 100644
--- a/arch/s390/boot/Makefile
+++ b/arch/s390/boot/Makefile
@@ -70,7 +70,7 @@ $(obj)/compressed/vmlinux: $(obj)/startup.a FORCE
  $(obj)/startup.a: $(OBJECTS) FORCE
         $(call if_changed,ar)
  
-install: $(CONFIGURE) $(obj)/bzImage
+install:
         sh -x  $(srctree)/$(obj)/install.sh $(KERNELRELEASE) $(obj)/bzImage \
               System.map "$(INSTALL_PATH)"
  
diff --git a/arch/s390/boot/kaslr.c b/arch/s390/boot/kaslr.c
index 5d12352545c558d4f5598fd074048029cfd6218a..5591243d673e82cea561c5df4c9e6ac19c55ed6f 100644
--- a/arch/s390/boot/kaslr.c
+++ b/arch/s390/boot/kaslr.c
@@ -75,7 +75,7 @@ static unsigned long get_random(unsigned long limit)
                 *(unsigned long *) prng.parm_block ^= seed;
                 for (i = 0; i < 16; i++) {
                         cpacf_kmc(CPACF_KMC_PRNG, prng.parm_block,
-                                 (char *) entropy, (char *) entropy,
+                                 (u8 *) entropy, (u8 *) entropy,
                                   sizeof(entropy));
                         memcpy(prng.parm_block, entropy, sizeof(entropy));
                 }
diff --git a/arch/s390/boot/uv.c b/arch/s390/boot/uv.c
index ed007f4a6444ff84b7c7abac01b442459e619458..3f501159ee9fc2ac1200219337859a03bb02c99f 100644
--- a/arch/s390/boot/uv.c
+++ b/arch/s390/boot/uv.c
@@ -15,7 +15,8 @@ void uv_query_info(void)
         if (!test_facility(158))
                 return;
  
-       if (uv_call(0, (uint64_t)&uvcb))
+       /* rc==0x100 means that there is additional data we do not process */
+       if (uv_call(0, (uint64_t)&uvcb) && uvcb.header.rc != 0x100)
                 return;
  
         if (test_bit_inv(BIT_UVC_CMD_SET_SHARED_ACCESS, (unsigned long *)uvcb.inst_calls_list) &&
diff --git a/arch/s390/configs/debug_defconfig b/arch/s390/configs/debug_defconfig
index 2e60c80395ab084afbe4713fbc8816f42bea3f60..0c86ba19fa2bcdb0ca9ce4627366b59714b379eb 100644
--- a/arch/s390/configs/debug_defconfig
+++ b/arch/s390/configs/debug_defconfig
@@ -53,6 +53,7 @@ CONFIG_VFIO_AP=m
  CONFIG_CRASH_DUMP=y
  CONFIG_HIBERNATION=y
  CONFIG_PM_DEBUG=y
+CONFIG_PROTECTED_VIRTUALIZATION_GUEST=y
  CONFIG_CMM=m
  CONFIG_APPLDATA_BASE=y
  CONFIG_KVM=m
@@ -474,7 +475,6 @@ CONFIG_NLMON=m
  # CONFIG_NET_VENDOR_EMULEX is not set
  # CONFIG_NET_VENDOR_EZCHIP is not set
  # CONFIG_NET_VENDOR_GOOGLE is not set
-# CONFIG_NET_VENDOR_HP is not set
  # CONFIG_NET_VENDOR_HUAWEI is not set
  # CONFIG_NET_VENDOR_INTEL is not set
  # CONFIG_NET_VENDOR_MARVELL is not set
@@ -684,7 +684,6 @@ CONFIG_CRYPTO_ADIANTUM=m
  CONFIG_CRYPTO_XCBC=m
  CONFIG_CRYPTO_VMAC=m
  CONFIG_CRYPTO_CRC32=m
-CONFIG_CRYPTO_XXHASH=m
  CONFIG_CRYPTO_MICHAEL_MIC=m
  CONFIG_CRYPTO_RMD128=m
  CONFIG_CRYPTO_RMD160=m
@@ -748,7 +747,6 @@ CONFIG_DEBUG_INFO_DWARF4=y
  CONFIG_GDB_SCRIPTS=y
  CONFIG_FRAME_WARN=1024
  CONFIG_HEADERS_INSTALL=y
-CONFIG_HEADERS_CHECK=y
  CONFIG_DEBUG_SECTION_MISMATCH=y
  CONFIG_MAGIC_SYSRQ=y
  CONFIG_DEBUG_PAGEALLOC=y
@@ -772,9 +770,9 @@ CONFIG_DEBUG_MEMORY_INIT=y
  CONFIG_MEMORY_NOTIFIER_ERROR_INJECT=m
  CONFIG_DEBUG_PER_CPU_MAPS=y
  CONFIG_DEBUG_SHIRQ=y
+CONFIG_PANIC_ON_OOPS=y
  CONFIG_DETECT_HUNG_TASK=y
  CONFIG_WQ_WATCHDOG=y
-CONFIG_PANIC_ON_OOPS=y
  CONFIG_DEBUG_TIMEKEEPING=y
  CONFIG_PROVE_LOCKING=y
  CONFIG_LOCK_STAT=y
@@ -783,9 +781,20 @@ CONFIG_DEBUG_ATOMIC_SLEEP=y
  CONFIG_DEBUG_LOCKING_API_SELFTESTS=y
  CONFIG_DEBUG_SG=y
  CONFIG_DEBUG_NOTIFIERS=y
+CONFIG_BUG_ON_DATA_CORRUPTION=y
  CONFIG_DEBUG_CREDENTIALS=y
  CONFIG_RCU_TORTURE_TEST=m
  CONFIG_RCU_CPU_STALL_TIMEOUT=300
+CONFIG_LATENCYTOP=y
+CONFIG_FUNCTION_PROFILER=y
+CONFIG_STACK_TRACER=y
+CONFIG_IRQSOFF_TRACER=y
+CONFIG_PREEMPT_TRACER=y
+CONFIG_SCHED_TRACER=y
+CONFIG_FTRACE_SYSCALLS=y
+CONFIG_BLK_DEV_IO_TRACE=y
+CONFIG_HIST_TRIGGERS=y
+CONFIG_S390_PTDUMP=y
  CONFIG_NOTIFIER_ERROR_INJECTION=m
  CONFIG_NETDEV_NOTIFIER_ERROR_INJECT=m
  CONFIG_FAULT_INJECTION=y
@@ -796,15 +805,6 @@ CONFIG_FAIL_IO_TIMEOUT=y
  CONFIG_FAIL_FUTEX=y
  CONFIG_FAULT_INJECTION_DEBUG_FS=y
  CONFIG_FAULT_INJECTION_STACKTRACE_FILTER=y
-CONFIG_LATENCYTOP=y
-CONFIG_IRQSOFF_TRACER=y
-CONFIG_PREEMPT_TRACER=y
-CONFIG_SCHED_TRACER=y
-CONFIG_FTRACE_SYSCALLS=y
-CONFIG_STACK_TRACER=y
-CONFIG_BLK_DEV_IO_TRACE=y
-CONFIG_FUNCTION_PROFILER=y
-CONFIG_HIST_TRIGGERS=y
  CONFIG_LKDTM=m
  CONFIG_TEST_LIST_SORT=y
  CONFIG_TEST_SORT=y
@@ -814,5 +814,3 @@ CONFIG_INTERVAL_TREE_TEST=m
  CONFIG_PERCPU_TEST=m
  CONFIG_ATOMIC64_SELFTEST=y
  CONFIG_TEST_BPF=m
-CONFIG_BUG_ON_DATA_CORRUPTION=y
-CONFIG_S390_PTDUMP=y
diff --git a/arch/s390/configs/defconfig b/arch/s390/configs/defconfig
index 25f79984958219506566216ca4a34db0e49a31e4..6b27d861a9a306e03cb7f319339cabf16a0c6de3 100644
--- a/arch/s390/configs/defconfig
+++ b/arch/s390/configs/defconfig
@@ -53,6 +53,7 @@ CONFIG_VFIO_AP=m
  CONFIG_CRASH_DUMP=y
  CONFIG_HIBERNATION=y
  CONFIG_PM_DEBUG=y
+CONFIG_PROTECTED_VIRTUALIZATION_GUEST=y
  CONFIG_CMM=m
  CONFIG_APPLDATA_BASE=y
  CONFIG_KVM=m
@@ -470,7 +471,6 @@ CONFIG_NLMON=m
  # CONFIG_NET_VENDOR_EMULEX is not set
  # CONFIG_NET_VENDOR_EZCHIP is not set
  # CONFIG_NET_VENDOR_GOOGLE is not set
-# CONFIG_NET_VENDOR_HP is not set
  # CONFIG_NET_VENDOR_HUAWEI is not set
  # CONFIG_NET_VENDOR_INTEL is not set
  # CONFIG_NET_VENDOR_MARVELL is not set
@@ -677,7 +677,6 @@ CONFIG_CRYPTO_ADIANTUM=m
  CONFIG_CRYPTO_XCBC=m
  CONFIG_CRYPTO_VMAC=m
  CONFIG_CRYPTO_CRC32=m
-CONFIG_CRYPTO_XXHASH=m
  CONFIG_CRYPTO_MICHAEL_MIC=m
  CONFIG_CRYPTO_RMD128=m
  CONFIG_CRYPTO_RMD160=m
@@ -739,18 +738,18 @@ CONFIG_DEBUG_SECTION_MISMATCH=y
  CONFIG_MAGIC_SYSRQ=y
  CONFIG_DEBUG_MEMORY_INIT=y
  CONFIG_PANIC_ON_OOPS=y
+CONFIG_BUG_ON_DATA_CORRUPTION=y
  CONFIG_RCU_TORTURE_TEST=m
  CONFIG_RCU_CPU_STALL_TIMEOUT=60
  CONFIG_LATENCYTOP=y
+CONFIG_FUNCTION_PROFILER=y
+CONFIG_STACK_TRACER=y
  CONFIG_SCHED_TRACER=y
  CONFIG_FTRACE_SYSCALLS=y
-CONFIG_STACK_TRACER=y
  CONFIG_BLK_DEV_IO_TRACE=y
-CONFIG_FUNCTION_PROFILER=y
  CONFIG_HIST_TRIGGERS=y
+CONFIG_S390_PTDUMP=y
  CONFIG_LKDTM=m
  CONFIG_PERCPU_TEST=m
  CONFIG_ATOMIC64_SELFTEST=y
  CONFIG_TEST_BPF=m
-CONFIG_BUG_ON_DATA_CORRUPTION=y
-CONFIG_S390_PTDUMP=y
diff --git a/arch/s390/include/asm/page.h b/arch/s390/include/asm/page.h
index 85e944f04c70ec68a64fd25622d1faf4cfdd5f90..1019efd85b9dc952126fa455dcde0fa7187ab112 100644
--- a/arch/s390/include/asm/page.h
+++ b/arch/s390/include/asm/page.h
@@ -42,7 +42,7 @@ void __storage_key_init_range(unsigned long start, unsigned long end);
  
  static inline void storage_key_init_range(unsigned long start, unsigned long end)
  {
-       if (PAGE_DEFAULT_KEY)
+       if (PAGE_DEFAULT_KEY != 0)
                 __storage_key_init_range(start, end);
  }
  
diff --git a/arch/s390/include/asm/processor.h b/arch/s390/include/asm/processor.h
index 361ef5eda46895270f781407cbd684dd5eb1ac1e..aadb3d0e2adc767c862f4455a6a9c3949691b856 100644
--- a/arch/s390/include/asm/processor.h
+++ b/arch/s390/include/asm/processor.h
@@ -84,7 +84,6 @@ void s390_update_cpu_mhz(void);
  void cpu_detect_mhz_feature(void);
  
  extern const struct seq_operations cpuinfo_op;
-extern int sysctl_ieee_emulation_warnings;
  extern void execve_tail(void);
  extern void __bpon(void);
  
diff --git a/arch/s390/include/asm/qdio.h b/arch/s390/include/asm/qdio.h
index 71e3f0146cda084920c40b7736c01d45e5b0a594..1e3517b0518beb27d8dc7e2d7ea2b378d4f36993 100644
--- a/arch/s390/include/asm/qdio.h
+++ b/arch/s390/include/asm/qdio.h
@@ -201,7 +201,7 @@ struct slib {
   * @scount: SBAL count
   * @sflags: whole SBAL flags
   * @length: length
- * @addr: address
+ * @addr: absolute data address
  */
  struct qdio_buffer_element {
         u8 eflags;
@@ -211,7 +211,7 @@ struct qdio_buffer_element {
         u8 scount;
         u8 sflags;
         u32 length;
-       void *addr;
+       u64 addr;
  } __attribute__ ((packed, aligned(16)));
  
  /**
@@ -227,7 +227,7 @@ struct qdio_buffer {
   * @sbal: absolute SBAL address
   */
  struct sl_element {
-       unsigned long sbal;
+       u64 sbal;
  } __attribute__ ((packed));
  
  /**
diff --git a/arch/s390/include/asm/timex.h b/arch/s390/include/asm/timex.h
index 670f14a228e55bb42c4cc516b660f91431a4cb0a..6bf3a45ccfec203e2234cfa0883c5fdef6749fdc 100644
--- a/arch/s390/include/asm/timex.h
+++ b/arch/s390/include/asm/timex.h
@@ -155,7 +155,7 @@ static inline void get_tod_clock_ext(char *clk)
  
  static inline unsigned long long get_tod_clock(void)
  {
-       unsigned char clk[STORE_CLOCK_EXT_SIZE];
+       char clk[STORE_CLOCK_EXT_SIZE];
  
         get_tod_clock_ext(clk);
         return *((unsigned long long *)&clk[1]);
diff --git a/arch/x86/boot/compressed/kaslr_64.c b/arch/x86/boot/compressed/kaslr_64.c
index 748456c365f4691af041753c63d6991f9bc8b4b8..9557c5a15b91e29a6465e502599960aad87e3e60 100644
--- a/arch/x86/boot/compressed/kaslr_64.c
+++ b/arch/x86/boot/compressed/kaslr_64.c
@@ -29,9 +29,6 @@
  #define __PAGE_OFFSET __PAGE_OFFSET_BASE
  #include "../../mm/ident_map.c"
  
-/* Used by pgtable.h asm code to force instruction serialization. */
-unsigned long __force_order;
-
  /* Used to track our page table allocation area. */
  struct alloc_pgt_data {
         unsigned char *pgt_buf;
diff --git a/arch/x86/include/asm/io_bitmap.h b/arch/x86/include/asm/io_bitmap.h
index 02c6ef8f7667725b2730e049a2a51d78caec650e..07344d82e88ee6b28e4bb2040932eb1135a3b4a3 100644
--- a/arch/x86/include/asm/io_bitmap.h
+++ b/arch/x86/include/asm/io_bitmap.h
@@ -19,7 +19,14 @@ struct task_struct;
  void io_bitmap_share(struct task_struct *tsk);
  void io_bitmap_exit(void);
  
-void tss_update_io_bitmap(void);
+void native_tss_update_io_bitmap(void);
+
+#ifdef CONFIG_PARAVIRT_XXL
+#include <asm/paravirt.h>
+#else
+#define tss_update_io_bitmap native_tss_update_io_bitmap
+#endif
+
  #else
  static inline void io_bitmap_share(struct task_struct *tsk) { }
  static inline void io_bitmap_exit(void) { }
diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
index 03946eb3e2b9e5162c6f20fff85bf1747ad8431a..2a8f2bd2e5cfe846cd82073fe7677695b05277e9 100644
--- a/arch/x86/include/asm/kvm_emulate.h
+++ b/arch/x86/include/asm/kvm_emulate.h
@@ -292,6 +292,14 @@ enum x86emul_mode {
  #define X86EMUL_SMM_MASK             (1 << 6)
  #define X86EMUL_SMM_INSIDE_NMI_MASK  (1 << 7)
  
+/*
+ * fastop functions are declared as taking a never-defined fastop parameter,
+ * so they can't be called from C directly.
+ */
+struct fastop;
+
+typedef void (*fastop_t)(struct fastop *);
+
  struct x86_emulate_ctxt {
         const struct x86_emulate_ops *ops;
  
@@ -324,7 +332,10 @@ struct x86_emulate_ctxt {
         struct operand src;
         struct operand src2;
         struct operand dst;
-       int (*execute)(struct x86_emulate_ctxt *ctxt);
+       union {
+               int (*execute)(struct x86_emulate_ctxt *ctxt);
+               fastop_t fop;
+       };
         int (*check_perm)(struct x86_emulate_ctxt *ctxt);
         /*
          * The following six fields are cleared together,
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4dffbc10d3f8970dbe750cbd4f453c6bd94e56c6..98959e8cd4489877946d091a9bb5ee297790dbe3 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -781,9 +781,19 @@ struct kvm_vcpu_arch {
         u64 msr_kvm_poll_control;
  
         /*
-        * Indicate whether the access faults on its page table in guest
-        * which is set when fix page fault and used to detect unhandeable
-        * instruction.
+        * Indicates the guest is trying to write a gfn that contains one or
+        * more of the PTEs used to translate the write itself, i.e. the access
+        * is changing its own translation in the guest page tables.  KVM exits
+        * to userspace if emulation of the faulting instruction fails and this
+        * flag is set, as KVM cannot make forward progress.
+        *
+        * If emulation fails for a write to guest page tables, KVM unprotects
+        * (zaps) the shadow page for the target gfn and resumes the guest to
+        * retry the non-emulatable instruction (on hardware).  Unprotecting the
+        * gfn doesn't allow forward progress for a self-changing access because
+        * doing so also zaps the translation for the gfn, i.e. retrying the
+        * instruction will hit a !PRESENT fault, which results in a new shadow
+        * page and sends KVM back to square one.
          */
         bool write_fault_to_shadow_pgtable;
  
@@ -1112,6 +1122,7 @@ struct kvm_x86_ops {
         int (*handle_exit)(struct kvm_vcpu *vcpu,
                 enum exit_fastpath_completion exit_fastpath);
         int (*skip_emulated_instruction)(struct kvm_vcpu *vcpu);
+       void (*update_emulated_instruction)(struct kvm_vcpu *vcpu);
         void (*set_interrupt_shadow)(struct kvm_vcpu *vcpu, int mask);
         u32 (*get_interrupt_shadow)(struct kvm_vcpu *vcpu);
         void (*patch_hypercall)(struct kvm_vcpu *vcpu,
@@ -1136,7 +1147,7 @@ struct kvm_x86_ops {
         void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap);
         void (*set_virtual_apic_mode)(struct kvm_vcpu *vcpu);
         void (*set_apic_access_page_addr)(struct kvm_vcpu *vcpu, hpa_t hpa);
-       void (*deliver_posted_interrupt)(struct kvm_vcpu *vcpu, int vector);
+       int (*deliver_posted_interrupt)(struct kvm_vcpu *vcpu, int vector);
         int (*sync_pir_to_irr)(struct kvm_vcpu *vcpu);
         int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
         int (*set_identity_map_addr)(struct kvm *kvm, u64 ident_addr);
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h

index ebe1685e92dda2bfd6795b45a92924de8a8f9451..d5e517d1c3ddc5c9ac6e594500d87524168b143d 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -512,6 +512,8 @@
  #define MSR_K7_HWCR                    0xc0010015
  #define MSR_K7_HWCR_SMMLOCK_BIT                0
  #define MSR_K7_HWCR_SMMLOCK            BIT_ULL(MSR_K7_HWCR_SMMLOCK_BIT)
+#define MSR_K7_HWCR_IRPERF_EN_BIT      30
+#define MSR_K7_HWCR_IRPERF_EN          BIT_ULL(MSR_K7_HWCR_IRPERF_EN_BIT)
  #define MSR_K7_FID_VID_CTL             0xc0010041
  #define MSR_K7_FID_VID_STATUS          0xc0010042
  
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h

index 86e7317eb31f9a11c3f3b638f0882929b76af777..694d8daf498376ef7e91a1a3026638e275378cab 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -295,6 +295,13 @@ static inline void write_idt_entry(gate_desc *dt, int entry, const gate_desc *g)
         PVOP_VCALL3(cpu.write_idt_entry, dt, entry, g);
  }
  
+#ifdef CONFIG_X86_IOPL_IOPERM
+static inline void tss_update_io_bitmap(void)
+{
+       PVOP_VCALL0(cpu.update_io_bitmap);
+}
+#endif
+
  static inline void paravirt_activate_mm(struct mm_struct *prev,
                                         struct mm_struct *next)
  {
diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h

index 84812964d3dd6f0ae5a68763251b4bd4f8af5f7e..732f62e04ddb851f47248013fac01bd62633aaf7 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -140,6 +140,10 @@ struct pv_cpu_ops {
  
         void (*load_sp0)(unsigned long sp0);
  
+#ifdef CONFIG_X86_IOPL_IOPERM
+       void (*update_io_bitmap)(void);
+#endif
+
         void (*wbinvd)(void);
  
         /* cpuid emulation, mostly so that caps bits can be disabled */
diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h

index 2a85287b368521b317529e2bbd619287d8b85481..8521af3fef27f4fb103de37879c14724a9c3fe69 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -72,7 +72,7 @@
  #define SECONDARY_EXEC_MODE_BASED_EPT_EXEC     VMCS_CONTROL_BIT(MODE_BASED_EPT_EXEC)
  #define SECONDARY_EXEC_PT_USE_GPA              VMCS_CONTROL_BIT(PT_USE_GPA)
  #define SECONDARY_EXEC_TSC_SCALING              VMCS_CONTROL_BIT(TSC_SCALING)
-#define SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE   0x04000000
+#define SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE   VMCS_CONTROL_BIT(USR_WAIT_PAUSE)
  
  #define PIN_BASED_EXT_INTR_MASK                 VMCS_CONTROL_BIT(INTR_EXITING)
  #define PIN_BASED_NMI_EXITING                   VMCS_CONTROL_BIT(NMI_EXITING)
diff --git a/arch/x86/include/asm/vmxfeatures.h b/arch/x86/include/asm/vmxfeatures.h

index a50e4a0de31538fb90e1e5c98f05f68cfc153e9e..9915990fd8cfab51a9cd287017e2d71d93f1f3f0 100644
--- a/arch/x86/include/asm/vmxfeatures.h
+++ b/arch/x86/include/asm/vmxfeatures.h
@@ -81,6 +81,7 @@
  #define VMX_FEATURE_MODE_BASED_EPT_EXEC        ( 2*32+ 22) /* "ept_mode_based_exec" Enable separate EPT EXEC bits for supervisor vs. user */
  #define VMX_FEATURE_PT_USE_GPA         ( 2*32+ 24) /* "" Processor Trace logs GPAs */
  #define VMX_FEATURE_TSC_SCALING                ( 2*32+ 25) /* Scale hardware TSC when read in guest */
+#define VMX_FEATURE_USR_WAIT_PAUSE     ( 2*32+ 26) /* Enable TPAUSE, UMONITOR, UMWAIT in guest */
  #define VMX_FEATURE_ENCLV_EXITING      ( 2*32+ 28) /* "" VM-Exit on ENCLV (leaf dependent) */
  
  #endif /* _ASM_X86_VMXFEATURES_H */
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h

index 503d3f42da1676791d2c4f4a70bfad35743daf4c..3f3f780c8c6500e1a1ea52bc0585af93699572fe 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -390,6 +390,7 @@ struct kvm_sync_regs {
  #define KVM_STATE_NESTED_GUEST_MODE    0x00000001
  #define KVM_STATE_NESTED_RUN_PENDING   0x00000002
  #define KVM_STATE_NESTED_EVMCS         0x00000004
+#define KVM_STATE_NESTED_MTF_PENDING   0x00000008
  
  #define KVM_STATE_NESTED_SMM_GUEST_MODE        0x00000001
  #define KVM_STATE_NESTED_SMM_VMXON     0x00000002
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c

index ac83a0fef6285d9901bf44b2b5169ab9fef89698..1f875fbe13846a9bfc00582173e6aedc571cd3ca 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -28,6 +28,7 @@
  
  static const int amd_erratum_383[];
  static const int amd_erratum_400[];
+static const int amd_erratum_1054[];
  static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum);
  
  /*
@@ -972,6 +973,15 @@ static void init_amd(struct cpuinfo_x86 *c)
         /* AMD CPUs don't reset SS attributes on SYSRET, Xen does. */
         if (!cpu_has(c, X86_FEATURE_XENPV))
                 set_cpu_bug(c, X86_BUG_SYSRET_SS_ATTRS);
+
+       /*
+        * Turn on the Instructions Retired free counter on machines not
+        * susceptible to erratum #1054 "Instructions Retired Performance
+        * Counter May Be Inaccurate".
+        */
+       if (cpu_has(c, X86_FEATURE_IRPERF) &&
+           !cpu_has_amd_erratum(c, amd_erratum_1054))
+               msr_set_bit(MSR_K7_HWCR, MSR_K7_HWCR_IRPERF_EN_BIT);
  }
  
  #ifdef CONFIG_X86_32
@@ -1099,6 +1109,10 @@ static const int amd_erratum_400[] =
  static const int amd_erratum_383[] =
         AMD_OSVW_ERRATUM(3, AMD_MODEL_RANGE(0x10, 0, 0, 0xff, 0xf));
  
+/* #1054: Instructions Retired Performance Counter May Be Inaccurate */
+static const int amd_erratum_1054[] =
+       AMD_OSVW_ERRATUM(0, AMD_MODEL_RANGE(0x17, 0, 0, 0x2f, 0xf));
+
  
  static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum)
  {
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c

index 52c9bfbbdb2a04dbdbd7288de4bcfe7be8f44851..4cdb123ff66a8dddd24a69987f78c37af7e28b8a 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -445,7 +445,7 @@ static __always_inline void setup_pku(struct cpuinfo_x86 *c)
          * cpuid bit to be set.  We need to ensure that we
          * update that bit in this CPU's "cpu_info".
          */
-       get_cpu_cap(c);
+       set_cpu_cap(c, X86_FEATURE_OSPKE);
  }
  
  #ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c

index b3a50d962851cec722df10623dbc90e6a3f16985..52de616a806559d6e0a25a5f65e3bef721432f48 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -1163,9 +1163,12 @@ static const struct sysfs_ops threshold_ops = {
         .store                  = store,
  };
  
+static void threshold_block_release(struct kobject *kobj);
+
  static struct kobj_type threshold_ktype = {
         .sysfs_ops              = &threshold_ops,
         .default_attrs          = default_attrs,
+       .release                = threshold_block_release,
  };
  
  static const char *get_name(unsigned int bank, struct threshold_block *b)
@@ -1198,8 +1201,9 @@ static const char *get_name(unsigned int bank, struct threshold_block *b)
         return buf_mcatype;
  }
  
-static int allocate_threshold_blocks(unsigned int cpu, unsigned int bank,
-                                    unsigned int block, u32 address)
+static int allocate_threshold_blocks(unsigned int cpu, struct threshold_bank *tb,
+                                    unsigned int bank, unsigned int block,
+                                    u32 address)
  {
         struct threshold_block *b = NULL;
         u32 low, high;
@@ -1243,16 +1247,12 @@ static int allocate_threshold_blocks(unsigned int cpu, unsigned int bank,
  
         INIT_LIST_HEAD(&b->miscj);
  
-       if (per_cpu(threshold_banks, cpu)[bank]->blocks) {
-               list_add(&b->miscj,
-                        &per_cpu(threshold_banks, cpu)[bank]->blocks->miscj);
-       } else {
-               per_cpu(threshold_banks, cpu)[bank]->blocks = b;
-       }
+       if (tb->blocks)
+               list_add(&b->miscj, &tb->blocks->miscj);
+       else
+               tb->blocks = b;
  
-       err = kobject_init_and_add(&b->kobj, &threshold_ktype,
-                                  per_cpu(threshold_banks, cpu)[bank]->kobj,
-                                  get_name(bank, b));
+       err = kobject_init_and_add(&b->kobj, &threshold_ktype, tb->kobj, get_name(bank, b));
         if (err)
                 goto out_free;
  recurse:
@@ -1260,7 +1260,7 @@ recurse:
         if (!address)
                 return 0;
  
-       err = allocate_threshold_blocks(cpu, bank, block, address);
+       err = allocate_threshold_blocks(cpu, tb, bank, block, address);
         if (err)
                 goto out_free;
  
@@ -1345,8 +1345,6 @@ static int threshold_create_bank(unsigned int cpu, unsigned int bank)
                 goto out_free;
         }
  
-       per_cpu(threshold_banks, cpu)[bank] = b;
-
         if (is_shared_bank(bank)) {
                 refcount_set(&b->cpus, 1);
  
@@ -1357,9 +1355,13 @@ static int threshold_create_bank(unsigned int cpu, unsigned int bank)
                 }
         }
  
-       err = allocate_threshold_blocks(cpu, bank, 0, msr_ops.misc(bank));
-       if (!err)
-               goto out;
+       err = allocate_threshold_blocks(cpu, b, bank, 0, msr_ops.misc(bank));
+       if (err)
+               goto out_free;
+
+       per_cpu(threshold_banks, cpu)[bank] = b;
+
+       return 0;
  
   out_free:
         kfree(b);
@@ -1368,8 +1370,12 @@ static int threshold_create_bank(unsigned int cpu, unsigned int bank)
         return err;
  }
  
-static void deallocate_threshold_block(unsigned int cpu,
-                                                unsigned int bank)
+static void threshold_block_release(struct kobject *kobj)
+{
+       kfree(to_block(kobj));
+}
+
+static void deallocate_threshold_block(unsigned int cpu, unsigned int bank)
  {
         struct threshold_block *pos = NULL;
         struct threshold_block *tmp = NULL;
@@ -1379,13 +1385,11 @@ static void deallocate_threshold_block(unsigned int cpu,
                 return;
  
         list_for_each_entry_safe(pos, tmp, &head->blocks->miscj, miscj) {
-               kobject_put(&pos->kobj);
                 list_del(&pos->miscj);
-               kfree(pos);
+               kobject_put(&pos->kobj);
         }
  
-       kfree(per_cpu(threshold_banks, cpu)[bank]->blocks);
-       per_cpu(threshold_banks, cpu)[bank]->blocks = NULL;
+       kobject_put(&head->blocks->kobj);
  }
  
  static void __threshold_remove_blocks(struct threshold_bank *b)
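Note on the arch/x86/kernel/cpu/mce/amd.c hunks above: the fix moves the kfree() of a threshold block into the ktype's release callback and reorders teardown so that the block is unlinked before its last reference is dropped. The pattern can be sketched in userspace C as follows. This is only an illustration under stated assumptions: struct obj, struct block, obj_get(), obj_put(), blocks_destroy() and the released counter are hypothetical stand-ins, not the kernel's kobject API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a kobject: a refcount plus a release hook. */
struct obj {
	int refcount;
	void (*release)(struct obj *o);
};

/* Hypothetical stand-in for threshold_block: embeds the refcounted object. */
struct block {
	struct obj kobj;	/* first member, so a cast recovers the block */
	struct block *next;
};

static int released;	/* counts release calls, for illustration only */

/* The only place that frees a block, mirroring a ktype .release hook. */
static void block_release(struct obj *o)
{
	free((struct block *)o);
	released++;
}

static struct block *block_new(void)
{
	struct block *b = calloc(1, sizeof(*b));

	if (!b)
		return NULL;
	b->kobj.refcount = 1;
	b->kobj.release = block_release;
	return b;
}

static void obj_get(struct obj *o)
{
	o->refcount++;
}

static void obj_put(struct obj *o)
{
	assert(o->refcount > 0);
	if (--o->refcount == 0)
		o->release(o);	/* memory is gone after this call */
}

/* Unlink each block first, then drop the reference; never free directly. */
static void blocks_destroy(struct block *head)
{
	while (head) {
		struct block *next = head->next;

		head->next = NULL;
		obj_put(&head->kobj);
		head = next;
	}
}
```

The point of the ordering is that nothing touches a block after the put, because the put may be the call that frees it. A block that still has another holder simply survives until that holder drops its reference.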
diff --git a/arch/x86/kernel/ima_arch.c b/arch/x86/kernel/ima_arch.c

index 4d4f5d9faac314ad5fe5c21a05e72c873ec42d65..23054909c8ddfa4258fc9004877f6ceb88ef0786 100644
--- a/arch/x86/kernel/ima_arch.c
+++ b/arch/x86/kernel/ima_arch.c
@@ -10,8 +10,6 @@ extern struct boot_params boot_params;
  
  static enum efi_secureboot_mode get_sb_mode(void)
  {
-       efi_char16_t efi_SecureBoot_name[] = L"SecureBoot";
-       efi_char16_t efi_SetupMode_name[] = L"SecureBoot";
         efi_guid_t efi_variable_guid = EFI_GLOBAL_VARIABLE_GUID;
         efi_status_t status;
         unsigned long size;
@@ -25,7 +23,7 @@ static enum efi_secureboot_mode get_sb_mode(void)
         }
  
         /* Get variable contents into buffer */
-       status = efi.get_variable(efi_SecureBoot_name, &efi_variable_guid,
+       status = efi.get_variable(L"SecureBoot", &efi_variable_guid,
                                   NULL, &size, &secboot);
         if (status == EFI_NOT_FOUND) {
                 pr_info("ima: secureboot mode disabled\n");
@@ -38,7 +36,7 @@ static enum efi_secureboot_mode get_sb_mode(void)
         }
  
         size = sizeof(setupmode);
-       status = efi.get_variable(efi_SetupMode_name, &efi_variable_guid,
+       status = efi.get_variable(L"SetupMode", &efi_variable_guid,
                                   NULL, &size, &setupmode);
  
         if (status != EFI_SUCCESS)      /* ignore unknown SetupMode */
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c

index d817f255aed8e0b083bb04dfbce257c8c3b02a58..6efe0410fb728995ea8944ecd984840b6f30778e 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -425,7 +425,29 @@ static void __init sev_map_percpu_data(void)
         }
  }
  
+static bool pv_tlb_flush_supported(void)
+{
+       return (kvm_para_has_feature(KVM_FEATURE_PV_TLB_FLUSH) &&
+               !kvm_para_has_hint(KVM_HINTS_REALTIME) &&
+               kvm_para_has_feature(KVM_FEATURE_STEAL_TIME));
+}
+
+static DEFINE_PER_CPU(cpumask_var_t, __pv_cpu_mask);
+
  #ifdef CONFIG_SMP
+
+static bool pv_ipi_supported(void)
+{
+       return kvm_para_has_feature(KVM_FEATURE_PV_SEND_IPI);
+}
+
+static bool pv_sched_yield_supported(void)
+{
+       return (kvm_para_has_feature(KVM_FEATURE_PV_SCHED_YIELD) &&
+               !kvm_para_has_hint(KVM_HINTS_REALTIME) &&
+           kvm_para_has_feature(KVM_FEATURE_STEAL_TIME));
+}
+
  #define KVM_IPI_CLUSTER_SIZE   (2 * BITS_PER_LONG)
  
  static void __send_ipi_mask(const struct cpumask *mask, int vector)
@@ -490,12 +512,12 @@ static void kvm_send_ipi_mask(const struct cpumask *mask, int vector)
  static void kvm_send_ipi_mask_allbutself(const struct cpumask *mask, int vector)
  {
         unsigned int this_cpu = smp_processor_id();
-       struct cpumask new_mask;
+       struct cpumask *new_mask = this_cpu_cpumask_var_ptr(__pv_cpu_mask);
         const struct cpumask *local_mask;
  
-       cpumask_copy(&new_mask, mask);
-       cpumask_clear_cpu(this_cpu, &new_mask);
-       local_mask = &new_mask;
+       cpumask_copy(new_mask, mask);
+       cpumask_clear_cpu(this_cpu, new_mask);
+       local_mask = new_mask;
         __send_ipi_mask(local_mask, vector);
  }
  
@@ -575,7 +597,6 @@ static void __init kvm_apf_trap_init(void)
         update_intr_gate(X86_TRAP_PF, async_page_fault);
  }
  
-static DEFINE_PER_CPU(cpumask_var_t, __pv_tlb_mask);
  
  static void kvm_flush_tlb_others(const struct cpumask *cpumask,
                         const struct flush_tlb_info *info)
@@ -583,7 +604,7 @@ static void kvm_flush_tlb_others(const struct cpumask *cpumask,
         u8 state;
         int cpu;
         struct kvm_steal_time *src;
-       struct cpumask *flushmask = this_cpu_cpumask_var_ptr(__pv_tlb_mask);
+       struct cpumask *flushmask = this_cpu_cpumask_var_ptr(__pv_cpu_mask);
  
         cpumask_copy(flushmask, cpumask);
         /*
@@ -619,11 +640,10 @@ static void __init kvm_guest_init(void)
                 pv_ops.time.steal_clock = kvm_steal_clock;
         }
  
-       if (kvm_para_has_feature(KVM_FEATURE_PV_TLB_FLUSH) &&
-           !kvm_para_has_hint(KVM_HINTS_REALTIME) &&
-           kvm_para_has_feature(KVM_FEATURE_STEAL_TIME)) {
+       if (pv_tlb_flush_supported()) {
                 pv_ops.mmu.flush_tlb_others = kvm_flush_tlb_others;
                 pv_ops.mmu.tlb_remove_table = tlb_remove_table;
+               pr_info("KVM setup pv remote TLB flush\n");
         }
  
         if (kvm_para_has_feature(KVM_FEATURE_PV_EOI))
@@ -632,9 +652,7 @@ static void __init kvm_guest_init(void)
  #ifdef CONFIG_SMP
         smp_ops.smp_prepare_cpus = kvm_smp_prepare_cpus;
         smp_ops.smp_prepare_boot_cpu = kvm_smp_prepare_boot_cpu;
-       if (kvm_para_has_feature(KVM_FEATURE_PV_SCHED_YIELD) &&
-           !kvm_para_has_hint(KVM_HINTS_REALTIME) &&
-           kvm_para_has_feature(KVM_FEATURE_STEAL_TIME)) {
+       if (pv_sched_yield_supported()) {
                 smp_ops.send_call_func_ipi = kvm_smp_send_call_func_ipi;
                 pr_info("KVM setup pv sched yield\n");
         }
@@ -700,7 +718,7 @@ static uint32_t __init kvm_detect(void)
  static void __init kvm_apic_init(void)
  {
  #if defined(CONFIG_SMP)
-       if (kvm_para_has_feature(KVM_FEATURE_PV_SEND_IPI))
+       if (pv_ipi_supported())
                 kvm_setup_pv_ipi();
  #endif
  }
@@ -732,26 +750,31 @@ static __init int activate_jump_labels(void)
  }
  arch_initcall(activate_jump_labels);
  
-static __init int kvm_setup_pv_tlb_flush(void)
+static __init int kvm_alloc_cpumask(void)
  {
         int cpu;
+       bool alloc = false;
  
         if (!kvm_para_available() || nopv)
                 return 0;
  
-       if (kvm_para_has_feature(KVM_FEATURE_PV_TLB_FLUSH) &&
-           !kvm_para_has_hint(KVM_HINTS_REALTIME) &&
-           kvm_para_has_feature(KVM_FEATURE_STEAL_TIME)) {
+       if (pv_tlb_flush_supported())
+               alloc = true;
+
+#if defined(CONFIG_SMP)
+       if (pv_ipi_supported())
+               alloc = true;
+#endif
+
+       if (alloc)
                 for_each_possible_cpu(cpu) {
-                       zalloc_cpumask_var_node(per_cpu_ptr(&__pv_tlb_mask, cpu),
+                       zalloc_cpumask_var_node(per_cpu_ptr(&__pv_cpu_mask, cpu),
                                 GFP_KERNEL, cpu_to_node(cpu));
                 }
-               pr_info("KVM setup pv remote TLB flush\n");
-       }
  
         return 0;
  }
-arch_initcall(kvm_setup_pv_tlb_flush);
+arch_initcall(kvm_alloc_cpumask);
  
  #ifdef CONFIG_PARAVIRT_SPINLOCKS
  
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c

index 789f5e4f89defc97f1daaf801c9929d616237b55..c131ba4e70ef8229d2c9f1722f88d100f942489e 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -30,6 +30,7 @@
  #include <asm/timer.h>
  #include <asm/special_insns.h>
  #include <asm/tlb.h>
+#include <asm/io_bitmap.h>
  
  /*
   * nop stub, which must not clobber anything *including the stack* to
@@ -341,6 +342,10 @@ struct paravirt_patch_template pv_ops = {
         .cpu.iret               = native_iret,
         .cpu.swapgs             = native_swapgs,
  
+#ifdef CONFIG_X86_IOPL_IOPERM
+       .cpu.update_io_bitmap   = native_tss_update_io_bitmap,
+#endif
+
         .cpu.start_context_switch       = paravirt_nop,
         .cpu.end_context_switch         = paravirt_nop,
  
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c

index 839b5244e3b7e17767fc69bc610ac7c9a487c7fd..3053c85e0e42db9abbb5261673fa78e897efe20e 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -374,7 +374,7 @@ static void tss_copy_io_bitmap(struct tss_struct *tss, struct io_bitmap *iobm)
  /**
   * tss_update_io_bitmap - Update I/O bitmap before exiting to usermode
   */
-void tss_update_io_bitmap(void)
+void native_tss_update_io_bitmap(void)
  {
         struct tss_struct *tss = this_cpu_ptr(&cpu_tss_rw);
         struct thread_struct *t = &current->thread;
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig

index 991019d5eee1e03c21bd20a6e88ee209696304e0..1bb4927030afd85081a862544c7e29717fdc0fca 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -59,6 +59,19 @@ config KVM
  
           If unsure, say N.
  
+config KVM_WERROR
+       bool "Compile KVM with -Werror"
+       # KASAN may cause the build to fail due to larger frames
+       default y if X86_64 && !KASAN
+       # We use the dependency on !COMPILE_TEST to not be enabled
+       # blindly in allmodconfig or allyesconfig configurations
+       depends on (X86_64 && !KASAN) || !COMPILE_TEST
+       depends on EXPERT
+       help
+         Add -Werror to the build flags for KVM.
+
+         If in doubt, say "N".
+
  config KVM_INTEL
         tristate "KVM for Intel (and compatible) processors support"
         depends on KVM && IA32_FEAT_CTL
diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile

index b19ef421084dff20ba0ce8b61c98c0c8878ef41b..e553f0fdd87d47dbd0fe4dc5cb79f5e5421e0e64 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -1,6 +1,7 @@
  # SPDX-License-Identifier: GPL-2.0
  
  ccflags-y += -Iarch/x86/kvm
+ccflags-$(CONFIG_KVM_WERROR) += -Werror
  
  KVM := ../../../virt/kvm
  
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c

index ddbc61984227c3dbee763ed3759e9a5351f32606..dd19fb3539e0b4b7d1c51581b90695c66d79abe2 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -191,25 +191,6 @@
  #define NR_FASTOP (ilog2(sizeof(ulong)) + 1)
  #define FASTOP_SIZE 8
  
-/*
- * fastop functions have a special calling convention:
- *
- * dst:    rax        (in/out)
- * src:    rdx        (in/out)
- * src2:   rcx        (in)
- * flags:  rflags     (in/out)
- * ex:     rsi        (in:fastop pointer, out:zero if exception)
- *
- * Moreover, they are all exactly FASTOP_SIZE bytes long, so functions for
- * different operand sizes can be reached by calculation, rather than a jump
- * table (which would be bigger than the code).
- *
- * fastop functions are declared as taking a never-defined fastop parameter,
- * so they can't be called from C directly.
- */
-
-struct fastop;
-
  struct opcode {
         u64 flags : 56;
         u64 intercept : 8;
@@ -311,8 +292,19 @@ static void invalidate_registers(struct x86_emulate_ctxt *ctxt)
  #define ON64(x)
  #endif
  
-typedef void (*fastop_t)(struct fastop *);
-
+/*
+ * fastop functions have a special calling convention:
+ *
+ * dst:    rax        (in/out)
+ * src:    rdx        (in/out)
+ * src2:   rcx        (in)
+ * flags:  rflags     (in/out)
+ * ex:     rsi        (in:fastop pointer, out:zero if exception)
+ *
+ * Moreover, they are all exactly FASTOP_SIZE bytes long, so functions for
+ * different operand sizes can be reached by calculation, rather than a jump
+ * table (which would be bigger than the code).
+ */
  static int fastop(struct x86_emulate_ctxt *ctxt, fastop_t fop);
  
  #define __FOP_FUNC(name) \
@@ -5683,7 +5675,7 @@ special_insn:
  
         if (ctxt->execute) {
                 if (ctxt->d & Fastop)
-                       rc = fastop(ctxt, (fastop_t)ctxt->execute);
+                       rc = fastop(ctxt, ctxt->fop);
                 else
                         rc = ctxt->execute(ctxt);
                 if (rc != X86EMUL_CONTINUE)
diff --git a/arch/x86/kvm/irq_comm.c b/arch/x86/kvm/irq_comm.c

index 79afa0bb5f410f701e18370b38f9aeb182f53cfc..c47d2acec52934eae639a8c4b347fbfa660caca6 100644
--- a/arch/x86/kvm/irq_comm.c
+++ b/arch/x86/kvm/irq_comm.c
@@ -417,7 +417,7 @@ void kvm_scan_ioapic_routes(struct kvm_vcpu *vcpu,
  
                         kvm_set_msi_irq(vcpu->kvm, entry, &irq);
  
-                       if (irq.level &&
+                       if (irq.trig_mode &&
                             kvm_apic_match_dest(vcpu, NULL, APIC_DEST_NOSHORT,
                                                 irq.dest_id, irq.dest_mode))
                                 __set_bit(irq.vector, ioapic_handled_vectors);
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c

index eafc631d305cc1951273c8d17fb0c5d17621dd49..e3099c642fecfbbb6cd7b4df4a0df2a9b8091d73 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -627,9 +627,11 @@ static inline bool pv_eoi_enabled(struct kvm_vcpu *vcpu)
  static bool pv_eoi_get_pending(struct kvm_vcpu *vcpu)
  {
         u8 val;
-       if (pv_eoi_get_user(vcpu, &val) < 0)
+       if (pv_eoi_get_user(vcpu, &val) < 0) {
                 printk(KERN_WARNING "Can't read EOI MSR value: 0x%llx\n",
                            (unsigned long long)vcpu->arch.pv_eoi.msr_val);
+               return false;
+       }
         return val & 0x1;
  }
  
@@ -1046,11 +1048,8 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
                                                        apic->regs + APIC_TMR);
                 }
  
-               if (vcpu->arch.apicv_active)
-                       kvm_x86_ops->deliver_posted_interrupt(vcpu, vector);
-               else {
+               if (kvm_x86_ops->deliver_posted_interrupt(vcpu, vector)) {
                         kvm_lapic_set_irr(vector, apic);
-
                         kvm_make_request(KVM_REQ_EVENT, vcpu);
                         kvm_vcpu_kick(vcpu);
                 }
@@ -1080,9 +1079,6 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
                         result = 1;
                         /* assumes that there are only KVM_APIC_INIT/SIPI */
                         apic->pending_events = (1UL << KVM_APIC_INIT);
-                       /* make sure pending_events is visible before sending
-                        * the request */
-                       smp_wmb();
                         kvm_make_request(KVM_REQ_EVENT, vcpu);
                         kvm_vcpu_kick(vcpu);
                 }
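Note on the pv_eoi_get_pending() hunk in arch/x86/kvm/lapic.c above: before the change, a failed read only logged a warning and then fell through to test a value that had never been written. The fixed control flow can be sketched as below. This is a minimal sketch under stated assumptions: read_flag() and flag_pending() are hypothetical names, and read_flag() merely imitates a reader that can fail without touching its output argument.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reader: returns <0 on failure and leaves *val untouched. */
static int read_flag(const uint8_t *src, uint8_t *val)
{
	if (!src)
		return -1;
	*val = *src;
	return 0;
}

/* Report "pending" only if the read succeeded and bit 0 is set. */
static bool flag_pending(const uint8_t *src)
{
	uint8_t val;

	if (read_flag(src, &val) < 0)
		return false;	/* val was never written; do not inspect it */

	return val & 0x1;
}
```

Returning early on the error path keeps the uninitialized local out of the result. Initializing val to zero would give the same answer here, but it would hide the failed read from the control flow.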
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h

index d55674f44a18b52ac81b9daa3088a7ba442fe2cc..a647601c9e1c1ddc06dfb030b81cf74d61b97db6 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -102,6 +102,19 @@ static inline void kvm_mmu_load_cr3(struct kvm_vcpu *vcpu)
                                               kvm_get_active_pcid(vcpu));
  }
  
+int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
+                      bool prefault);
+
+static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
+                                       u32 err, bool prefault)
+{
+#ifdef CONFIG_RETPOLINE
+       if (likely(vcpu->arch.mmu->page_fault == kvm_tdp_page_fault))
+               return kvm_tdp_page_fault(vcpu, cr2_or_gpa, err, prefault);
+#endif
+       return vcpu->arch.mmu->page_fault(vcpu, cr2_or_gpa, err, prefault);
+}
+
  /*
   * Currently, we have two sorts of write-protection, a) the first one
   * write-protects guest page to sync the guest modification, b) another one is
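Note on kvm_mmu_do_page_fault() in the arch/x86/kvm/mmu.h hunk above: under CONFIG_RETPOLINE the helper compares the function pointer against the expected handler and, on a match, calls that handler by name so the compiler can emit a direct call. A minimal sketch of the idea follows. The names fault_fn, tdp_fault(), shadow_fault(), struct mmu and do_page_fault() are hypothetical, and the handlers return arbitrary values chosen only to tell the two paths apart.

```c
#include <assert.h>

typedef int (*fault_fn)(unsigned long addr);

/* Hypothetical common-case handler. */
static int tdp_fault(unsigned long addr)
{
	return (int)(addr & 0xff);
}

/* Hypothetical rare-case handler. */
static int shadow_fault(unsigned long addr)
{
	(void)addr;
	return -1;
}

struct mmu {
	fault_fn page_fault;
};

static int do_page_fault(const struct mmu *mmu, unsigned long addr)
{
	assert(mmu && mmu->page_fault);

	/* Known target: the call below is direct, not through the pointer. */
	if (mmu->page_fault == tdp_fault)
		return tdp_fault(addr);

	/* Fallback: ordinary indirect call for every other handler. */
	return mmu->page_fault(addr);
}
```

The behavior is the same on both paths. Only the call instruction differs, which matters when indirect branches are routed through a retpoline thunk.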
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c

index 7011a4e54866728815cae8deb1c7bdcc5e6320ec..87e9ba27ada14bd7dc932ed518be82e0fb49321a 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4219,8 +4219,8 @@ int kvm_handle_page_fault(struct kvm_vcpu *vcpu, u64 error_code,
  }
  EXPORT_SYMBOL_GPL(kvm_handle_page_fault);
  
-static int tdp_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
-                         bool prefault)
+int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
+                      bool prefault)
  {
         int max_level;
  
@@ -4925,7 +4925,7 @@ static void init_kvm_tdp_mmu(struct kvm_vcpu *vcpu)
                 return;
  
         context->mmu_role.as_u64 = new_role.as_u64;
-       context->page_fault = tdp_page_fault;
+       context->page_fault = kvm_tdp_page_fault;
         context->sync_page = nonpaging_sync_page;
         context->invlpg = nonpaging_invlpg;
         context->update_pte = nonpaging_update_pte;
@@ -5436,9 +5436,8 @@ int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 error_code,
         }
  
         if (r == RET_PF_INVALID) {
-               r = vcpu->arch.mmu->page_fault(vcpu, cr2_or_gpa,
-                                              lower_32_bits(error_code),
-                                              false);
+               r = kvm_mmu_do_page_fault(vcpu, cr2_or_gpa,
+                                         lower_32_bits(error_code), false);
                 WARN_ON(r == RET_PF_INVALID);
         }
  
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h

index 4e1ef047366344c8704651e0b1f3eeddbda74b51..e4c8a4cbf40706367c1ef69b1b524c14de0e27fa 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -33,7 +33,7 @@
         #define PT_GUEST_ACCESSED_SHIFT PT_ACCESSED_SHIFT
         #define PT_HAVE_ACCESSED_DIRTY(mmu) true
         #ifdef CONFIG_X86_64
-       #define PT_MAX_FULL_LEVELS 4
+       #define PT_MAX_FULL_LEVELS PT64_ROOT_MAX_LEVEL
         #define CMPXCHG cmpxchg
         #else
         #define CMPXCHG cmpxchg64
diff --git a/arch/x86/kvm/mmutrace.h b/arch/x86/kvm/mmutrace.h

index 3c6522b84ff1175a64fef95e119e213c272bc873..ffcd96fc02d0a4892ee4573d9a7a2791f4cf0c69 100644
--- a/arch/x86/kvm/mmutrace.h
+++ b/arch/x86/kvm/mmutrace.h
@@ -339,7 +339,7 @@ TRACE_EVENT(
                 /* These depend on page entry type, so compute them now.  */
                 __field(bool, r)
                 __field(bool, x)
-               __field(u8, u)
+               __field(signed char, u)
         ),
  
         TP_fast_assign(
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c

index a3e32d61d60ceb2d403a425bf92aa6e5d9cae578..24c0b2ba8fb9d34e5d1b22cefe24d6b704cc9ba1 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -57,11 +57,13 @@
  MODULE_AUTHOR("Qumranet");
  MODULE_LICENSE("GPL");
  
+#ifdef MODULE
  static const struct x86_cpu_id svm_cpu_id[] = {
         X86_FEATURE_MATCH(X86_FEATURE_SVM),
         {}
  };
  MODULE_DEVICE_TABLE(x86cpu, svm_cpu_id);
+#endif
  
  #define IOPM_ALLOC_ORDER 2
  #define MSRPM_ALLOC_ORDER 1
@@ -1005,33 +1007,32 @@ static void svm_cpu_uninit(int cpu)
  static int svm_cpu_init(int cpu)
  {
         struct svm_cpu_data *sd;
-       int r;
  
         sd = kzalloc(sizeof(struct svm_cpu_data), GFP_KERNEL);
         if (!sd)
                 return -ENOMEM;
         sd->cpu = cpu;
-       r = -ENOMEM;
         sd->save_area = alloc_page(GFP_KERNEL);
         if (!sd->save_area)
-               goto err_1;
+               goto free_cpu_data;
  
         if (svm_sev_enabled()) {
-               r = -ENOMEM;
                 sd->sev_vmcbs = kmalloc_array(max_sev_asid + 1,
                                               sizeof(void *),
                                               GFP_KERNEL);
                 if (!sd->sev_vmcbs)
-                       goto err_1;
+                       goto free_save_area;
         }
  
         per_cpu(svm_data, cpu) = sd;
  
         return 0;
  
-err_1:
+free_save_area:
+       __free_page(sd->save_area);
+free_cpu_data:
         kfree(sd);
-       return r;
+       return -ENOMEM;
  
  }
  
@@ -1350,6 +1351,24 @@ static __init void svm_adjust_mmio_mask(void)
         kvm_mmu_set_mmio_spte_mask(mask, mask, PT_WRITABLE_MASK | PT_USER_MASK);
  }
  
+static void svm_hardware_teardown(void)
+{
+       int cpu;
+
+       if (svm_sev_enabled()) {
+               bitmap_free(sev_asid_bitmap);
+               bitmap_free(sev_reclaim_asid_bitmap);
+
+               sev_flush_asids();
+       }
+
+       for_each_possible_cpu(cpu)
+               svm_cpu_uninit(cpu);
+
+       __free_pages(pfn_to_page(iopm_base >> PAGE_SHIFT), IOPM_ALLOC_ORDER);
+       iopm_base = 0;
+}
+
  static __init int svm_hardware_setup(void)
  {
         int cpu;
@@ -1463,29 +1482,10 @@ static __init int svm_hardware_setup(void)
         return 0;
  
  err:
-       __free_pages(iopm_pages, IOPM_ALLOC_ORDER);
-       iopm_base = 0;
+       svm_hardware_teardown();
         return r;
  }
  
-static __exit void svm_hardware_unsetup(void)
-{
-       int cpu;
-
-       if (svm_sev_enabled()) {
-               bitmap_free(sev_asid_bitmap);
-               bitmap_free(sev_reclaim_asid_bitmap);
-
-               sev_flush_asids();
-       }
-
-       for_each_possible_cpu(cpu)
-               svm_cpu_uninit(cpu);
-
-       __free_pages(pfn_to_page(iopm_base >> PAGE_SHIFT), IOPM_ALLOC_ORDER);
-       iopm_base = 0;
-}
-
  static void init_seg(struct vmcb_seg *seg)
  {
         seg->selector = 0;
@@ -2175,7 +2175,6 @@ static void svm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
         u32 dummy;
         u32 eax = 1;
  
-       vcpu->arch.microcode_version = 0x01000065;
         svm->spec_ctrl = 0;
         svm->virt_spec_ctrl = 0;
  
@@ -2197,8 +2196,9 @@ static void svm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
  static int avic_init_vcpu(struct vcpu_svm *svm)
  {
         int ret;
+       struct kvm_vcpu *vcpu = &svm->vcpu;
  
-       if (!kvm_vcpu_apicv_active(&svm->vcpu))
+       if (!avic || !irqchip_in_kernel(vcpu->kvm))
                 return 0;
  
         ret = avic_init_backing_page(&svm->vcpu);
@@ -2266,6 +2266,7 @@ static int svm_create_vcpu(struct kvm_vcpu *vcpu)
         init_vmcb(svm);
  
         svm_init_osvw(vcpu);
+       vcpu->arch.microcode_version = 0x01000065;
  
         return 0;
  
@@ -5232,6 +5233,9 @@ static void svm_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
         struct vmcb *vmcb = svm->vmcb;
         bool activated = kvm_vcpu_apicv_active(vcpu);
  
+       if (!avic)
+               return;
+
         if (activated) {
                 /**
                  * During AVIC temporary deactivation, guest could update
@@ -5255,8 +5259,11 @@ static void svm_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
         return;
  }
  
-static void svm_deliver_avic_intr(struct kvm_vcpu *vcpu, int vec)
+static int svm_deliver_avic_intr(struct kvm_vcpu *vcpu, int vec)
  {
+       if (!vcpu->arch.apicv_active)
+               return -1;
+
         kvm_lapic_set_irr(vec, vcpu->arch.apic);
         smp_mb__after_atomic();
  
@@ -5268,6 +5275,8 @@ static void svm_deliver_avic_intr(struct kvm_vcpu *vcpu, int vec)
                 put_cpu();
         } else
                 kvm_vcpu_wake_up(vcpu);
+
+       return 0;
  }
  
  static bool svm_dy_apicv_has_pending_interrupt(struct kvm_vcpu *vcpu)
@@ -7378,7 +7387,7 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
         .cpu_has_kvm_support = has_svm,
         .disabled_by_bios = is_disabled,
         .hardware_setup = svm_hardware_setup,
-       .hardware_unsetup = svm_hardware_unsetup,
+       .hardware_unsetup = svm_hardware_teardown,
         .check_processor_compatibility = svm_check_processor_compat,
         .hardware_enable = svm_hardware_enable,
         .hardware_disable = svm_hardware_disable,
@@ -7433,6 +7442,7 @@ static struct kvm_x86_ops svm_x86_ops __ro_after_init = {
         .run = svm_vcpu_run,
         .handle_exit = handle_exit,
         .skip_emulated_instruction = skip_emulated_instruction,
+       .update_emulated_instruction = NULL,
         .set_interrupt_shadow = svm_set_interrupt_shadow,
         .get_interrupt_shadow = svm_get_interrupt_shadow,
         .patch_hypercall = svm_patch_hypercall,
diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index 283bdb7071af604453e081f30baec3bcdd938dac..f486e260624740e8fd57683b001beb4737593c60 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -12,6 +12,7 @@ extern bool __read_mostly enable_ept;
  extern bool __read_mostly enable_unrestricted_guest;
  extern bool __read_mostly enable_ept_ad_bits;
  extern bool __read_mostly enable_pml;
+extern bool __read_mostly enable_apicv;
  extern int __read_mostly pt_mode;
  
  #define PT_MODE_SYSTEM         0
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 657c2eda357cdb47af83468f327045e49ee69a6c..e920d7834d736ee379722abb445b5847a82a279e 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -544,7 +544,8 @@ static void nested_vmx_disable_intercept_for_msr(unsigned long *msr_bitmap_l1,
         }
  }
  
-static inline void enable_x2apic_msr_intercepts(unsigned long *msr_bitmap) {
+static inline void enable_x2apic_msr_intercepts(unsigned long *msr_bitmap)
+{
         int msr;
  
         for (msr = 0x800; msr <= 0x8ff; msr += BITS_PER_LONG) {
@@ -1981,7 +1982,7 @@ static int nested_vmx_handle_enlightened_vmptrld(struct kvm_vcpu *vcpu,
         }
  
         /*
-        * Clean fields data can't de used on VMLAUNCH and when we switch
+        * Clean fields data can't be used on VMLAUNCH and when we switch
          * between different L2 guests as KVM keeps a single VMCS12 per L1.
          */
         if (from_launch || evmcs_gpa_changed)
@@ -3160,10 +3161,10 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
   * or KVM_SET_NESTED_STATE).  Otherwise it's called from vmlaunch/vmresume.
   *
   * Returns:
- *     NVMX_ENTRY_SUCCESS: Entered VMX non-root mode
- *     NVMX_ENTRY_VMFAIL:  Consistency check VMFail
- *     NVMX_ENTRY_VMEXIT:  Consistency check VMExit
- *     NVMX_ENTRY_KVM_INTERNAL_ERROR: KVM internal error
+ *     NVMX_VMENTRY_SUCCESS: Entered VMX non-root mode
+ *     NVMX_VMENTRY_VMFAIL:  Consistency check VMFail
+ *     NVMX_VMENTRY_VMEXIT:  Consistency check VMExit
+ *     NVMX_VMENTRY_KVM_INTERNAL_ERROR: KVM internal error
   */
  enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
                                                         bool from_vmentry)
@@ -3575,25 +3576,80 @@ static void nested_vmx_inject_exception_vmexit(struct kvm_vcpu *vcpu,
         nested_vmx_vmexit(vcpu, EXIT_REASON_EXCEPTION_NMI, intr_info, exit_qual);
  }
  
+/*
+ * Returns true if a debug trap is pending delivery.
+ *
+ * In KVM, debug traps bear an exception payload. As such, the class of a #DB
+ * exception may be inferred from the presence of an exception payload.
+ */
+static inline bool vmx_pending_dbg_trap(struct kvm_vcpu *vcpu)
+{
+       return vcpu->arch.exception.pending &&
+                       vcpu->arch.exception.nr == DB_VECTOR &&
+                       vcpu->arch.exception.payload;
+}
+
+/*
+ * Certain VM-exits set the 'pending debug exceptions' field to indicate a
+ * recognized #DB (data or single-step) that has yet to be delivered. Since KVM
+ * represents these debug traps with a payload that is said to be compatible
+ * with the 'pending debug exceptions' field, write the payload to the VMCS
+ * field if a VM-exit is delivered before the debug trap.
+ */
+static void nested_vmx_update_pending_dbg(struct kvm_vcpu *vcpu)
+{
+       if (vmx_pending_dbg_trap(vcpu))
+               vmcs_writel(GUEST_PENDING_DBG_EXCEPTIONS,
+                           vcpu->arch.exception.payload);
+}
+
  static int vmx_check_nested_events(struct kvm_vcpu *vcpu, bool external_intr)
  {
         struct vcpu_vmx *vmx = to_vmx(vcpu);
         unsigned long exit_qual;
         bool block_nested_events =
             vmx->nested.nested_run_pending || kvm_event_needs_reinjection(vcpu);
+       bool mtf_pending = vmx->nested.mtf_pending;
         struct kvm_lapic *apic = vcpu->arch.apic;
  
+       /*
+        * Clear the MTF state. If a higher priority VM-exit is delivered first,
+        * this state is discarded.
+        */
+       vmx->nested.mtf_pending = false;
+
         if (lapic_in_kernel(vcpu) &&
                 test_bit(KVM_APIC_INIT, &apic->pending_events)) {
                 if (block_nested_events)
                         return -EBUSY;
+               nested_vmx_update_pending_dbg(vcpu);
                 clear_bit(KVM_APIC_INIT, &apic->pending_events);
                 nested_vmx_vmexit(vcpu, EXIT_REASON_INIT_SIGNAL, 0, 0);
                 return 0;
         }
  
+       /*
+        * Process any exceptions that are not debug traps before MTF.
+        */
+       if (vcpu->arch.exception.pending &&
+           !vmx_pending_dbg_trap(vcpu) &&
+           nested_vmx_check_exception(vcpu, &exit_qual)) {
+               if (block_nested_events)
+                       return -EBUSY;
+               nested_vmx_inject_exception_vmexit(vcpu, exit_qual);
+               return 0;
+       }
+
+       if (mtf_pending) {
+               if (block_nested_events)
+                       return -EBUSY;
+               nested_vmx_update_pending_dbg(vcpu);
+               nested_vmx_vmexit(vcpu, EXIT_REASON_MONITOR_TRAP_FLAG, 0, 0);
+               return 0;
+       }
+
         if (vcpu->arch.exception.pending &&
-               nested_vmx_check_exception(vcpu, &exit_qual)) {
+           nested_vmx_check_exception(vcpu, &exit_qual)) {
                 if (block_nested_events)
                         return -EBUSY;
                 nested_vmx_inject_exception_vmexit(vcpu, exit_qual);
@@ -5256,24 +5312,17 @@ fail:
         return 1;
  }
  
-
-static bool nested_vmx_exit_handled_io(struct kvm_vcpu *vcpu,
-                                      struct vmcs12 *vmcs12)
+/*
+ * Return true if an IO instruction with the specified port and size should cause
+ * a VM-exit into L1.
+ */
+bool nested_vmx_check_io_bitmaps(struct kvm_vcpu *vcpu, unsigned int port,
+                                int size)
  {
-       unsigned long exit_qualification;
+       struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
         gpa_t bitmap, last_bitmap;
-       unsigned int port;
-       int size;
         u8 b;
  
-       if (!nested_cpu_has(vmcs12, CPU_BASED_USE_IO_BITMAPS))
-               return nested_cpu_has(vmcs12, CPU_BASED_UNCOND_IO_EXITING);
-
-       exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
-
-       port = exit_qualification >> 16;
-       size = (exit_qualification & 7) + 1;
-
         last_bitmap = (gpa_t)-1;
         b = -1;
  
@@ -5300,8 +5349,26 @@ static bool nested_vmx_exit_handled_io(struct kvm_vcpu *vcpu,
         return false;
  }
  
+static bool nested_vmx_exit_handled_io(struct kvm_vcpu *vcpu,
+                                      struct vmcs12 *vmcs12)
+{
+       unsigned long exit_qualification;
+       unsigned short port;
+       int size;
+
+       if (!nested_cpu_has(vmcs12, CPU_BASED_USE_IO_BITMAPS))
+               return nested_cpu_has(vmcs12, CPU_BASED_UNCOND_IO_EXITING);
+
+       exit_qualification = vmcs_readl(EXIT_QUALIFICATION);
+
+       port = exit_qualification >> 16;
+       size = (exit_qualification & 7) + 1;
+
+       return nested_vmx_check_io_bitmaps(vcpu, port, size);
+}
+
  /*
- * Return 1 if we should exit from L2 to L1 to handle an MSR access access,
+ * Return 1 if we should exit from L2 to L1 to handle an MSR access,
   * rather than handle it ourselves in L0. I.e., check whether L1 expressed
   * disinterest in the current event (read or write a specific MSR) by using an
   * MSR bitmap. This may be the case even when L0 doesn't use MSR bitmaps.
@@ -5683,6 +5750,9 @@ static int vmx_get_nested_state(struct kvm_vcpu *vcpu,
  
                         if (vmx->nested.nested_run_pending)
                                 kvm_state.flags |= KVM_STATE_NESTED_RUN_PENDING;
+
+                       if (vmx->nested.mtf_pending)
+                               kvm_state.flags |= KVM_STATE_NESTED_MTF_PENDING;
                 }
         }
  
@@ -5863,6 +5933,9 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
         vmx->nested.nested_run_pending =
                 !!(kvm_state->flags & KVM_STATE_NESTED_RUN_PENDING);
  
+       vmx->nested.mtf_pending =
+               !!(kvm_state->flags & KVM_STATE_NESTED_MTF_PENDING);
+
         ret = -EINVAL;
         if (nested_cpu_has_shadow_vmcs(vmcs12) &&
             vmcs12->vmcs_link_pointer != -1ull) {
@@ -5920,8 +5993,7 @@ void nested_vmx_set_vmcs_shadowing_bitmap(void)
   * bit in the high half is on if the corresponding bit in the control field
   * may be on. See also vmx_control_verify().
   */
-void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps,
-                               bool apicv)
+void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
  {
         /*
          * Note that as a general rule, the high half of the MSRs (bits in
@@ -5948,7 +6020,7 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps,
                 PIN_BASED_EXT_INTR_MASK |
                 PIN_BASED_NMI_EXITING |
                 PIN_BASED_VIRTUAL_NMIS |
-               (apicv ? PIN_BASED_POSTED_INTR : 0);
+               (enable_apicv ? PIN_BASED_POSTED_INTR : 0);
         msrs->pinbased_ctls_high |=
                 PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR |
                 PIN_BASED_VMX_PREEMPTION_TIMER;
diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
index fc874d4ead0f07613eaa95880ee623f490e04418..9aeda46f473ee380f6764b59641805e5a343fec1 100644
--- a/arch/x86/kvm/vmx/nested.h
+++ b/arch/x86/kvm/vmx/nested.h
@@ -17,8 +17,7 @@ enum nvmx_vmentry_status {
  };
  
  void vmx_leave_nested(struct kvm_vcpu *vcpu);
-void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps,
-                               bool apicv);
+void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps);
  void nested_vmx_hardware_unsetup(void);
  __init int nested_vmx_hardware_setup(int (*exit_handlers[])(struct kvm_vcpu *));
  void nested_vmx_set_vmcs_shadowing_bitmap(void);
@@ -34,6 +33,8 @@ int vmx_get_vmx_msr(struct nested_vmx_msrs *msrs, u32 msr_index, u64 *pdata);
  int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
                         u32 vmx_instruction_info, bool wr, int len, gva_t *ret);
  void nested_vmx_pmu_entry_exit_ctls_update(struct kvm_vcpu *vcpu);
+bool nested_vmx_check_io_bitmaps(struct kvm_vcpu *vcpu, unsigned int port,
+                                int size);
  
  static inline struct vmcs12 *get_vmcs12(struct kvm_vcpu *vcpu)
  {
@@ -175,6 +176,11 @@ static inline bool nested_cpu_has_virtual_nmis(struct vmcs12 *vmcs12)
         return vmcs12->pin_based_vm_exec_control & PIN_BASED_VIRTUAL_NMIS;
  }
  
+static inline int nested_cpu_has_mtf(struct vmcs12 *vmcs12)
+{
+       return nested_cpu_has(vmcs12, CPU_BASED_MONITOR_TRAP_FLAG);
+}
+
  static inline int nested_cpu_has_ept(struct vmcs12 *vmcs12)
  {
         return nested_cpu_has2(vmcs12, SECONDARY_EXEC_ENABLE_EPT);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 9a6664886f2eff53fa478088dc65d488f7806667..40b1e6138cd5ce30e62790530131d3444cdd2618 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -64,11 +64,13 @@
  MODULE_AUTHOR("Qumranet");
  MODULE_LICENSE("GPL");
  
+#ifdef MODULE
  static const struct x86_cpu_id vmx_cpu_id[] = {
         X86_FEATURE_MATCH(X86_FEATURE_VMX),
         {}
  };
  MODULE_DEVICE_TABLE(x86cpu, vmx_cpu_id);
+#endif
  
  bool __read_mostly enable_vpid = 1;
  module_param_named(vpid, enable_vpid, bool, 0444);
@@ -95,7 +97,7 @@ module_param(emulate_invalid_guest_state, bool, S_IRUGO);
  static bool __read_mostly fasteoi = 1;
  module_param(fasteoi, bool, S_IRUGO);
  
-static bool __read_mostly enable_apicv = 1;
+bool __read_mostly enable_apicv = 1;
  module_param(enable_apicv, bool, S_IRUGO);
  
  /*
@@ -1175,6 +1177,10 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
                                            vmx->guest_msrs[i].mask);
  
         }
+
+       if (vmx->nested.need_vmcs12_to_shadow_sync)
+               nested_sync_vmcs12_to_shadow(vcpu);
+
         if (vmx->guest_state_loaded)
                 return;
  
@@ -1599,6 +1605,40 @@ static int skip_emulated_instruction(struct kvm_vcpu *vcpu)
         return 1;
  }
  
+
+/*
+ * Recognizes a pending MTF VM-exit and records the nested state for later
+ * delivery.
+ */
+static void vmx_update_emulated_instruction(struct kvm_vcpu *vcpu)
+{
+       struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
+       struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+       if (!is_guest_mode(vcpu))
+               return;
+
+       /*
+        * Per the SDM, MTF takes priority over debug-trap exceptions besides
+        * T-bit traps. As instruction emulation is completed (i.e. at the
+        * instruction boundary), any #DB exception pending delivery must be a
+        * debug-trap. Record the pending MTF state to be delivered in
+        * vmx_check_nested_events().
+        */
+       if (nested_cpu_has_mtf(vmcs12) &&
+           (!vcpu->arch.exception.pending ||
+            vcpu->arch.exception.nr == DB_VECTOR))
+               vmx->nested.mtf_pending = true;
+       else
+               vmx->nested.mtf_pending = false;
+}
+
+static int vmx_skip_emulated_instruction(struct kvm_vcpu *vcpu)
+{
+       vmx_update_emulated_instruction(vcpu);
+       return skip_emulated_instruction(vcpu);
+}
+
  static void vmx_clear_hlt(struct kvm_vcpu *vcpu)
  {
         /*
@@ -2947,6 +2987,9 @@ void vmx_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
  
  static int get_ept_level(struct kvm_vcpu *vcpu)
  {
+       /* Nested EPT currently only supports 4-level walks. */
+       if (is_guest_mode(vcpu) && nested_cpu_has_ept(get_vmcs12(vcpu)))
+               return 4;
         if (cpu_has_vmx_ept_5levels() && (cpuid_maxphyaddr(vcpu) > 48))
                 return 5;
         return 4;
@@ -3815,24 +3858,29 @@ static int vmx_deliver_nested_posted_interrupt(struct kvm_vcpu *vcpu,
   * 2. If target vcpu isn't running(root mode), kick it to pick up the
   * interrupt from PIR in next vmentry.
   */
-static void vmx_deliver_posted_interrupt(struct kvm_vcpu *vcpu, int vector)
+static int vmx_deliver_posted_interrupt(struct kvm_vcpu *vcpu, int vector)
  {
         struct vcpu_vmx *vmx = to_vmx(vcpu);
         int r;
  
         r = vmx_deliver_nested_posted_interrupt(vcpu, vector);
         if (!r)
-               return;
+               return 0;
+
+       if (!vcpu->arch.apicv_active)
+               return -1;
  
         if (pi_test_and_set_pir(vector, &vmx->pi_desc))
-               return;
+               return 0;
  
         /* If a previous notification has sent the IPI, nothing to do.  */
         if (pi_test_and_set_on(&vmx->pi_desc))
-               return;
+               return 0;
  
         if (!kvm_vcpu_trigger_posted_interrupt(vcpu, false))
                 kvm_vcpu_kick(vcpu);
+
+       return 0;
  }
  
  /*
@@ -4238,7 +4286,6 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
  
         vmx->msr_ia32_umwait_control = 0;
  
-       vcpu->arch.microcode_version = 0x100000000ULL;
         vmx->vcpu.arch.regs[VCPU_REGS_RDX] = get_rdx_init_val();
         vmx->hv_deadline_tsc = -1;
         kvm_set_cr8(vcpu, 0);
@@ -6480,8 +6527,11 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
                 vmcs_write32(PLE_WINDOW, vmx->ple_window);
         }
  
-       if (vmx->nested.need_vmcs12_to_shadow_sync)
-               nested_sync_vmcs12_to_shadow(vcpu);
+       /*
+        * We did this in prepare_switch_to_guest, because it needs to
+        * be within srcu_read_lock.
+        */
+       WARN_ON_ONCE(vmx->nested.need_vmcs12_to_shadow_sync);
  
         if (kvm_register_is_dirty(vcpu, VCPU_REGS_RSP))
                 vmcs_writel(GUEST_RSP, vcpu->arch.regs[VCPU_REGS_RSP]);
@@ -6755,14 +6805,14 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
  
         if (nested)
                 nested_vmx_setup_ctls_msrs(&vmx->nested.msrs,
-                                          vmx_capability.ept,
-                                          kvm_vcpu_apicv_active(vcpu));
+                                          vmx_capability.ept);
         else
                 memset(&vmx->nested.msrs, 0, sizeof(vmx->nested.msrs));
  
         vmx->nested.posted_intr_nv = -1;
         vmx->nested.current_vmptr = -1ull;
  
+       vcpu->arch.microcode_version = 0x100000000ULL;
         vmx->msr_ia32_feature_control_valid_bits = FEAT_CTL_LOCKED;
  
         /*
@@ -6836,8 +6886,7 @@ static int __init vmx_check_processor_compat(void)
         if (setup_vmcs_config(&vmcs_conf, &vmx_cap) < 0)
                 return -EIO;
         if (nested)
-               nested_vmx_setup_ctls_msrs(&vmcs_conf.nested, vmx_cap.ept,
-                                          enable_apicv);
+               nested_vmx_setup_ctls_msrs(&vmcs_conf.nested, vmx_cap.ept);
         if (memcmp(&vmcs_config, &vmcs_conf, sizeof(struct vmcs_config)) != 0) {
                 printk(KERN_ERR "kvm: CPU %d feature inconsistency!\n",
                                 smp_processor_id());
@@ -7098,6 +7147,40 @@ static void vmx_request_immediate_exit(struct kvm_vcpu *vcpu)
         to_vmx(vcpu)->req_immediate_exit = true;
  }
  
+static int vmx_check_intercept_io(struct kvm_vcpu *vcpu,
+                                 struct x86_instruction_info *info)
+{
+       struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
+       unsigned short port;
+       bool intercept;
+       int size;
+
+       if (info->intercept == x86_intercept_in ||
+           info->intercept == x86_intercept_ins) {
+               port = info->src_val;
+               size = info->dst_bytes;
+       } else {
+               port = info->dst_val;
+               size = info->src_bytes;
+       }
+
+       /*
+        * If the 'use IO bitmaps' VM-execution control is 0, IO instruction
+        * VM-exits depend on the 'unconditional IO exiting' VM-execution
+        * control.
+        *
+        * Otherwise, IO instruction VM-exits are controlled by the IO bitmaps.
+        */
+       if (!nested_cpu_has(vmcs12, CPU_BASED_USE_IO_BITMAPS))
+               intercept = nested_cpu_has(vmcs12,
+                                          CPU_BASED_UNCOND_IO_EXITING);
+       else
+               intercept = nested_vmx_check_io_bitmaps(vcpu, port, size);
+
+       /* FIXME: produce nested vmexit and return X86EMUL_INTERCEPTED.  */
+       return intercept ? X86EMUL_UNHANDLEABLE : X86EMUL_CONTINUE;
+}
+
  static int vmx_check_intercept(struct kvm_vcpu *vcpu,
                                struct x86_instruction_info *info,
                                enum x86_intercept_stage stage)
@@ -7105,19 +7188,45 @@ static int vmx_check_intercept(struct kvm_vcpu *vcpu,
         struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
         struct x86_emulate_ctxt *ctxt = &vcpu->arch.emulate_ctxt;
  
+       switch (info->intercept) {
         /*
          * RDPID causes #UD if disabled through secondary execution controls.
          * Because it is marked as EmulateOnUD, we need to intercept it here.
          */
-       if (info->intercept == x86_intercept_rdtscp &&
-           !nested_cpu_has2(vmcs12, SECONDARY_EXEC_RDTSCP)) {
-               ctxt->exception.vector = UD_VECTOR;
-               ctxt->exception.error_code_valid = false;
-               return X86EMUL_PROPAGATE_FAULT;
-       }
+       case x86_intercept_rdtscp:
+               if (!nested_cpu_has2(vmcs12, SECONDARY_EXEC_RDTSCP)) {
+                       ctxt->exception.vector = UD_VECTOR;
+                       ctxt->exception.error_code_valid = false;
+                       return X86EMUL_PROPAGATE_FAULT;
+               }
+               break;
+
+       case x86_intercept_in:
+       case x86_intercept_ins:
+       case x86_intercept_out:
+       case x86_intercept_outs:
+               return vmx_check_intercept_io(vcpu, info);
+
+       case x86_intercept_lgdt:
+       case x86_intercept_lidt:
+       case x86_intercept_lldt:
+       case x86_intercept_ltr:
+       case x86_intercept_sgdt:
+       case x86_intercept_sidt:
+       case x86_intercept_sldt:
+       case x86_intercept_str:
+               if (!nested_cpu_has2(vmcs12, SECONDARY_EXEC_DESC))
+                       return X86EMUL_CONTINUE;
+
+               /* FIXME: produce nested vmexit and return X86EMUL_INTERCEPTED.  */
+               break;
  
         /* TODO: check more intercepts... */
-       return X86EMUL_CONTINUE;
+       default:
+               break;
+       }
+
+       return X86EMUL_UNHANDLEABLE;
  }
  
  #ifdef CONFIG_X86_64
@@ -7699,7 +7808,7 @@ static __init int hardware_setup(void)
  
         if (nested) {
                 nested_vmx_setup_ctls_msrs(&vmcs_config.nested,
-                                          vmx_capability.ept, enable_apicv);
+                                          vmx_capability.ept);
  
                 r = nested_vmx_hardware_setup(kvm_vmx_exit_handlers);
                 if (r)
@@ -7783,7 +7892,8 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {
  
         .run = vmx_vcpu_run,
         .handle_exit = vmx_handle_exit,
-       .skip_emulated_instruction = skip_emulated_instruction,
+       .skip_emulated_instruction = vmx_skip_emulated_instruction,
+       .update_emulated_instruction = vmx_update_emulated_instruction,
         .set_interrupt_shadow = vmx_set_interrupt_shadow,
         .get_interrupt_shadow = vmx_get_interrupt_shadow,
         .patch_hypercall = vmx_patch_hypercall,
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 7f42cf3dcd7002bd41c3702b457bfc507c800978..e64da06c70092362ed29964fcc12c7bab6a2a228 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -150,6 +150,9 @@ struct nested_vmx {
         /* L2 must run next, and mustn't decide to exit to L1. */
         bool nested_run_pending;
  
+       /* Pending MTF VM-exit into L1.  */
+       bool mtf_pending;
+
         struct loaded_vmcs vmcs02;
  
         /*
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index fbabb2f06273b831ad15a97e4be9289ac3c0e15d..5de200663f51476b025f801ee6b3dbf64e8d3228 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -438,6 +438,14 @@ void kvm_deliver_exception_payload(struct kvm_vcpu *vcpu)
                  * for #DB exceptions under VMX.
                  */
                 vcpu->arch.dr6 ^= payload & DR6_RTM;
+
+               /*
+                * The #DB payload is defined as compatible with the 'pending
+                * debug exceptions' field under VMX, not DR6. While bit 12 is
+                * defined in the 'pending debug exceptions' field (enabled
+                * breakpoint), it is reserved and must be zero in DR6.
+                */
+               vcpu->arch.dr6 &= ~BIT(12);
                 break;
         case PF_VECTOR:
                 vcpu->arch.cr2 = payload;
@@ -490,19 +498,7 @@ static void kvm_multiple_exception(struct kvm_vcpu *vcpu,
                 vcpu->arch.exception.error_code = error_code;
                 vcpu->arch.exception.has_payload = has_payload;
                 vcpu->arch.exception.payload = payload;
-               /*
-                * In guest mode, payload delivery should be deferred,
-                * so that the L1 hypervisor can intercept #PF before
-                * CR2 is modified (or intercept #DB before DR6 is
-                * modified under nVMX).  However, for ABI
-                * compatibility with KVM_GET_VCPU_EVENTS and
-                * KVM_SET_VCPU_EVENTS, we can't delay payload
-                * delivery unless userspace has enabled this
-                * functionality via the per-VM capability,
-                * KVM_CAP_EXCEPTION_PAYLOAD.
-                */
-               if (!vcpu->kvm->arch.exception_payload_enabled ||
-                   !is_guest_mode(vcpu))
+               if (!is_guest_mode(vcpu))
                         kvm_deliver_exception_payload(vcpu);
                 return;
         }
@@ -2448,7 +2444,7 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
         vcpu->hv_clock.tsc_timestamp = tsc_timestamp;
         vcpu->hv_clock.system_time = kernel_ns + v->kvm->arch.kvmclock_offset;
         vcpu->last_guest_tsc = tsc_timestamp;
-       WARN_ON(vcpu->hv_clock.system_time < 0);
+       WARN_ON((s64)vcpu->hv_clock.system_time < 0);
  
         /* If the host uses TSC clocksource, then it is stable */
         pvclock_flags = 0;
@@ -3795,6 +3791,21 @@ static void kvm_vcpu_ioctl_x86_get_vcpu_events(struct kvm_vcpu *vcpu,
  {
         process_nmi(vcpu);
  
+       /*
+        * In guest mode, payload delivery should be deferred,
+        * so that the L1 hypervisor can intercept #PF before
+        * CR2 is modified (or intercept #DB before DR6 is
+        * modified under nVMX). Unless the per-VM capability,
+        * KVM_CAP_EXCEPTION_PAYLOAD, is set, we may not defer the delivery of
+        * an exception payload and handle after a KVM_GET_VCPU_EVENTS. Since we
+        * opportunistically defer the exception payload, deliver it if the
+        * capability hasn't been requested before processing a
+        * KVM_GET_VCPU_EVENTS.
+        */
+       if (!vcpu->kvm->arch.exception_payload_enabled &&
+           vcpu->arch.exception.pending && vcpu->arch.exception.has_payload)
+               kvm_deliver_exception_payload(vcpu);
+
         /*
          * The API doesn't provide the instruction length for software
          * exceptions, so don't report them. As long as the guest RIP
@@ -6880,6 +6891,8 @@ restart:
                         kvm_rip_write(vcpu, ctxt->eip);
                         if (r && ctxt->tf)
                                 r = kvm_vcpu_do_singlestep(vcpu);
+                       if (kvm_x86_ops->update_emulated_instruction)
+                               kvm_x86_ops->update_emulated_instruction(vcpu);
                         __kvm_set_rflags(vcpu, ctxt->eflags);
                 }
  
@@ -7177,15 +7190,15 @@ static void kvm_timer_init(void)
  
         if (!boot_cpu_has(X86_FEATURE_CONSTANT_TSC)) {
  #ifdef CONFIG_CPU_FREQ
-               struct cpufreq_policy policy;
+               struct cpufreq_policy *policy;
                 int cpu;
  
-               memset(&policy, 0, sizeof(policy));
                 cpu = get_cpu();
-               cpufreq_get_policy(&policy, cpu);
-               if (policy.cpuinfo.max_freq)
-                       max_tsc_khz = policy.cpuinfo.max_freq;
+               policy = cpufreq_cpu_get(cpu);
+               if (policy && policy->cpuinfo.max_freq)
+                       max_tsc_khz = policy->cpuinfo.max_freq;
                 put_cpu();
+               cpufreq_cpu_put(policy);
  #endif
                 cpufreq_register_notifier(&kvmclock_cpufreq_notifier_block,
                                           CPUFREQ_TRANSITION_NOTIFIER);
@@ -7295,12 +7308,12 @@ int kvm_arch_init(void *opaque)
         }
  
         if (!ops->cpu_has_kvm_support()) {
-               printk(KERN_ERR "kvm: no hardware support\n");
+               pr_err_ratelimited("kvm: no hardware support\n");
                 r = -EOPNOTSUPP;
                 goto out;
         }
         if (ops->disabled_by_bios()) {
-               printk(KERN_ERR "kvm: disabled by bios\n");
+               pr_err_ratelimited("kvm: disabled by bios\n");
                 r = -EOPNOTSUPP;
                 goto out;
         }
@@ -8942,7 +8955,6 @@ int kvm_task_switch(struct kvm_vcpu *vcpu, u16 tss_selector, int idt_index,
  
         kvm_rip_write(vcpu, ctxt->eip);
         kvm_set_rflags(vcpu, ctxt->eflags);
-       kvm_make_request(KVM_REQ_EVENT, vcpu);
         return 1;
  }
  EXPORT_SYMBOL_GPL(kvm_task_switch);
@@ -10182,7 +10194,7 @@ void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu, struct kvm_async_pf *work)
               work->arch.cr3 != vcpu->arch.mmu->get_cr3(vcpu))
                 return;
  
-       vcpu->arch.mmu->page_fault(vcpu, work->cr2_or_gpa, 0, true);
+       kvm_mmu_do_page_fault(vcpu, work->cr2_or_gpa, 0, true);
  }
  
  static inline u32 kvm_async_pf_hash_fn(gfn_t gfn)
diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c
index 64229dad7eab6ce868957491431b62b51a2bee95..69309cd56fdf3fb28c8b7732297471ba69f46686 100644
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -363,13 +363,8 @@ static void ptdump_walk_pgd_level_core(struct seq_file *m,
  {
         const struct ptdump_range ptdump_ranges[] = {
  #ifdef CONFIG_X86_64
-
-#define normalize_addr_shift (64 - (__VIRTUAL_MASK_SHIFT + 1))
-#define normalize_addr(u) ((signed long)((u) << normalize_addr_shift) >> \
-                          normalize_addr_shift)
-
         {0, PTRS_PER_PGD * PGD_LEVEL_MULT / 2},
-       {normalize_addr(PTRS_PER_PGD * PGD_LEVEL_MULT / 2), ~0UL},
+       {GUARD_HOLE_END_ADDR, ~0UL},
  #else
         {0, ~0UL},
  #endif
diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index fa8506e76bbeba3eb7b2efc75ff96767d41f6b6b..d19a2edd63cb224e33408d094d0df58f6ed3508c 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -180,7 +180,7 @@ void efi_sync_low_kernel_mappings(void)
  static inline phys_addr_t
  virt_to_phys_or_null_size(void *va, unsigned long size)
  {
-       bool bad_size;
+       phys_addr_t pa;
  
         if (!va)
                 return 0;
@@ -188,16 +188,13 @@ virt_to_phys_or_null_size(void *va, unsigned long size)
         if (virt_addr_valid(va))
                 return virt_to_phys(va);
  
-       /*
-        * A fully aligned variable on the stack is guaranteed not to
-        * cross a page bounary. Try to catch strings on the stack by
-        * checking that 'size' is a power of two.
-        */
-       bad_size = size > PAGE_SIZE || !is_power_of_2(size);
+       pa = slow_virt_to_phys(va);
  
-       WARN_ON(!IS_ALIGNED((unsigned long)va, size) || bad_size);
+       /* check if the object crosses a page boundary */
+       if (WARN_ON((pa ^ (pa + size - 1)) & PAGE_MASK))
+               return 0;
  
-       return slow_virt_to_phys(va);
+       return pa;
  }
  
  #define virt_to_phys_or_null(addr)                             \
@@ -568,85 +565,25 @@ efi_thunk_set_virtual_address_map(unsigned long memory_map_size,
  
  static efi_status_t efi_thunk_get_time(efi_time_t *tm, efi_time_cap_t *tc)
  {
-       efi_status_t status;
-       u32 phys_tm, phys_tc;
-       unsigned long flags;
-
-       spin_lock(&rtc_lock);
-       spin_lock_irqsave(&efi_runtime_lock, flags);
-
-       phys_tm = virt_to_phys_or_null(tm);
-       phys_tc = virt_to_phys_or_null(tc);
-
-       status = efi_thunk(get_time, phys_tm, phys_tc);
-
-       spin_unlock_irqrestore(&efi_runtime_lock, flags);
-       spin_unlock(&rtc_lock);
-
-       return status;
+       return EFI_UNSUPPORTED;
  }
  
  static efi_status_t efi_thunk_set_time(efi_time_t *tm)
  {
-       efi_status_t status;
-       u32 phys_tm;
-       unsigned long flags;
-
-       spin_lock(&rtc_lock);
-       spin_lock_irqsave(&efi_runtime_lock, flags);
-
-       phys_tm = virt_to_phys_or_null(tm);
-
-       status = efi_thunk(set_time, phys_tm);
-
-       spin_unlock_irqrestore(&efi_runtime_lock, flags);
-       spin_unlock(&rtc_lock);
-
-       return status;
+       return EFI_UNSUPPORTED;
  }
  
  static efi_status_t
  efi_thunk_get_wakeup_time(efi_bool_t *enabled, efi_bool_t *pending,
                           efi_time_t *tm)
  {
-       efi_status_t status;
-       u32 phys_enabled, phys_pending, phys_tm;
-       unsigned long flags;
-
-       spin_lock(&rtc_lock);
-       spin_lock_irqsave(&efi_runtime_lock, flags);
-
-       phys_enabled = virt_to_phys_or_null(enabled);
-       phys_pending = virt_to_phys_or_null(pending);
-       phys_tm = virt_to_phys_or_null(tm);
-
-       status = efi_thunk(get_wakeup_time, phys_enabled,
-                            phys_pending, phys_tm);
-
-       spin_unlock_irqrestore(&efi_runtime_lock, flags);
-       spin_unlock(&rtc_lock);
-
-       return status;
+       return EFI_UNSUPPORTED;
  }
  
  static efi_status_t
  efi_thunk_set_wakeup_time(efi_bool_t enabled, efi_time_t *tm)
  {
-       efi_status_t status;
-       u32 phys_tm;
-       unsigned long flags;
-
-       spin_lock(&rtc_lock);
-       spin_lock_irqsave(&efi_runtime_lock, flags);
-
-       phys_tm = virt_to_phys_or_null(tm);
-
-       status = efi_thunk(set_wakeup_time, enabled, phys_tm);
-
-       spin_unlock_irqrestore(&efi_runtime_lock, flags);
-       spin_unlock(&rtc_lock);
-
-       return status;
+       return EFI_UNSUPPORTED;
  }
  
  static unsigned long efi_name_size(efi_char16_t *name)
@@ -658,6 +595,8 @@ static efi_status_t
  efi_thunk_get_variable(efi_char16_t *name, efi_guid_t *vendor,
                        u32 *attr, unsigned long *data_size, void *data)
  {
+       u8 buf[24] __aligned(8);
+       efi_guid_t *vnd = PTR_ALIGN((efi_guid_t *)buf, sizeof(*vnd));
         efi_status_t status;
         u32 phys_name, phys_vendor, phys_attr;
         u32 phys_data_size, phys_data;
@@ -665,14 +604,19 @@ efi_thunk_get_variable(efi_char16_t *name, efi_guid_t *vendor,
  
         spin_lock_irqsave(&efi_runtime_lock, flags);
  
+       *vnd = *vendor;
+
         phys_data_size = virt_to_phys_or_null(data_size);
-       phys_vendor = virt_to_phys_or_null(vendor);
+       phys_vendor = virt_to_phys_or_null(vnd);
         phys_name = virt_to_phys_or_null_size(name, efi_name_size(name));
         phys_attr = virt_to_phys_or_null(attr);
         phys_data = virt_to_phys_or_null_size(data, *data_size);
  
-       status = efi_thunk(get_variable, phys_name, phys_vendor,
-                          phys_attr, phys_data_size, phys_data);
+       if (!phys_name || (data && !phys_data))
+               status = EFI_INVALID_PARAMETER;
+       else
+               status = efi_thunk(get_variable, phys_name, phys_vendor,
+                                  phys_attr, phys_data_size, phys_data);
  
         spin_unlock_irqrestore(&efi_runtime_lock, flags);
  
@@ -683,19 +627,25 @@ static efi_status_t
  efi_thunk_set_variable(efi_char16_t *name, efi_guid_t *vendor,
                        u32 attr, unsigned long data_size, void *data)
  {
+       u8 buf[24] __aligned(8);
+       efi_guid_t *vnd = PTR_ALIGN((efi_guid_t *)buf, sizeof(*vnd));
         u32 phys_name, phys_vendor, phys_data;
         efi_status_t status;
         unsigned long flags;
  
         spin_lock_irqsave(&efi_runtime_lock, flags);
  
+       *vnd = *vendor;
+
         phys_name = virt_to_phys_or_null_size(name, efi_name_size(name));
-       phys_vendor = virt_to_phys_or_null(vendor);
+       phys_vendor = virt_to_phys_or_null(vnd);
         phys_data = virt_to_phys_or_null_size(data, data_size);
  
-       /* If data_size is > sizeof(u32) we've got problems */
-       status = efi_thunk(set_variable, phys_name, phys_vendor,
-                          attr, data_size, phys_data);
+       if (!phys_name || !phys_data)
+               status = EFI_INVALID_PARAMETER;
+       else
+               status = efi_thunk(set_variable, phys_name, phys_vendor,
+                                  attr, data_size, phys_data);
  
         spin_unlock_irqrestore(&efi_runtime_lock, flags);
  
@@ -707,6 +657,8 @@ efi_thunk_set_variable_nonblocking(efi_char16_t *name, efi_guid_t *vendor,
                                    u32 attr, unsigned long data_size,
                                    void *data)
  {
+       u8 buf[24] __aligned(8);
+       efi_guid_t *vnd = PTR_ALIGN((efi_guid_t *)buf, sizeof(*vnd));
         u32 phys_name, phys_vendor, phys_data;
         efi_status_t status;
         unsigned long flags;
@@ -714,13 +666,17 @@ efi_thunk_set_variable_nonblocking(efi_char16_t *name, efi_guid_t *vendor,
         if (!spin_trylock_irqsave(&efi_runtime_lock, flags))
                 return EFI_NOT_READY;
  
+       *vnd = *vendor;
+
         phys_name = virt_to_phys_or_null_size(name, efi_name_size(name));
-       phys_vendor = virt_to_phys_or_null(vendor);
+       phys_vendor = virt_to_phys_or_null(vnd);
         phys_data = virt_to_phys_or_null_size(data, data_size);
  
-       /* If data_size is > sizeof(u32) we've got problems */
-       status = efi_thunk(set_variable, phys_name, phys_vendor,
-                          attr, data_size, phys_data);
+       if (!phys_name || !phys_data)
+               status = EFI_INVALID_PARAMETER;
+       else
+               status = efi_thunk(set_variable, phys_name, phys_vendor,
+                                  attr, data_size, phys_data);
  
         spin_unlock_irqrestore(&efi_runtime_lock, flags);
  
@@ -732,39 +688,36 @@ efi_thunk_get_next_variable(unsigned long *name_size,
                             efi_char16_t *name,
                             efi_guid_t *vendor)
  {
+       u8 buf[24] __aligned(8);
+       efi_guid_t *vnd = PTR_ALIGN((efi_guid_t *)buf, sizeof(*vnd));
         efi_status_t status;
         u32 phys_name_size, phys_name, phys_vendor;
         unsigned long flags;
  
         spin_lock_irqsave(&efi_runtime_lock, flags);
  
+       *vnd = *vendor;
+
         phys_name_size = virt_to_phys_or_null(name_size);
-       phys_vendor = virt_to_phys_or_null(vendor);
+       phys_vendor = virt_to_phys_or_null(vnd);
         phys_name = virt_to_phys_or_null_size(name, *name_size);
  
-       status = efi_thunk(get_next_variable, phys_name_size,
-                          phys_name, phys_vendor);
+       if (!phys_name)
+               status = EFI_INVALID_PARAMETER;
+       else
+               status = efi_thunk(get_next_variable, phys_name_size,
+                                  phys_name, phys_vendor);
  
         spin_unlock_irqrestore(&efi_runtime_lock, flags);
  
+       *vendor = *vnd;
         return status;
  }
  
  static efi_status_t
  efi_thunk_get_next_high_mono_count(u32 *count)
  {
-       efi_status_t status;
-       u32 phys_count;
-       unsigned long flags;
-
-       spin_lock_irqsave(&efi_runtime_lock, flags);
-
-       phys_count = virt_to_phys_or_null(count);
-       status = efi_thunk(get_next_high_mono_count, phys_count);
-
-       spin_unlock_irqrestore(&efi_runtime_lock, flags);
-
-       return status;
+       return EFI_UNSUPPORTED;
  }
  
  static void
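Note on the virt_to_phys_or_null_size() hunk above: the new check compares the page frame of the first and last byte of the object instead of requiring a power-of-two size. The following is a minimal user-space sketch of that test, not kernel code. The demo_* names and the 4 KiB page size are assumptions made for illustration; the kernel uses PAGE_MASK from its own headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed 4 KiB pages, for illustration only. */
#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE  (UINT64_C(1) << DEMO_PAGE_SHIFT)
#define DEMO_PAGE_MASK  (~(DEMO_PAGE_SIZE - 1))

/*
 * Hypothetical helper: returns true when the byte range
 * [pa, pa + size - 1] touches more than one page.
 *
 * XOR-ing the first and last byte addresses leaves a bit set above
 * the page offset only if the two addresses lie in different page
 * frames, which is the same test the hunk performs with PAGE_MASK.
 *
 * Precondition: size >= 1.
 */
static bool demo_crosses_page(uint64_t pa, uint64_t size)
{
	return ((pa ^ (pa + size - 1)) & DEMO_PAGE_MASK) != 0;
}
```

With these assumptions, a 16-byte object at 0x1ff8 is reported as crossing a page, while the same object at 0x1ff0 is not. A 24-byte object is handled the same way, a size the removed power-of-two check would have warned about.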
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index 1f756ffffe8b3159dade936c5eecda8d847900b8..507f4fb88fa7fd184d1bba278a9ebaa8c8039f37 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -72,6 +72,9 @@
  #include <asm/mwait.h>
  #include <asm/pci_x86.h>
  #include <asm/cpu.h>
+#ifdef CONFIG_X86_IOPL_IOPERM
+#include <asm/io_bitmap.h>
+#endif
  
  #ifdef CONFIG_ACPI
  #include <linux/acpi.h>
@@ -837,6 +840,25 @@ static void xen_load_sp0(unsigned long sp0)
         this_cpu_write(cpu_tss_rw.x86_tss.sp0, sp0);
  }
  
+#ifdef CONFIG_X86_IOPL_IOPERM
+static void xen_update_io_bitmap(void)
+{
+       struct physdev_set_iobitmap iobitmap;
+       struct tss_struct *tss = this_cpu_ptr(&cpu_tss_rw);
+
+       native_tss_update_io_bitmap();
+
+       iobitmap.bitmap = (uint8_t *)(&tss->x86_tss) +
+                         tss->x86_tss.io_bitmap_base;
+       if (tss->x86_tss.io_bitmap_base == IO_BITMAP_OFFSET_INVALID)
+               iobitmap.nr_ports = 0;
+       else
+               iobitmap.nr_ports = IO_BITMAP_BITS;
+
+       HYPERVISOR_physdev_op(PHYSDEVOP_set_iobitmap, &iobitmap);
+}
+#endif
+
  static void xen_io_delay(void)
  {
  }
@@ -896,14 +918,15 @@ static u64 xen_read_msr_safe(unsigned int msr, int *err)
  static int xen_write_msr_safe(unsigned int msr, unsigned low, unsigned high)
  {
         int ret;
+#ifdef CONFIG_X86_64
+       unsigned int which;
+       u64 base;
+#endif
  
         ret = 0;
  
         switch (msr) {
  #ifdef CONFIG_X86_64
-               unsigned which;
-               u64 base;
-
         case MSR_FS_BASE:               which = SEGBASE_FS; goto set;
         case MSR_KERNEL_GS_BASE:        which = SEGBASE_GS_USER; goto set;
         case MSR_GS_BASE:               which = SEGBASE_GS_KERNEL; goto set;
@@ -1046,6 +1069,9 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
         .write_idt_entry = xen_write_idt_entry,
         .load_sp0 = xen_load_sp0,
  
+#ifdef CONFIG_X86_IOPL_IOPERM
+       .update_io_bitmap = xen_update_io_bitmap,
+#endif
         .io_delay = xen_io_delay,
  
         /* Xen takes care of %gs when switching to usermode for us */
diff --git a/block/blk-flush.c b/block/blk-flush.c
index 3f977c517960e61dca951551d437b52abcacd588..5cc775bdb06acbfac9dbfa79551d97f28c5a97cb 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -412,7 +412,7 @@ void blk_insert_flush(struct request *rq)
          */
         if ((policy & REQ_FSEQ_DATA) &&
             !(policy & (REQ_FSEQ_PREFLUSH | REQ_FSEQ_POSTFLUSH))) {
-               blk_mq_request_bypass_insert(rq, false);
+               blk_mq_request_bypass_insert(rq, false, false);
                 return;
         }
  
diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index ca22afd47b3dcce1ea72da7f4a6218c4ac9d85b5..856356b1619e83f05fc86fc37ed4a9086b413d86 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -361,13 +361,19 @@ static bool blk_mq_sched_bypass_insert(struct blk_mq_hw_ctx *hctx,
                                        bool has_sched,
                                        struct request *rq)
  {
-       /* dispatch flush rq directly */
-       if (rq->rq_flags & RQF_FLUSH_SEQ) {
-               spin_lock(&hctx->lock);
-               list_add(&rq->queuelist, &hctx->dispatch);
-               spin_unlock(&hctx->lock);
+       /*
+        * dispatch flush and passthrough rq directly
+        *
+        * passthrough request has to be added to hctx->dispatch directly.
+        * For some reason, device may be in one situation which can't
+        * handle FS request, so STS_RESOURCE is always returned and the
+        * FS request will be added to hctx->dispatch. However passthrough
+        * request may be required at that time for fixing the problem. If
+        * passthrough request is added to scheduler queue, there isn't any
+        * chance to dispatch it given we prioritize requests in hctx->dispatch.
+        */
+       if ((rq->rq_flags & RQF_FLUSH_SEQ) || blk_rq_is_passthrough(rq))
                 return true;
-       }
  
         if (has_sched)
                 rq->rq_flags |= RQF_SORTED;
@@ -391,8 +397,10 @@ void blk_mq_sched_insert_request(struct request *rq, bool at_head,
  
         WARN_ON(e && (rq->tag != -1));
  
-       if (blk_mq_sched_bypass_insert(hctx, !!e, rq))
+       if (blk_mq_sched_bypass_insert(hctx, !!e, rq)) {
+               blk_mq_request_bypass_insert(rq, at_head, false);
                 goto run;
+       }
  
         if (e && e->type->ops.insert_requests) {
                 LIST_HEAD(list);
diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index fbacde454718583555b38f07d0fe88ad89b712c3..586c9d6e904ab8aade50bea050a5a973bb2a273e 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -183,8 +183,8 @@ found_tag:
         return tag + tag_offset;
  }
  
-void blk_mq_put_tag(struct blk_mq_hw_ctx *hctx, struct blk_mq_tags *tags,
-                   struct blk_mq_ctx *ctx, unsigned int tag)
+void blk_mq_put_tag(struct blk_mq_tags *tags, struct blk_mq_ctx *ctx,
+                   unsigned int tag)
  {
         if (!blk_mq_tag_is_reserved(tags, tag)) {
                 const int real_tag = tag - tags->nr_reserved_tags;
diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h
index 15bc74acb57eca1c56bf0564cb000c002ea08618..2b8321efb68206cce7a865e801bf5d980629a487 100644
--- a/block/blk-mq-tag.h
+++ b/block/blk-mq-tag.h
@@ -26,8 +26,8 @@ extern struct blk_mq_tags *blk_mq_init_tags(unsigned int nr_tags, unsigned int r
  extern void blk_mq_free_tags(struct blk_mq_tags *tags);
  
  extern unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data);
-extern void blk_mq_put_tag(struct blk_mq_hw_ctx *hctx, struct blk_mq_tags *tags,
-                          struct blk_mq_ctx *ctx, unsigned int tag);
+extern void blk_mq_put_tag(struct blk_mq_tags *tags, struct blk_mq_ctx *ctx,
+                          unsigned int tag);
  extern int blk_mq_tag_update_depth(struct blk_mq_hw_ctx *hctx,
                                         struct blk_mq_tags **tags,
                                         unsigned int depth, bool can_grow);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index a12b1763508d3194853b6d33a057a62da62ea0e0..d92088dec6c359afd83642af362372b651b3d7e9 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -477,9 +477,9 @@ static void __blk_mq_free_request(struct request *rq)
         blk_pm_mark_last_busy(rq);
         rq->mq_hctx = NULL;
         if (rq->tag != -1)
-               blk_mq_put_tag(hctx, hctx->tags, ctx, rq->tag);
+               blk_mq_put_tag(hctx->tags, ctx, rq->tag);
         if (sched_tag != -1)
-               blk_mq_put_tag(hctx, hctx->sched_tags, ctx, sched_tag);
+               blk_mq_put_tag(hctx->sched_tags, ctx, sched_tag);
         blk_mq_sched_restart(hctx);
         blk_queue_exit(q);
  }
@@ -735,7 +735,7 @@ static void blk_mq_requeue_work(struct work_struct *work)
                  * merge.
                  */
                 if (rq->rq_flags & RQF_DONTPREP)
-                       blk_mq_request_bypass_insert(rq, false);
+                       blk_mq_request_bypass_insert(rq, false, false);
                 else
                         blk_mq_sched_insert_request(rq, true, false, false);
         }
@@ -1286,7 +1286,7 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list,
                         q->mq_ops->commit_rqs(hctx);
  
                 spin_lock(&hctx->lock);
-               list_splice_init(list, &hctx->dispatch);
+               list_splice_tail_init(list, &hctx->dispatch);
                 spin_unlock(&hctx->lock);
  
                 /*
@@ -1677,12 +1677,16 @@ void __blk_mq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
   * Should only be used carefully, when the caller knows we want to
   * bypass a potential IO scheduler on the target device.
   */
-void blk_mq_request_bypass_insert(struct request *rq, bool run_queue)
+void blk_mq_request_bypass_insert(struct request *rq, bool at_head,
+                                 bool run_queue)
  {
         struct blk_mq_hw_ctx *hctx = rq->mq_hctx;
  
         spin_lock(&hctx->lock);
-       list_add_tail(&rq->queuelist, &hctx->dispatch);
+       if (at_head)
+               list_add(&rq->queuelist, &hctx->dispatch);
+       else
+               list_add_tail(&rq->queuelist, &hctx->dispatch);
         spin_unlock(&hctx->lock);
  
         if (run_queue)
@@ -1849,7 +1853,7 @@ insert:
         if (bypass_insert)
                 return BLK_STS_RESOURCE;
  
-       blk_mq_request_bypass_insert(rq, run_queue);
+       blk_mq_request_bypass_insert(rq, false, run_queue);
         return BLK_STS_OK;
  }
  
@@ -1876,7 +1880,7 @@ static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
  
         ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false, true);
         if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE)
-               blk_mq_request_bypass_insert(rq, true);
+               blk_mq_request_bypass_insert(rq, false, true);
         else if (ret != BLK_STS_OK)
                 blk_mq_end_request(rq, ret);
  
@@ -1910,7 +1914,7 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
                 if (ret != BLK_STS_OK) {
                         if (ret == BLK_STS_RESOURCE ||
                                         ret == BLK_STS_DEV_RESOURCE) {
-                               blk_mq_request_bypass_insert(rq,
+                               blk_mq_request_bypass_insert(rq, false,
                                                         list_empty(list));
                                 break;
                         }
@@ -3398,7 +3402,6 @@ static void blk_mq_poll_stats_fn(struct blk_stat_callback *cb)
  }
  
  static unsigned long blk_mq_poll_nsecs(struct request_queue *q,
-                                      struct blk_mq_hw_ctx *hctx,
                                        struct request *rq)
  {
         unsigned long ret = 0;
@@ -3431,7 +3434,6 @@ static unsigned long blk_mq_poll_nsecs(struct request_queue *q,
  }
  
  static bool blk_mq_poll_hybrid_sleep(struct request_queue *q,
-                                    struct blk_mq_hw_ctx *hctx,
                                      struct request *rq)
  {
         struct hrtimer_sleeper hs;
@@ -3451,7 +3453,7 @@ static bool blk_mq_poll_hybrid_sleep(struct request_queue *q,
         if (q->poll_nsec > 0)
                 nsecs = q->poll_nsec;
         else
-               nsecs = blk_mq_poll_nsecs(q, hctx, rq);
+               nsecs = blk_mq_poll_nsecs(q, rq);
  
         if (!nsecs)
                 return false;
@@ -3506,7 +3508,7 @@ static bool blk_mq_poll_hybrid(struct request_queue *q,
                         return false;
         }
  
-       return blk_mq_poll_hybrid_sleep(q, hctx, rq);
+       return blk_mq_poll_hybrid_sleep(q, rq);
  }
  
  /**
diff --git a/block/blk-mq.h b/block/blk-mq.h
index eaaca8fc1c2874e77e9fce95964444ddaf280745..10bfdfb494faf4035f8be143f1cf6e47e32fe50c 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -66,7 +66,8 @@ int blk_mq_alloc_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
   */
  void __blk_mq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
                                 bool at_head);
-void blk_mq_request_bypass_insert(struct request *rq, bool run_queue);
+void blk_mq_request_bypass_insert(struct request *rq, bool at_head,
+                                 bool run_queue);
  void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx,
                                 struct list_head *list);
  
@@ -199,7 +200,7 @@ static inline bool blk_mq_get_dispatch_budget(struct blk_mq_hw_ctx *hctx)
  static inline void __blk_mq_put_driver_tag(struct blk_mq_hw_ctx *hctx,
                                            struct request *rq)
  {
-       blk_mq_put_tag(hctx, hctx->tags, rq->mq_ctx, rq->tag);
+       blk_mq_put_tag(hctx->tags, rq->mq_ctx, rq->tag);
         rq->tag = -1;
  
         if (rq->rq_flags & RQF_MQ_INFLIGHT) {
diff --git a/crypto/Kconfig b/crypto/Kconfig
index cdb51d4272d0cc7c972fe0aa1db5a612af4cc89c..c24a47406f8f57b7550a98a20654fde836d212d0 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -136,8 +136,6 @@ config CRYPTO_USER
           Userspace configuration for cryptographic instantiations such as
           cbc(aes).
  
-if CRYPTO_MANAGER2
-
  config CRYPTO_MANAGER_DISABLE_TESTS
         bool "Disable run-time self tests"
         default y
@@ -155,8 +153,6 @@ config CRYPTO_MANAGER_EXTRA_TESTS
           This is intended for developer use only, as these tests take much
           longer to run than the normal self tests.
  
-endif  # if CRYPTO_MANAGER2
-
  config CRYPTO_GF128MUL
         tristate
  
diff --git a/crypto/hash_info.c b/crypto/hash_info.c
index c754cb75dd1a96a5f52b39ed17adc6be45347345..a49ff96bde7784f9095ac7ecab8ff65124b7b17c 100644
--- a/crypto/hash_info.c
+++ b/crypto/hash_info.c
@@ -26,7 +26,7 @@ const char *const hash_algo_name[HASH_ALGO__LAST] = {
         [HASH_ALGO_TGR_128]     = "tgr128",
         [HASH_ALGO_TGR_160]     = "tgr160",
         [HASH_ALGO_TGR_192]     = "tgr192",
-       [HASH_ALGO_SM3_256]     = "sm3-256",
+       [HASH_ALGO_SM3_256]     = "sm3",
         [HASH_ALGO_STREEBOG_256] = "streebog256",
         [HASH_ALGO_STREEBOG_512] = "streebog512",
  };
diff --git a/crypto/testmgr.c b/crypto/testmgr.c
index 88f33c0efb2331a0922e10e8d017597a49ccc52c..ccb3d60729fc58c760dbd96b087aabe7bf2c1917 100644
--- a/crypto/testmgr.c
+++ b/crypto/testmgr.c
@@ -4436,6 +4436,15 @@ static const struct alg_test_desc alg_test_descs[] = {
                         .cipher = __VECS(tf_cbc_tv_template)
                 },
         }, {
+#if IS_ENABLED(CONFIG_CRYPTO_PAES_S390)
+               .alg = "cbc-paes-s390",
+               .fips_allowed = 1,
+               .test = alg_test_skcipher,
+               .suite = {
+                       .cipher = __VECS(aes_cbc_tv_template)
+               }
+       }, {
+#endif
                 .alg = "cbcmac(aes)",
                 .fips_allowed = 1,
                 .test = alg_test_hash,
@@ -4587,6 +4596,15 @@ static const struct alg_test_desc alg_test_descs[] = {
                         .cipher = __VECS(tf_ctr_tv_template)
                 }
         }, {
+#if IS_ENABLED(CONFIG_CRYPTO_PAES_S390)
+               .alg = "ctr-paes-s390",
+               .fips_allowed = 1,
+               .test = alg_test_skcipher,
+               .suite = {
+                       .cipher = __VECS(aes_ctr_tv_template)
+               }
+       }, {
+#endif
                 .alg = "cts(cbc(aes))",
                 .test = alg_test_skcipher,
                 .fips_allowed = 1,
@@ -4879,6 +4897,15 @@ static const struct alg_test_desc alg_test_descs[] = {
                         .cipher = __VECS(xtea_tv_template)
                 }
         }, {
+#if IS_ENABLED(CONFIG_CRYPTO_PAES_S390)
+               .alg = "ecb-paes-s390",
+               .fips_allowed = 1,
+               .test = alg_test_skcipher,
+               .suite = {
+                       .cipher = __VECS(aes_tv_template)
+               }
+       }, {
+#endif
                 .alg = "ecdh",
                 .test = alg_test_kpp,
                 .fips_allowed = 1,
@@ -5465,6 +5492,15 @@ static const struct alg_test_desc alg_test_descs[] = {
                         .cipher = __VECS(tf_xts_tv_template)
                 }
         }, {
+#if IS_ENABLED(CONFIG_CRYPTO_PAES_S390)
+               .alg = "xts-paes-s390",
+               .fips_allowed = 1,
+               .test = alg_test_skcipher,
+               .suite = {
+                       .cipher = __VECS(aes_xts_tv_template)
+               }
+       }, {
+#endif
                 .alg = "xts4096(paes)",
                 .test = alg_test_null,
                 .fips_allowed = 1,
diff --git a/drivers/acpi/acpi_watchdog.c b/drivers/acpi/acpi_watchdog.c
index b5516b04ffc07b95a5869678f0886499fa9553bc..6e9ec6e3fe47d63ea9adf07e540ac2a223379a06 100644
--- a/drivers/acpi/acpi_watchdog.c
+++ b/drivers/acpi/acpi_watchdog.c
@@ -55,12 +55,14 @@ static bool acpi_watchdog_uses_rtc(const struct acpi_table_wdat *wdat)
  }
  #endif
  
+static bool acpi_no_watchdog;
+
  static const struct acpi_table_wdat *acpi_watchdog_get_wdat(void)
  {
         const struct acpi_table_wdat *wdat = NULL;
         acpi_status status;
  
-       if (acpi_disabled)
+       if (acpi_disabled || acpi_no_watchdog)
                 return NULL;
  
         status = acpi_get_table(ACPI_SIG_WDAT, 0,
@@ -88,6 +90,14 @@ bool acpi_has_watchdog(void)
  }
  EXPORT_SYMBOL_GPL(acpi_has_watchdog);
  
+/* ACPI watchdog can be disabled on boot command line */
+static int __init disable_acpi_watchdog(char *str)
+{
+       acpi_no_watchdog = true;
+       return 1;
+}
+__setup("acpi_no_watchdog", disable_acpi_watchdog);
+
  void __init acpi_watchdog_init(void)
  {
         const struct acpi_wdat_entry *entries;
@@ -126,12 +136,11 @@ void __init acpi_watchdog_init(void)
                 gas = &entries[i].register_region;
  
                 res.start = gas->address;
+               res.end = res.start + ACPI_ACCESS_BYTE_WIDTH(gas->access_width) - 1;
                 if (gas->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY) {
                         res.flags = IORESOURCE_MEM;
-                       res.end = res.start + ALIGN(gas->access_width, 4) - 1;
                 } else if (gas->space_id == ACPI_ADR_SPACE_SYSTEM_IO) {
                         res.flags = IORESOURCE_IO;
-                       res.end = res.start + gas->access_width - 1;
                 } else {
                         pr_warn("Unsupported address space: %u\n",
                                 gas->space_id);
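Note on the acpi_watchdog_init() hunk above: the resource end is now derived from the byte width of the register access rather than from the raw access_width code. The sketch below assumes that ACPI_ACCESS_BYTE_WIDTH() maps the Generic Address Structure access size codes 1..4 to 1, 2, 4 and 8 bytes; that mapping is my reading of the ACPI encoding and is not shown in this diff. The demo_* names are hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for ACPI_ACCESS_BYTE_WIDTH(): converts an
 * access size code (assumed 1 = byte, 2 = word, 3 = dword, 4 = qword)
 * into a width in bytes.
 *
 * Precondition: 1 <= access_width <= 4. The code 0 ("undefined") is
 * not handled in this sketch.
 */
static uint64_t demo_access_byte_width(unsigned int access_width)
{
	return UINT64_C(1) << (access_width - 1);
}

/* Inclusive end address of a register starting at 'start'. */
static uint64_t demo_resource_end(uint64_t start, unsigned int access_width)
{
	return start + demo_access_byte_width(access_width) - 1;
}
```

Under that assumption a dword register at 0x1000 ends at 0x1003. The removed I/O branch computed start + access_width - 1, which would give 0x1002 for the same register.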
diff --git a/drivers/acpi/acpica/achware.h b/drivers/acpi/acpica/achware.h
index 67f282e9e0af17500dde847bc8d4943a51e877be..6ad0517553d5e8f8386e4945cc562f31eb2dfd66 100644
--- a/drivers/acpi/acpica/achware.h
+++ b/drivers/acpi/acpica/achware.h
@@ -101,6 +101,8 @@ acpi_status acpi_hw_enable_all_runtime_gpes(void);
  
  acpi_status acpi_hw_enable_all_wakeup_gpes(void);
  
+u8 acpi_hw_check_all_gpes(void);
+
  acpi_status
  acpi_hw_enable_runtime_gpe_block(struct acpi_gpe_xrupt_info *gpe_xrupt_info,
                                  struct acpi_gpe_block_info *gpe_block,
diff --git a/drivers/acpi/acpica/evevent.c b/drivers/acpi/acpica/evevent.c
index 8c83d8c620dc3deb10dc9a5a61b75bd6b2b0251b..789d5e920aaf7084058512e09467b554a9b5120d 100644
--- a/drivers/acpi/acpica/evevent.c
+++ b/drivers/acpi/acpica/evevent.c
@@ -265,4 +265,49 @@ static u32 acpi_ev_fixed_event_dispatch(u32 event)
                  handler) (acpi_gbl_fixed_event_handlers[event].context));
  }
  
+/*******************************************************************************
+ *
+ * FUNCTION:    acpi_any_fixed_event_status_set
+ *
+ * PARAMETERS:  None
+ *
+ * RETURN:      TRUE or FALSE
+ *
+ * DESCRIPTION: Checks the PM status register for active fixed events
+ *
+ ******************************************************************************/
+
+u32 acpi_any_fixed_event_status_set(void)
+{
+       acpi_status status;
+       u32 in_status;
+       u32 in_enable;
+       u32 i;
+
+       status = acpi_hw_register_read(ACPI_REGISTER_PM1_ENABLE, &in_enable);
+       if (ACPI_FAILURE(status)) {
+               return (FALSE);
+       }
+
+       status = acpi_hw_register_read(ACPI_REGISTER_PM1_STATUS, &in_status);
+       if (ACPI_FAILURE(status)) {
+               return (FALSE);
+       }
+
+       /*
+        * Check for all possible Fixed Events and dispatch those that are active
+        */
+       for (i = 0; i < ACPI_NUM_FIXED_EVENTS; i++) {
+
+               /* Both the status and enable bits must be on for this event */
+
+               if ((in_status & acpi_gbl_fixed_event_info[i].status_bit_mask) &&
+                   (in_enable & acpi_gbl_fixed_event_info[i].enable_bit_mask)) {
+                       return (TRUE);
+               }
+       }
+
+       return (FALSE);
+}
+
  #endif                         /* !ACPI_REDUCED_HARDWARE */
diff --git a/drivers/acpi/acpica/evxfgpe.c b/drivers/acpi/acpica/evxfgpe.c
index 2c39ff2a7406900ea593f6d7f886253410444379..f2de66bfd8a7cd8aea8c91954bf337cf14d4fc63 100644
--- a/drivers/acpi/acpica/evxfgpe.c
+++ b/drivers/acpi/acpica/evxfgpe.c
@@ -795,6 +795,38 @@ acpi_status acpi_enable_all_wakeup_gpes(void)
  
  ACPI_EXPORT_SYMBOL(acpi_enable_all_wakeup_gpes)
  
+/******************************************************************************
+ *
+ * FUNCTION:    acpi_any_gpe_status_set
+ *
+ * PARAMETERS:  None
+ *
+ * RETURN:      Whether or not the status bit is set for any GPE
+ *
+ * DESCRIPTION: Check the status bits of all enabled GPEs and return TRUE if any
+ *              of them is set or FALSE otherwise.
+ *
+ ******************************************************************************/
+u32 acpi_any_gpe_status_set(void)
+{
+       acpi_status status;
+       u8 ret;
+
+       ACPI_FUNCTION_TRACE(acpi_any_gpe_status_set);
+
+       status = acpi_ut_acquire_mutex(ACPI_MTX_EVENTS);
+       if (ACPI_FAILURE(status)) {
+               return (FALSE);
+       }
+
+       ret = acpi_hw_check_all_gpes();
+       (void)acpi_ut_release_mutex(ACPI_MTX_EVENTS);
+
+       return (ret);
+}
+
+ACPI_EXPORT_SYMBOL(acpi_any_gpe_status_set)
+
  /*******************************************************************************
   *
   * FUNCTION:    acpi_install_gpe_block
diff --git a/drivers/acpi/acpica/hwgpe.c b/drivers/acpi/acpica/hwgpe.c
index 1b4252bdcd0b1a6346a1dd1d375c3405ca7266e8..f4c285c2f5956da43d45a31daaa5bc2d3c4d74fe 100644
--- a/drivers/acpi/acpica/hwgpe.c
+++ b/drivers/acpi/acpica/hwgpe.c
@@ -444,6 +444,53 @@ acpi_hw_enable_wakeup_gpe_block(struct acpi_gpe_xrupt_info *gpe_xrupt_info,
         return (AE_OK);
  }
  
+/******************************************************************************
+ *
+ * FUNCTION:    acpi_hw_get_gpe_block_status
+ *
+ * PARAMETERS:  gpe_xrupt_info      - GPE Interrupt info
+ *              gpe_block           - Gpe Block info
+ *
+ * RETURN:      Success
+ *
+ * DESCRIPTION: Produce a combined GPE status bits mask for the given block.
+ *
+ ******************************************************************************/
+
+static acpi_status
+acpi_hw_get_gpe_block_status(struct acpi_gpe_xrupt_info *gpe_xrupt_info,
+                            struct acpi_gpe_block_info *gpe_block,
+                            void *ret_ptr)
+{
+       struct acpi_gpe_register_info *gpe_register_info;
+       u64 in_enable, in_status;
+       acpi_status status;
+       u8 *ret = ret_ptr;
+       u32 i;
+
+       /* Examine each GPE Register within the block */
+
+       for (i = 0; i < gpe_block->register_count; i++) {
+               gpe_register_info = &gpe_block->register_info[i];
+
+               status = acpi_hw_read(&in_enable,
+                                     &gpe_register_info->enable_address);
+               if (ACPI_FAILURE(status)) {
+                       continue;
+               }
+
+               status = acpi_hw_read(&in_status,
+                                     &gpe_register_info->status_address);
+               if (ACPI_FAILURE(status)) {
+                       continue;
+               }
+
+               *ret |= in_enable & in_status;
+       }
+
+       return (AE_OK);
+}
+
  /******************************************************************************
   *
   * FUNCTION:    acpi_hw_disable_all_gpes
@@ -510,4 +557,28 @@ acpi_status acpi_hw_enable_all_wakeup_gpes(void)
         return_ACPI_STATUS(status);
  }
  
+/******************************************************************************
+ *
+ * FUNCTION:    acpi_hw_check_all_gpes
+ *
+ * PARAMETERS:  None
+ *
+ * RETURN:      Combined status of all GPEs
+ *
+ * DESCRIPTION: Check all enabled GPEs in all GPE blocks and return TRUE if the
+ *              status bit is set for at least one of them or FALSE otherwise.
+ *
+ ******************************************************************************/
+
+u8 acpi_hw_check_all_gpes(void)
+{
+       u8 ret = 0;
+
+       ACPI_FUNCTION_TRACE(acpi_hw_check_all_gpes);
+
+       (void)acpi_ev_walk_gpe_list(acpi_hw_get_gpe_block_status, &ret);
+
+       return (ret != 0);
+}
+
  #endif                         /* !ACPI_REDUCED_HARDWARE */
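
Note on the hwgpe.c change above: the new helper ORs together `enable & status` for every GPE register, and the caller reduces that mask to TRUE or FALSE. The sketch below is a user-space illustration only. `struct gpe_reg_sample`, its `readable` flag (which stands in for a failed `acpi_hw_read()`), and `any_enabled_event_set()` are hypothetical names, not ACPICA interfaces.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one GPE register pair; not an ACPICA type. */
struct gpe_reg_sample {
	uint8_t enable;   /* bit set = event is enabled */
	uint8_t status;   /* bit set = event has fired */
	bool readable;    /* false models a failed register read */
};

/* Return true if at least one enabled event has its status bit set. */
bool any_enabled_event_set(const struct gpe_reg_sample *regs, size_t count)
{
	uint8_t combined = 0;
	size_t i;

	for (i = 0; i < count; i++) {
		if (!regs[i].readable)
			continue;	/* skip registers that could not be read */

		/* Only bits that are both enabled and set contribute. */
		combined |= regs[i].enable & regs[i].status;
	}

	return combined != 0;
}
```

A status bit on a disabled event, or on a register that could not be read, does not count toward the result.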
diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c

index 08bc9751fe6620f6e19f4356b7596c4a235feba6..d1f1cf5d4bf084e526ecb0f953f0097defd474d1 100644 (file)
--- a/drivers/acpi/ec.c
+++ b/drivers/acpi/ec.c
@@ -179,6 +179,7 @@ EXPORT_SYMBOL(first_ec);
  
  static struct acpi_ec *boot_ec;
  static bool boot_ec_is_ecdt = false;
+static struct workqueue_struct *ec_wq;
  static struct workqueue_struct *ec_query_wq;
  
  static int EC_FLAGS_QUERY_HANDSHAKE; /* Needs QR_EC issued when SCI_EVT set */
@@ -469,7 +470,7 @@ static void acpi_ec_submit_query(struct acpi_ec *ec)
                 ec_dbg_evt("Command(%s) submitted/blocked",
                            acpi_ec_cmd_string(ACPI_EC_COMMAND_QUERY));
                 ec->nr_pending_queries++;
-               schedule_work(&ec->work);
+               queue_work(ec_wq, &ec->work);
         }
  }
  
@@ -535,7 +536,7 @@ static void acpi_ec_enable_event(struct acpi_ec *ec)
  #ifdef CONFIG_PM_SLEEP
  static void __acpi_ec_flush_work(void)
  {
-       flush_scheduled_work(); /* flush ec->work */
+       drain_workqueue(ec_wq); /* flush ec->work */
         flush_workqueue(ec_query_wq); /* flush queries */
  }
  
@@ -556,8 +557,8 @@ static void acpi_ec_disable_event(struct acpi_ec *ec)
  
  void acpi_ec_flush_work(void)
  {
-       /* Without ec_query_wq there is nothing to flush. */
-       if (!ec_query_wq)
+       /* Without ec_wq there is nothing to flush. */
+       if (!ec_wq)
                 return;
  
         __acpi_ec_flush_work();
@@ -2107,25 +2108,33 @@ static struct acpi_driver acpi_ec_driver = {
         .drv.pm = &acpi_ec_pm,
  };
  
-static inline int acpi_ec_query_init(void)
+static void acpi_ec_destroy_workqueues(void)
  {
-       if (!ec_query_wq) {
-               ec_query_wq = alloc_workqueue("kec_query", 0,
-                                             ec_max_queries);
-               if (!ec_query_wq)
-                       return -ENODEV;
+       if (ec_wq) {
+               destroy_workqueue(ec_wq);
+               ec_wq = NULL;
         }
-       return 0;
-}
-
-static inline void acpi_ec_query_exit(void)
-{
         if (ec_query_wq) {
                 destroy_workqueue(ec_query_wq);
                 ec_query_wq = NULL;
         }
  }
  
+static int acpi_ec_init_workqueues(void)
+{
+       if (!ec_wq)
+               ec_wq = alloc_ordered_workqueue("kec", 0);
+
+       if (!ec_query_wq)
+               ec_query_wq = alloc_workqueue("kec_query", 0, ec_max_queries);
+
+       if (!ec_wq || !ec_query_wq) {
+               acpi_ec_destroy_workqueues();
+               return -ENODEV;
+       }
+       return 0;
+}
+
  static const struct dmi_system_id acpi_ec_no_wakeup[] = {
         {
                 .ident = "Thinkpad X1 Carbon 6th",
@@ -2156,8 +2165,7 @@ int __init acpi_ec_init(void)
         int result;
         int ecdt_fail, dsdt_fail;
  
-       /* register workqueue for _Qxx evaluations */
-       result = acpi_ec_query_init();
+       result = acpi_ec_init_workqueues();
         if (result)
                 return result;
  
@@ -2188,6 +2196,6 @@ static void __exit acpi_ec_exit(void)
  {
  
         acpi_bus_unregister_driver(&acpi_ec_driver);
-       acpi_ec_query_exit();
+       acpi_ec_destroy_workqueues();
  }
  #endif /* 0 */
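
Note on the ec.c change above: `acpi_ec_init_workqueues()` allocates both workqueues and, if either allocation failed, tears both down before returning an error, so callers never see a half-initialized state. The sketch below mirrors that shape in user space with `malloc()`. The names `struct work_queue`, `main_wq`, `query_wq`, `init_queues()`, `destroy_queues()` and the `simulate_failure` parameter are all hypothetical and exist only to make the failure path testable.

```c
#include <stdlib.h>

/* Hypothetical resource type standing in for a kernel workqueue. */
struct work_queue {
	int placeholder;
};

struct work_queue *main_wq;
struct work_queue *query_wq;

/* Safe to call in any state: releases whatever is currently allocated. */
void destroy_queues(void)
{
	free(main_wq);		/* free(NULL) is a no-op */
	main_wq = NULL;
	free(query_wq);
	query_wq = NULL;
}

/* Allocate both queues or neither. Returns 0 on success, -1 on failure. */
int init_queues(int simulate_failure)
{
	if (!main_wq)
		main_wq = malloc(sizeof(*main_wq));

	if (!query_wq && !simulate_failure)
		query_wq = malloc(sizeof(*query_wq));

	if (!main_wq || !query_wq) {
		destroy_queues();	/* roll back a partial allocation */
		return -1;
	}
	return 0;
}
```

Because the destroy routine tolerates partially initialized state, the init error path and the module exit path can share it.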
diff --git a/drivers/acpi/sleep.c b/drivers/acpi/sleep.c

index 4398806298398a78a3160ea13d212a2627a3f9c3..e5f95922bc217e42342cce186bf0968c7e7f31dd 100644 (file)
--- a/drivers/acpi/sleep.c
+++ b/drivers/acpi/sleep.c
@@ -990,21 +990,41 @@ static void acpi_s2idle_sync(void)
         acpi_os_wait_events_complete(); /* synchronize Notify handling */
  }
  
-static void acpi_s2idle_wake(void)
+static bool acpi_s2idle_wake(void)
  {
-       /*
-        * If IRQD_WAKEUP_ARMED is set for the SCI at this point, the SCI has
-        * not triggered while suspended, so bail out.
-        */
-       if (!acpi_sci_irq_valid() ||
-           irqd_is_wakeup_armed(irq_get_irq_data(acpi_sci_irq)))
-               return;
+       if (!acpi_sci_irq_valid())
+               return pm_wakeup_pending();
+
+       while (pm_wakeup_pending()) {
+               /*
+                * If IRQD_WAKEUP_ARMED is set for the SCI at this point, the
+                * SCI has not triggered while suspended, so bail out (the
+                * wakeup is pending anyway and the SCI is not the source of
+                * it).
+                */
+               if (irqd_is_wakeup_armed(irq_get_irq_data(acpi_sci_irq)))
+                       return true;
+
+               /*
+                * If the status bit of any enabled fixed event is set, the
+                * wakeup is regarded as valid.
+                */
+               if (acpi_any_fixed_event_status_set())
+                       return true;
+
+               /*
+                * If there are no EC events to process and at least one of the
+                * other enabled GPEs is active, the wakeup is regarded as a
+                * genuine one.
+                *
+                * Note that the checks below must be carried out in this order
+                * to avoid returning prematurely due to a change of the EC GPE
+                * status bit from unset to set between the checks with the
+                * status bits of all the other GPEs unset.
+                */
+               if (acpi_any_gpe_status_set() && !acpi_ec_dispatch_gpe())
+                       return true;
  
-       /*
-        * If there are EC events to process, the wakeup may be a spurious one
-        * coming from the EC.
-        */
-       if (acpi_ec_dispatch_gpe()) {
                 /*
                  * Cancel the wakeup and process all pending events in case
                  * there are any wakeup ones in there.
@@ -1017,8 +1037,19 @@ static void acpi_s2idle_wake(void)
  
                 acpi_s2idle_sync();
  
+               /*
+                * The SCI is in the "suspended" state now and it cannot produce
+                * new wakeup events till the rearming below, so if any of them
+                * are pending here, they must be resulting from the processing
+                * of EC events above or coming from somewhere else.
+                */
+               if (pm_wakeup_pending())
+                       return true;
+
                 rearm_wake_irq(acpi_sci_irq);
         }
+
+       return false;
  }
  
  static void acpi_s2idle_restore_early(void)
diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c

index cd3612e4e2e14317ca506e24f00c4f9a6a2f9a73..8ef65c0856407cfbf780e370bf24a5141d8938f5 100644 (file)
--- a/drivers/block/floppy.c
+++ b/drivers/block/floppy.c
@@ -853,14 +853,17 @@ static void reset_fdc_info(int mode)
  /* selects the fdc and drive, and enables the fdc's input/dma. */
  static void set_fdc(int drive)
  {
+       unsigned int new_fdc = fdc;
+
         if (drive >= 0 && drive < N_DRIVE) {
-               fdc = FDC(drive);
+               new_fdc = FDC(drive);
                 current_drive = drive;
         }
-       if (fdc != 1 && fdc != 0) {
+       if (new_fdc >= N_FDC) {
                 pr_info("bad fdc value\n");
                 return;
         }
+       fdc = new_fdc;
         set_dor(fdc, ~0, 8);
  #if N_FDC > 1
         set_dor(1 - fdc, ~8, 0);
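
Note on the floppy.c change above: the candidate controller index is now computed into a local variable and range-checked before the global `fdc` is overwritten, so a bad value is no longer left behind in global state. A minimal sketch of that validate-then-commit pattern follows. `N_CTRL`, `current_ctrl` and `select_ctrl()` are hypothetical names chosen for illustration.

```c
#define N_CTRL 2	/* hypothetical number of controllers */

unsigned int current_ctrl;	/* hypothetical global, plays the role of fdc */

/*
 * Validate the candidate first and commit it only when it is in range.
 * Returns 0 on success, -1 if the candidate was rejected.
 */
int select_ctrl(unsigned int candidate)
{
	if (candidate >= N_CTRL)
		return -1;	/* current_ctrl is left untouched */

	current_ctrl = candidate;
	return 0;
}
```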
diff --git a/drivers/block/null_blk.h b/drivers/block/null_blk.h

index bc837862b7679ac88b0e8ef15d33e61f388114f5..62b660821dbcc19b390712b4088fc7141a4b27c2 100644 (file)
--- a/drivers/block/null_blk.h
+++ b/drivers/block/null_blk.h
@@ -14,9 +14,6 @@
  #include <linux/fault-inject.h>
  
  struct nullb_cmd {
-       struct list_head list;
-       struct llist_node ll_list;
-       struct __call_single_data csd;
         struct request *rq;
         struct bio *bio;
         unsigned int tag;
diff --git a/drivers/block/null_blk_main.c b/drivers/block/null_blk_main.c

index 16510795e37720de6b4c233bbd9c0f4bcf798e05..133060431dbdb2f2c058876c75072571c9141c41 100644 (file)
--- a/drivers/block/null_blk_main.c
+++ b/drivers/block/null_blk_main.c
@@ -1518,8 +1518,6 @@ static int setup_commands(struct nullb_queue *nq)
  
         for (i = 0; i < nq->queue_depth; i++) {
                 cmd = &nq->cmds[i];
-               INIT_LIST_HEAD(&cmd->list);
-               cmd->ll_list.next = NULL;
                 cmd->tag = -1U;
         }
  
diff --git a/drivers/block/paride/pcd.c b/drivers/block/paride/pcd.c

index 117cfc8cd05a4c509876d137bf7fb3e2bd272382..cda5cf917e9afaee700470b00e26d8aaeb7fffef 100644 (file)
--- a/drivers/block/paride/pcd.c
+++ b/drivers/block/paride/pcd.c
@@ -276,7 +276,7 @@ static const struct block_device_operations pcd_bdops = {
         .release        = pcd_block_release,
         .ioctl          = pcd_block_ioctl,
  #ifdef CONFIG_COMPAT
-       .ioctl          = blkdev_compat_ptr_ioctl,
+       .compat_ioctl   = blkdev_compat_ptr_ioctl,
  #endif
         .check_events   = pcd_block_check_events,
  };
diff --git a/drivers/bus/moxtet.c b/drivers/bus/moxtet.c

index 15fa293819a03124355d6163b7b393de75f553f8..b20fdcbd035b21cd4625ea5b401fe4cf9181c70b 100644 (file)
--- a/drivers/bus/moxtet.c
+++ b/drivers/bus/moxtet.c
@@ -465,7 +465,7 @@ static ssize_t input_read(struct file *file, char __user *buf, size_t len,
  {
         struct moxtet *moxtet = file->private_data;
         u8 bin[TURRIS_MOX_MAX_MODULES];
-       u8 hex[sizeof(buf) * 2 + 1];
+       u8 hex[sizeof(bin) * 2 + 1];
         int ret, n;
  
         ret = moxtet_spi_read(moxtet, bin);
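
Note on the moxtet.c change above: `buf` is a `char __user *` parameter, so `sizeof(buf)` is the size of a pointer, not of the data to be hex-encoded, whereas `sizeof(bin)` is the size of the local array. The sketch below shows the difference in plain C. `MAX_MODULES` and both function names are hypothetical, and the value 10 is chosen only for illustration.

```c
#include <stddef.h>

#define MAX_MODULES 10	/* hypothetical array length */

/* sizeof applied to a pointer parameter yields the pointer size. */
size_t hex_size_from_pointer(const char *buf)
{
	return sizeof(buf) * 2 + 1;
}

/* sizeof applied to an array yields the full array size in bytes. */
size_t hex_size_from_array(void)
{
	unsigned char bin[MAX_MODULES] = { 0 };

	(void)bin;	/* only its size is of interest here */
	return sizeof(bin) * 2 + 1;	/* two hex digits per byte plus NUL */
}
```

The pointer-based size depends on the platform's pointer width rather than on the array length, so it matches the amount of data to encode only by coincidence.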
diff --git a/drivers/cdrom/gdrom.c b/drivers/cdrom/gdrom.c

index 886b2638c730308a6dce2061b18a06b04845fa18..c51292c2a131e0133872258e11470b6c5c58b73a 100644 (file)
--- a/drivers/cdrom/gdrom.c
+++ b/drivers/cdrom/gdrom.c
@@ -519,7 +519,7 @@ static const struct block_device_operations gdrom_bdops = {
         .check_events           = gdrom_bdops_check_events,
         .ioctl                  = gdrom_bdops_ioctl,
  #ifdef CONFIG_COMPAT
-       .ioctl                  = blkdev_compat_ptr_ioctl,
+       .compat_ioctl           = blkdev_compat_ptr_ioctl,
  #endif
  };
  
diff --git a/drivers/char/ipmi/ipmb_dev_int.c b/drivers/char/ipmi/ipmb_dev_int.c

index 1ff4fb1def7ca7f412a68c25f224787e8f00855c..382b28f1cf2f6daf922236e827a2abce1bc73940 100644 (file)
--- a/drivers/char/ipmi/ipmb_dev_int.c
+++ b/drivers/char/ipmi/ipmb_dev_int.c
@@ -19,7 +19,7 @@
  #include <linux/spinlock.h>
  #include <linux/wait.h>
  
-#define MAX_MSG_LEN            128
+#define MAX_MSG_LEN            240
  #define IPMB_REQUEST_LEN_MIN   7
  #define NETFN_RSP_BIT_MASK     0x4
  #define REQUEST_QUEUE_MAX_LEN  256
@@ -63,6 +63,7 @@ struct ipmb_dev {
         spinlock_t lock;
         wait_queue_head_t wait_queue;
         struct mutex file_mutex;
+       bool is_i2c_protocol;
  };
  
  static inline struct ipmb_dev *to_ipmb_dev(struct file *file)
@@ -112,6 +113,25 @@ static ssize_t ipmb_read(struct file *file, char __user *buf, size_t count,
         return ret < 0 ? ret : count;
  }
  
+static int ipmb_i2c_write(struct i2c_client *client, u8 *msg, u8 addr)
+{
+       struct i2c_msg i2c_msg;
+
+       /*
+        * subtract 1 byte (rq_sa) from the length of the msg passed to
+        * raw i2c_transfer
+        */
+       i2c_msg.len = msg[IPMB_MSG_LEN_IDX] - 1;
+
+       /* Assign message to buffer except first 2 bytes (length and address) */
+       i2c_msg.buf = msg + 2;
+
+       i2c_msg.addr = addr;
+       i2c_msg.flags = client->flags & I2C_CLIENT_PEC;
+
+       return i2c_transfer(client->adapter, &i2c_msg, 1);
+}
+
  static ssize_t ipmb_write(struct file *file, const char __user *buf,
                         size_t count, loff_t *ppos)
  {
@@ -133,6 +153,12 @@ static ssize_t ipmb_write(struct file *file, const char __user *buf,
         rq_sa = GET_7BIT_ADDR(msg[RQ_SA_8BIT_IDX]);
         netf_rq_lun = msg[NETFN_LUN_IDX];
  
+       /* Check i2c block transfer vs smbus */
+       if (ipmb_dev->is_i2c_protocol) {
+               ret = ipmb_i2c_write(ipmb_dev->client, msg, rq_sa);
+               return (ret == 1) ? count : ret;
+       }
+
         /*
          * subtract rq_sa and netf_rq_lun from the length of the msg passed to
          * i2c_smbus_xfer
@@ -253,7 +279,7 @@ static int ipmb_slave_cb(struct i2c_client *client,
                 break;
  
         case I2C_SLAVE_WRITE_RECEIVED:
-               if (ipmb_dev->msg_idx >= sizeof(struct ipmb_msg))
+               if (ipmb_dev->msg_idx >= sizeof(struct ipmb_msg) - 1)
                         break;
  
                 buf[++ipmb_dev->msg_idx] = *val;
@@ -302,6 +328,9 @@ static int ipmb_probe(struct i2c_client *client,
         if (ret)
                 return ret;
  
+       ipmb_dev->is_i2c_protocol
+               = device_property_read_bool(&client->dev, "i2c-protocol");
+
         ipmb_dev->client = client;
         i2c_set_clientdata(client, ipmb_dev);
         ret = i2c_slave_register(client, ipmb_slave_cb);
diff --git a/drivers/char/ipmi/ipmi_ssif.c b/drivers/char/ipmi/ipmi_ssif.c

index 22c6a2e612360aeae7a9d8fadbc2f43790903186..8ac390c2b51475b5dc79559c931dbd0b7f777571 100644 (file)
--- a/drivers/char/ipmi/ipmi_ssif.c
+++ b/drivers/char/ipmi/ipmi_ssif.c
@@ -775,10 +775,14 @@ static void msg_done_handler(struct ssif_info *ssif_info, int result,
         flags = ipmi_ssif_lock_cond(ssif_info, &oflags);
         msg = ssif_info->curr_msg;
         if (msg) {
+               if (data) {
+                       if (len > IPMI_MAX_MSG_LENGTH)
+                               len = IPMI_MAX_MSG_LENGTH;
+                       memcpy(msg->rsp, data, len);
+               } else {
+                       len = 0;
+               }
                 msg->rsp_size = len;
-               if (msg->rsp_size > IPMI_MAX_MSG_LENGTH)
-                       msg->rsp_size = IPMI_MAX_MSG_LENGTH;
-               memcpy(msg->rsp, data, msg->rsp_size);
                 ssif_info->curr_msg = NULL;
         }
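
Note on the ipmi_ssif.c change above: the handler now copies the response only when `data` is non-NULL, clamps the length before the copy, and records a size of zero otherwise. The sketch below reproduces that ordering in user space. `RSP_MAX`, `struct response` and `store_response()` are hypothetical names, not part of the IPMI driver.

```c
#include <stddef.h>
#include <string.h>

#define RSP_MAX 8	/* hypothetical maximum response length */

struct response {
	unsigned char rsp[RSP_MAX];
	size_t rsp_size;
};

/* Copy at most RSP_MAX bytes; a NULL source yields an empty response. */
void store_response(struct response *msg, const unsigned char *data,
		    size_t len)
{
	if (data) {
		if (len > RSP_MAX)
			len = RSP_MAX;	/* clamp before copying */
		memcpy(msg->rsp, data, len);
	} else {
		len = 0;		/* nothing to copy */
	}
	msg->rsp_size = len;
}
```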
  
diff --git a/drivers/char/tpm/Makefile b/drivers/char/tpm/Makefile

index 5a0d99d4fec0b1870e9f7b63e5e9e4444aa6950c..9567e5197f740f3c3b3ffeadc0495c928cd94399 100644 (file)
--- a/drivers/char/tpm/Makefile
+++ b/drivers/char/tpm/Makefile
@@ -21,9 +21,11 @@ tpm-$(CONFIG_EFI) += eventlog/efi.o
  tpm-$(CONFIG_OF) += eventlog/of.o
  obj-$(CONFIG_TCG_TIS_CORE) += tpm_tis_core.o
  obj-$(CONFIG_TCG_TIS) += tpm_tis.o
-obj-$(CONFIG_TCG_TIS_SPI) += tpm_tis_spi_mod.o
-tpm_tis_spi_mod-y := tpm_tis_spi.o
-tpm_tis_spi_mod-$(CONFIG_TCG_TIS_SPI_CR50) += tpm_tis_spi_cr50.o
+
+obj-$(CONFIG_TCG_TIS_SPI) += tpm_tis_spi.o
+tpm_tis_spi-y := tpm_tis_spi_main.o
+tpm_tis_spi-$(CONFIG_TCG_TIS_SPI_CR50) += tpm_tis_spi_cr50.o
+
  obj-$(CONFIG_TCG_TIS_I2C_ATMEL) += tpm_i2c_atmel.o
  obj-$(CONFIG_TCG_TIS_I2C_INFINEON) += tpm_i2c_infineon.o
  obj-$(CONFIG_TCG_TIS_I2C_NUVOTON) += tpm_i2c_nuvoton.o
diff --git a/drivers/char/tpm/tpm2-cmd.c b/drivers/char/tpm/tpm2-cmd.c

index 13696deceae8e7fb73862ea99d731a58fe647f67..760329598b9960855f42119ed01098acca312057 100644 (file)
--- a/drivers/char/tpm/tpm2-cmd.c
+++ b/drivers/char/tpm/tpm2-cmd.c
@@ -525,6 +525,8 @@ static int tpm2_init_bank_info(struct tpm_chip *chip, u32 bank_index)
                 return 0;
         }
  
+       bank->crypto_id = HASH_ALGO__LAST;
+
         return tpm2_pcr_read(chip, 0, &digest, &bank->digest_size);
  }
  
diff --git a/drivers/char/tpm/tpm_tis_spi.c b/drivers/char/tpm/tpm_tis_spi_main.c

similarity index 100%

rename from drivers/char/tpm/tpm_tis_spi.c

rename to drivers/char/tpm/tpm_tis_spi_main.c
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c

index 4adac3a8c2656b43d3a19d34e63bd486c885038c..808874bccf4ace34791e520b8d5bde8854d1dbb4 100644 (file)
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -105,6 +105,8 @@ bool have_governor_per_policy(void)
  }
  EXPORT_SYMBOL_GPL(have_governor_per_policy);
  
+static struct kobject *cpufreq_global_kobject;
+
  struct kobject *get_governor_parent_kobj(struct cpufreq_policy *policy)
  {
         if (have_governor_per_policy())
@@ -1074,9 +1076,17 @@ static int cpufreq_init_policy(struct cpufreq_policy *policy)
                         pol = policy->last_policy;
                 } else if (def_gov) {
                         pol = cpufreq_parse_policy(def_gov->name);
-               } else {
-                       return -ENODATA;
+                       /*
+                        * In case the default governor is neither "performance"
+                        * nor "powersave", fall back to the initial policy
+                        * value set by the driver.
+                        */
+                       if (pol == CPUFREQ_POLICY_UNKNOWN)
+                               pol = policy->policy;
                 }
+               if (pol != CPUFREQ_POLICY_PERFORMANCE &&
+                   pol != CPUFREQ_POLICY_POWERSAVE)
+                       return -ENODATA;
         }
  
         return cpufreq_set_policy(policy, gov, pol);
@@ -2745,9 +2755,6 @@ int cpufreq_unregister_driver(struct cpufreq_driver *driver)
  }
  EXPORT_SYMBOL_GPL(cpufreq_unregister_driver);
  
-struct kobject *cpufreq_global_kobject;
-EXPORT_SYMBOL(cpufreq_global_kobject);
-
  static int __init cpufreq_core_init(void)
  {
         if (cpufreq_disabled())
diff --git a/drivers/dax/super.c b/drivers/dax/super.c

index 26a654dbc69a26d2b0dfb46ac0ba57cc0fb6c9a9..0aa4b6bc5101dcdcd2de08e4635c79a780335e9f 100644 (file)
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -61,7 +61,7 @@ struct dax_device *fs_dax_get_by_bdev(struct block_device *bdev)
  {
         if (!blk_queue_dax(bdev->bd_queue))
                 return NULL;
-       return fs_dax_get_by_host(bdev->bd_disk->disk_name);
+       return dax_get_by_host(bdev->bd_disk->disk_name);
  }
  EXPORT_SYMBOL_GPL(fs_dax_get_by_bdev);
  #endif
diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c

index cceee8bc3c2f745a02ab7b648d29458cbeaf2283..7dcf2093e5316a229dd9396feb73011a3f4f5fe3 100644 (file)
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -738,7 +738,6 @@ struct devfreq *devfreq_add_device(struct device *dev,
  {
         struct devfreq *devfreq;
         struct devfreq_governor *governor;
-       static atomic_t devfreq_no = ATOMIC_INIT(-1);
         int err = 0;
  
         if (!dev || !profile || !governor_name) {
@@ -800,8 +799,7 @@ struct devfreq *devfreq_add_device(struct device *dev,
         devfreq->suspend_freq = dev_pm_opp_get_suspend_opp_freq(dev);
         atomic_set(&devfreq->suspend_count, 0);
  
-       dev_set_name(&devfreq->dev, "devfreq%d",
-                               atomic_inc_return(&devfreq_no));
+       dev_set_name(&devfreq->dev, "%s", dev_name(dev));
         err = device_register(&devfreq->dev);
         if (err) {
                 mutex_unlock(&devfreq->lock);
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c

index 7243b88f81d889cd64b63d3ffc7991d05ab5e1a2..69e0d90460e6c4463046f29c76d1828d6dc3dde6 100644 (file)
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -505,16 +505,10 @@ void edac_mc_free(struct mem_ctl_info *mci)
  {
         edac_dbg(1, "\n");
  
-       /* If we're not yet registered with sysfs free only what was allocated
-        * in edac_mc_alloc().
-        */
-       if (!device_is_registered(&mci->dev)) {
-               _edac_mc_free(mci);
-               return;
-       }
+       if (device_is_registered(&mci->dev))
+               edac_unregister_sysfs(mci);
  
-       /* the mci instance is freed here, when the sysfs object is dropped */
-       edac_unregister_sysfs(mci);
+       _edac_mc_free(mci);
  }
  EXPORT_SYMBOL_GPL(edac_mc_free);
  
diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c

index 0367554e74374a4fbe11216985d1ec5733a04f32..c70ec0a306d8d47b312475894bc3b679780c3725 100644 (file)
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -276,10 +276,7 @@ static const struct attribute_group *csrow_attr_groups[] = {
  
  static void csrow_attr_release(struct device *dev)
  {
-       struct csrow_info *csrow = container_of(dev, struct csrow_info, dev);
-
-       edac_dbg(1, "device %s released\n", dev_name(dev));
-       kfree(csrow);
+       /* release device with _edac_mc_free() */
  }
  
  static const struct device_type csrow_attr_type = {
@@ -447,8 +444,7 @@ error:
                 csrow = mci->csrows[i];
                 if (!nr_pages_per_csrow(csrow))
                         continue;
-
-               device_del(&mci->csrows[i]->dev);
+               device_unregister(&mci->csrows[i]->dev);
         }
  
         return err;
@@ -608,10 +604,7 @@ static const struct attribute_group *dimm_attr_groups[] = {
  
  static void dimm_attr_release(struct device *dev)
  {
-       struct dimm_info *dimm = container_of(dev, struct dimm_info, dev);
-
-       edac_dbg(1, "device %s released\n", dev_name(dev));
-       kfree(dimm);
+       /* release device with _edac_mc_free() */
  }
  
  static const struct device_type dimm_attr_type = {
@@ -893,10 +886,7 @@ static const struct attribute_group *mci_attr_groups[] = {
  
  static void mci_attr_release(struct device *dev)
  {
-       struct mem_ctl_info *mci = container_of(dev, struct mem_ctl_info, dev);
-
-       edac_dbg(1, "device %s released\n", dev_name(dev));
-       kfree(mci);
+       /* release device with _edac_mc_free() */
  }
  
  static const struct device_type mci_attr_type = {
diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c

index 621220ab3d0e348fdff0eda1af30f499ed37d1ce..21ea99f651134be47161831266f00cb43fb85f56 100644 (file)
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -552,7 +552,7 @@ int __init efi_config_parse_tables(void *config_tables, int count, int sz,
  
                 seed = early_memremap(efi.rng_seed, sizeof(*seed));
                 if (seed != NULL) {
-                       size = seed->size;
+                       size = READ_ONCE(seed->size);
                         early_memunmap(seed, sizeof(*seed));
                 } else {
                         pr_err("Could not map UEFI random seed!\n");
@@ -562,7 +562,7 @@ int __init efi_config_parse_tables(void *config_tables, int count, int sz,
                                               sizeof(*seed) + size);
                         if (seed != NULL) {
                                 pr_notice("seeding entropy pool\n");
-                               add_bootloader_randomness(seed->bits, seed->size);
+                               add_bootloader_randomness(seed->bits, size);
                                 early_memunmap(seed, sizeof(*seed) + size);
                         } else {
                                 pr_err("Could not map UEFI random seed!\n");
diff --git a/drivers/fsi/Kconfig b/drivers/fsi/Kconfig

index 92ce6d85802cc0602146bd23f6a276245f9d2db1..4cc0e630ab79b0be31ff97962a6a9c45507399f7 100644 (file)
--- a/drivers/fsi/Kconfig
+++ b/drivers/fsi/Kconfig
@@ -55,6 +55,7 @@ config FSI_MASTER_AST_CF
  
  config FSI_MASTER_ASPEED
         tristate "FSI ASPEED master"
+       depends on HAS_IOMEM
         help
          This option enables a FSI master that is present behind an OPB bridge
          in the AST2600.
diff --git a/drivers/gpio/gpio-bd71828.c b/drivers/gpio/gpio-bd71828.c

index 04aade9e0a4d4484f7b0218cea2df963abeb7dec..3dbbc638e9a911bfa8119e8b615736ef243d7d80 100644 (file)
--- a/drivers/gpio/gpio-bd71828.c
+++ b/drivers/gpio/gpio-bd71828.c
@@ -10,16 +10,6 @@
  #define GPIO_OUT_REG(off) (BD71828_REG_GPIO_CTRL1 + (off))
  #define HALL_GPIO_OFFSET 3
  
-/*
- * These defines can be removed when
- * "gpio: Add definition for GPIO direction"
- * (9208b1e77d6e8e9776f34f46ef4079ecac9c3c25 in GPIO tree) gets merged,
- */
-#ifndef GPIO_LINE_DIRECTION_IN
-       #define GPIO_LINE_DIRECTION_IN 1
-       #define GPIO_LINE_DIRECTION_OUT 0
-#endif
-
  struct bd71828_gpio {
         struct rohm_regmap_dev chip;
         struct gpio_chip gpio;
diff --git a/drivers/gpio/gpio-sifive.c b/drivers/gpio/gpio-sifive.c

index 147a1bd0451521bf5e9c78b6d7d4cdd968d6559c..c54dd08f2cbfd3c75d74ffec723f4e0e5f72e4f7 100644 (file)
--- a/drivers/gpio/gpio-sifive.c
+++ b/drivers/gpio/gpio-sifive.c
@@ -35,7 +35,7 @@ struct sifive_gpio {
         void __iomem            *base;
         struct gpio_chip        gc;
         struct regmap           *regs;
-       u32                     irq_state;
+       unsigned long           irq_state;
         unsigned int            trigger[SIFIVE_GPIO_MAX];
         unsigned int            irq_parent[SIFIVE_GPIO_MAX];
  };
@@ -94,7 +94,7 @@ static void sifive_gpio_irq_enable(struct irq_data *d)
         spin_unlock_irqrestore(&gc->bgpio_lock, flags);
  
         /* Enable interrupts */
-       assign_bit(offset, (unsigned long *)&chip->irq_state, 1);
+       assign_bit(offset, &chip->irq_state, 1);
         sifive_gpio_set_ie(chip, offset);
  }
  
@@ -104,7 +104,7 @@ static void sifive_gpio_irq_disable(struct irq_data *d)
         struct sifive_gpio *chip = gpiochip_get_data(gc);
         int offset = irqd_to_hwirq(d) % SIFIVE_GPIO_MAX;
  
-       assign_bit(offset, (unsigned long *)&chip->irq_state, 0);
+       assign_bit(offset, &chip->irq_state, 0);
         sifive_gpio_set_ie(chip, offset);
         irq_chip_disable_parent(d);
  }
diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c

index a9748b5198e634f6a0938cc8bdff61f14a120cbc..67f9f82e0db0ef49a59f073bdb52b982411f29ac 100644 (file)
--- a/drivers/gpio/gpio-xilinx.c
+++ b/drivers/gpio/gpio-xilinx.c
@@ -147,9 +147,10 @@ static void xgpio_set_multiple(struct gpio_chip *gc, unsigned long *mask,
         for (i = 0; i < gc->ngpio; i++) {
                 if (*mask == 0)
                         break;
+               /* Once finished with an index write it out to the register */
                 if (index !=  xgpio_index(chip, i)) {
                         xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
-                                      xgpio_regoffset(chip, i),
+                                      index * XGPIO_CHANNEL_OFFSET,
                                        chip->gpio_state[index]);
                         spin_unlock_irqrestore(&chip->gpio_lock[index], flags);
                         index =  xgpio_index(chip, i);
@@ -165,7 +166,7 @@ static void xgpio_set_multiple(struct gpio_chip *gc, unsigned long *mask,
         }
  
         xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
-                      xgpio_regoffset(chip, i), chip->gpio_state[index]);
+                      index * XGPIO_CHANNEL_OFFSET, chip->gpio_state[index]);
  
         spin_unlock_irqrestore(&chip->gpio_lock[index], flags);
  }
diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c

index 753283486037435f33351e99c67c13e13fe04103..4d0106ceeba7bb24d3a21177356cacf6a82f69eb 100644 (file)
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -3035,13 +3035,33 @@ EXPORT_SYMBOL_GPL(gpiochip_free_own_desc);
   * rely on gpio_request() having been called beforehand.
   */
  
-static int gpio_set_config(struct gpio_chip *gc, unsigned int offset,
-                          enum pin_config_param mode)
+static int gpio_do_set_config(struct gpio_chip *gc, unsigned int offset,
+                             unsigned long config)
  {
         if (!gc->set_config)
                 return -ENOTSUPP;
  
-       return gc->set_config(gc, offset, mode);
+       return gc->set_config(gc, offset, config);
+}
+
+static int gpio_set_config(struct gpio_chip *gc, unsigned int offset,
+                          enum pin_config_param mode)
+{
+       unsigned long config;
+       unsigned arg;
+
+       switch (mode) {
+       case PIN_CONFIG_BIAS_PULL_DOWN:
+       case PIN_CONFIG_BIAS_PULL_UP:
+               arg = 1;
+               break;
+
+       default:
+               arg = 0;
+       }
+
+       config = PIN_CONF_PACKED(mode, arg);
+       return gpio_do_set_config(gc, offset, config);
  }
  
  static int gpio_set_bias(struct gpio_chip *chip, struct gpio_desc *desc)
@@ -3277,7 +3297,7 @@ int gpiod_set_debounce(struct gpio_desc *desc, unsigned debounce)
         chip = desc->gdev->chip;
  
         config = pinconf_to_config_packed(PIN_CONFIG_INPUT_DEBOUNCE, debounce);
-       return gpio_set_config(chip, gpio_chip_hwgpio(desc), config);
+       return gpio_do_set_config(chip, gpio_chip_hwgpio(desc), config);
  }
  EXPORT_SYMBOL_GPL(gpiod_set_debounce);
  
@@ -3311,7 +3331,7 @@ int gpiod_set_transitory(struct gpio_desc *desc, bool transitory)
         packed = pinconf_to_config_packed(PIN_CONFIG_PERSIST_STATE,
                                           !transitory);
         gpio = gpio_chip_hwgpio(desc);
-       rc = gpio_set_config(chip, gpio, packed);
+       rc = gpio_do_set_config(chip, gpio, packed);
         if (rc == -ENOTSUPP) {
                 dev_dbg(&desc->gdev->dev, "Persistence not supported for GPIO %d\n",
                                 gpio);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c

index 94e2fd758e0130ab0185820e1905917e68c439d0..42f4febe24c6db0d5642b23acd9aa46dd9ab8db5 100644 (file)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1389,7 +1389,7 @@ amdgpu_get_crtc_scanout_position(struct drm_device *dev, unsigned int pipe,
  
  static struct drm_driver kms_driver = {
         .driver_features =
-           DRIVER_USE_AGP | DRIVER_ATOMIC |
+           DRIVER_ATOMIC |
             DRIVER_GEM |
             DRIVER_RENDER | DRIVER_MODESET | DRIVER_SYNCOBJ |
             DRIVER_SYNCOBJ_TIMELINE,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h

index d3c27a3c43f68ad5ed11ff5b0379d8d06c648d13..7546da0cc70c7019c94f58fb0ee66debcdc93a22 100644 (file)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
@@ -195,6 +195,7 @@ struct amdgpu_gmc {
         uint32_t                srbm_soft_reset;
         bool                    prt_warning;
         uint64_t                stolen_size;
+       uint32_t                sdpif_register;
         /* apertures */
         u64                     shared_aperture_start;
         u64                     shared_aperture_end;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c

index 07914e34bc2570b356c8b751fb29f8f120e04e69..1311d6aec5d4b3c17e5b391659c02f1fd762d41b 100644 (file)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c
@@ -52,7 +52,7 @@ static int amdgpu_perf_event_init(struct perf_event *event)
                 return -ENOENT;
  
         /* update the hw_perf_event struct with config data */
-       hwc->conf = event->attr.config;
+       hwc->config = event->attr.config;
  
         return 0;
  }
@@ -74,9 +74,9 @@ static void amdgpu_perf_start(struct perf_event *event, int flags)
         switch (pe->pmu_perf_type) {
         case PERF_TYPE_AMDGPU_DF:
                 if (!(flags & PERF_EF_RELOAD))
-                       pe->adev->df.funcs->pmc_start(pe->adev, hwc->conf, 1);
+                       pe->adev->df.funcs->pmc_start(pe->adev, hwc->config, 1);
  
-               pe->adev->df.funcs->pmc_start(pe->adev, hwc->conf, 0);
+               pe->adev->df.funcs->pmc_start(pe->adev, hwc->config, 0);
                 break;
         default:
                 break;
@@ -101,7 +101,7 @@ static void amdgpu_perf_read(struct perf_event *event)
  
                 switch (pe->pmu_perf_type) {
                 case PERF_TYPE_AMDGPU_DF:
-                       pe->adev->df.funcs->pmc_get_count(pe->adev, hwc->conf,
+                       pe->adev->df.funcs->pmc_get_count(pe->adev, hwc->config,
                                                           &count);
                         break;
                 default:
@@ -126,7 +126,7 @@ static void amdgpu_perf_stop(struct perf_event *event, int flags)
  
         switch (pe->pmu_perf_type) {
         case PERF_TYPE_AMDGPU_DF:
-               pe->adev->df.funcs->pmc_stop(pe->adev, hwc->conf, 0);
+               pe->adev->df.funcs->pmc_stop(pe->adev, hwc->config, 0);
                 break;
         default:
                 break;
@@ -156,7 +156,8 @@ static int amdgpu_perf_add(struct perf_event *event, int flags)
  
         switch (pe->pmu_perf_type) {
         case PERF_TYPE_AMDGPU_DF:
-               retval = pe->adev->df.funcs->pmc_start(pe->adev, hwc->conf, 1);
+               retval = pe->adev->df.funcs->pmc_start(pe->adev,
+                                                      hwc->config, 1);
                 break;
         default:
                 return 0;
@@ -184,7 +185,7 @@ static void amdgpu_perf_del(struct perf_event *event, int flags)
  
         switch (pe->pmu_perf_type) {
         case PERF_TYPE_AMDGPU_DF:
-               pe->adev->df.funcs->pmc_stop(pe->adev, hwc->conf, 1);
+               pe->adev->df.funcs->pmc_stop(pe->adev, hwc->config, 1);
                 break;
         default:
                 break;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c

index 3a1570dafe3482ac93992c0e76e1777d185721d6..146f96661b6b5c8945ba985bc047e1ccbd71fe38 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -1013,6 +1013,30 @@ static int psp_dtm_initialize(struct psp_context *psp)
         return 0;
  }
  
+static int psp_dtm_unload(struct psp_context *psp)
+{
+       int ret;
+       struct psp_gfx_cmd_resp *cmd;
+
+       /*
+        * TODO: bypass the unloading in sriov for now
+        */
+       if (amdgpu_sriov_vf(psp->adev))
+               return 0;
+
+       cmd = kzalloc(sizeof(struct psp_gfx_cmd_resp), GFP_KERNEL);
+       if (!cmd)
+               return -ENOMEM;
+
+       psp_prep_ta_unload_cmd_buf(cmd, psp->dtm_context.session_id);
+
+       ret = psp_cmd_submit_buf(psp, NULL, cmd, psp->fence_buf_mc_addr);
+
+       kfree(cmd);
+
+       return ret;
+}
+
  int psp_dtm_invoke(struct psp_context *psp, uint32_t ta_cmd_id)
  {
         /*
@@ -1037,7 +1061,7 @@ static int psp_dtm_terminate(struct psp_context *psp)
         if (!psp->dtm_context.dtm_initialized)
                 return 0;
  
-       ret = psp_hdcp_unload(psp);
+       ret = psp_dtm_unload(psp);
         if (ret)
                 return ret;
  
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h

index d6deb0eb1e15a4a91f715b2cbeb20525894e3090..6fe057329de2b1237887dbbd3946cba653833806 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.h
@@ -179,6 +179,7 @@ struct amdgpu_vcn_inst {
         struct amdgpu_irq_src   irq;
         struct amdgpu_vcn_reg   external;
         struct amdgpu_bo        *dpg_sram_bo;
+       struct dpg_pause_state  pause_state;
         void                    *dpg_sram_cpu_addr;
         uint64_t                dpg_sram_gpu_addr;
         uint32_t                *dpg_sram_curr_addr;
@@ -190,8 +191,6 @@ struct amdgpu_vcn {
         const struct firmware   *fw;    /* VCN firmware */
         unsigned                num_enc_rings;
         enum amd_powergating_state cur_state;
-       struct dpg_pause_state pause_state;
-
         bool                    indirect_sram;
  
         uint8_t num_vcn_inst;
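Note on the pause_state move above: with a single pause_state shared by all VCN instances, a state change recorded for one instance could cause the "pause/unpause if state is changed" check to be skipped for another. The following is a minimal sketch of the per-instance bookkeeping, assuming a simplified model; every ex_* name and the reg_writes counter are hypothetical and are not part of amdgpu.

```c
#define EX_MAX_INST 2

/* Hypothetical, simplified stand-ins for the driver structures. */
struct ex_pause_state {
	int fw_based;
};

struct ex_vcn_inst {
	struct ex_pause_state pause_state; /* tracked per instance */
	int reg_writes;                    /* counts simulated register updates */
};

struct ex_vcn {
	struct ex_vcn_inst inst[EX_MAX_INST];
};

/* Touch the (simulated) hardware only when this instance's state changes. */
static void ex_pause_dpg_mode(struct ex_vcn *vcn, int inst_idx,
			      const struct ex_pause_state *new_state)
{
	struct ex_vcn_inst *inst = &vcn->inst[inst_idx];

	if (inst->pause_state.fw_based != new_state->fw_based) {
		inst->reg_writes++;
		inst->pause_state.fw_based = new_state->fw_based;
	}
}
```

In this model, pausing instance 0 and then instance 1 updates both; with one shared state field the second request would look like "no change".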
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c

index 1785fdad6ecbaa4631468e7202470708465c3d9c..22bbb36c768e2bbe5b49f4c0187368d223717559 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -3923,11 +3923,13 @@ static uint64_t gfx_v10_0_get_gpu_clock_counter(struct amdgpu_device *adev)
  {
         uint64_t clock;
  
+       amdgpu_gfx_off_ctrl(adev, false);
         mutex_lock(&adev->gfx.gpu_clock_mutex);
         WREG32_SOC15(GC, 0, mmRLC_CAPTURE_GPU_CLOCK_COUNT, 1);
         clock = (uint64_t)RREG32_SOC15(GC, 0, mmRLC_GPU_CLOCK_COUNT_LSB) |
                 ((uint64_t)RREG32_SOC15(GC, 0, mmRLC_GPU_CLOCK_COUNT_MSB) << 32ULL);
         mutex_unlock(&adev->gfx.gpu_clock_mutex);
+       amdgpu_gfx_off_ctrl(adev, true);
         return clock;
  }
  
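Note on the hunk above: the counter read is now bracketed by amdgpu_gfx_off_ctrl(adev, false) and amdgpu_gfx_off_ctrl(adev, true), presumably so that the GFX block stays powered while the RLC clock registers are accessed. A minimal sketch of that bracket and of the 64-bit assembly from two 32-bit halves, assuming a fake device in plain memory; struct fake_gpu, fake_gfx_off_ctrl and the counter-based hold are hypothetical and only model the idea.

```c
#include <stdint.h>

/* Hypothetical stand-in for the device; not an amdgpu structure. */
struct fake_gpu {
	int gfxoff_disable_count; /* > 0 means GFXOFF is currently held off */
	uint32_t clock_lsb;       /* models mmRLC_GPU_CLOCK_COUNT_LSB */
	uint32_t clock_msb;       /* models mmRLC_GPU_CLOCK_COUNT_MSB */
};

/* enable == 0 takes a hold on GFXOFF, enable != 0 releases it. */
static void fake_gfx_off_ctrl(struct fake_gpu *gpu, int enable)
{
	if (enable)
		gpu->gfxoff_disable_count--;
	else
		gpu->gfxoff_disable_count++;
}

static uint64_t fake_get_gpu_clock_counter(struct fake_gpu *gpu)
{
	uint64_t clock;

	fake_gfx_off_ctrl(gpu, 0);  /* keep GFX awake during the read */
	clock = (uint64_t)gpu->clock_lsb |
		((uint64_t)gpu->clock_msb << 32);
	fake_gfx_off_ctrl(gpu, 1);  /* allow GFXOFF again */
	return clock;
}
```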
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c

index 90f64b8bc3586adc4f271876fb8428100818cc70..3afdbbd6aaad9254d0fa9ead8a10ebf74f023ec2 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -1193,6 +1193,14 @@ static bool gfx_v9_0_should_disable_gfxoff(struct pci_dev *pdev)
         return false;
  }
  
+static bool is_raven_kicker(struct amdgpu_device *adev)
+{
+       if (adev->pm.fw_version >= 0x41e2b)
+               return true;
+       else
+               return false;
+}
+
  static void gfx_v9_0_check_if_need_gfxoff(struct amdgpu_device *adev)
  {
         if (gfx_v9_0_should_disable_gfxoff(adev->pdev))
@@ -1205,9 +1213,8 @@ static void gfx_v9_0_check_if_need_gfxoff(struct amdgpu_device *adev)
                 break;
         case CHIP_RAVEN:
                 if (!(adev->rev_id >= 0x8 || adev->pdev->device == 0x15d8) &&
-                   ((adev->gfx.rlc_fw_version != 106 &&
+                   ((!is_raven_kicker(adev) &&
                       adev->gfx.rlc_fw_version < 531) ||
-                    (adev->gfx.rlc_fw_version == 53815) ||
                      (adev->gfx.rlc_feature_version < 1) ||
                      !adev->gfx.rlc.is_rlc_v2_1))
                         adev->pm.pp_feature &= ~PP_GFXOFF_MASK;
@@ -3959,6 +3966,7 @@ static uint64_t gfx_v9_0_get_gpu_clock_counter(struct amdgpu_device *adev)
  {
         uint64_t clock;
  
+       amdgpu_gfx_off_ctrl(adev, false);
         mutex_lock(&adev->gfx.gpu_clock_mutex);
         if (adev->asic_type == CHIP_VEGA10 && amdgpu_sriov_runtime(adev)) {
                 uint32_t tmp, lsb, msb, i = 0;
@@ -3977,6 +3985,7 @@ static uint64_t gfx_v9_0_get_gpu_clock_counter(struct amdgpu_device *adev)
                         ((uint64_t)RREG32_SOC15(GC, 0, mmRLC_GPU_CLOCK_COUNT_MSB) << 32ULL);
         }
         mutex_unlock(&adev->gfx.gpu_clock_mutex);
+       amdgpu_gfx_off_ctrl(adev, true);
         return clock;
  }
  
@@ -4374,9 +4383,17 @@ static int gfx_v9_0_ecc_late_init(void *handle)
         struct amdgpu_device *adev = (struct amdgpu_device *)handle;
         int r;
  
-       r = gfx_v9_0_do_edc_gds_workarounds(adev);
-       if (r)
-               return r;
+       /*
+        * Temp workaround to fix the issue that CP firmware fails to
+        * update read pointer when CPDMA is writing clearing operation
+        * to GDS in suspend/resume sequence on several cards. So just
+        * limit this operation in cold boot sequence.
+        */
+       if (!adev->in_suspend) {
+               r = gfx_v9_0_do_edc_gds_workarounds(adev);
+               if (r)
+                       return r;
+       }
  
         /* requires IBs so do in late init after IB pool is initialized */
         r = gfx_v9_0_do_edc_gpr_workarounds(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c

index 90216abf14a4c356732a7950284d835f19fbc9de..cc0c273a86f9298b19e60dbf4e401d9fe440017e 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -1271,6 +1271,19 @@ static void gmc_v9_0_init_golden_registers(struct amdgpu_device *adev)
         }
  }
  
+/**
+ * gmc_v9_0_restore_registers - restores regs
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * This restores register values, saved at suspend.
+ */
+static void gmc_v9_0_restore_registers(struct amdgpu_device *adev)
+{
+       if (adev->asic_type == CHIP_RAVEN)
+               WREG32(mmDCHUBBUB_SDPIF_MMIO_CNTRL_0, adev->gmc.sdpif_register);
+}
+
  /**
   * gmc_v9_0_gart_enable - gart enable
   *
@@ -1376,6 +1389,20 @@ static int gmc_v9_0_hw_init(void *handle)
         return r;
  }
  
+/**
+ * gmc_v9_0_save_registers - saves regs
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * This saves potential register values that should be
+ * restored upon resume
+ */
+static void gmc_v9_0_save_registers(struct amdgpu_device *adev)
+{
+       if (adev->asic_type == CHIP_RAVEN)
+               adev->gmc.sdpif_register = RREG32(mmDCHUBBUB_SDPIF_MMIO_CNTRL_0);
+}
+
  /**
   * gmc_v9_0_gart_disable - gart disable
   *
@@ -1412,9 +1439,16 @@ static int gmc_v9_0_hw_fini(void *handle)
  
  static int gmc_v9_0_suspend(void *handle)
  {
+       int r;
         struct amdgpu_device *adev = (struct amdgpu_device *)handle;
  
-       return gmc_v9_0_hw_fini(adev);
+       r = gmc_v9_0_hw_fini(adev);
+       if (r)
+               return r;
+
+       gmc_v9_0_save_registers(adev);
+
+       return 0;
  }
  
  static int gmc_v9_0_resume(void *handle)
@@ -1422,6 +1456,7 @@ static int gmc_v9_0_resume(void *handle)
         int r;
         struct amdgpu_device *adev = (struct amdgpu_device *)handle;
  
+       gmc_v9_0_restore_registers(adev);
         r = gmc_v9_0_hw_init(adev);
         if (r)
                 return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c

index 15f3424a1ff792b299f83c71c335461f08177d10..2b488dfb2f21cfe192b733ab70da9b41d8d05d03 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
@@ -272,7 +272,12 @@ static u32 soc15_get_config_memsize(struct amdgpu_device *adev)
  
  static u32 soc15_get_xclk(struct amdgpu_device *adev)
  {
-       return adev->clock.spll.reference_freq;
+       u32 reference_clock = adev->clock.spll.reference_freq;
+
+       if (adev->asic_type == CHIP_RAVEN)
+               return reference_clock / 4;
+
+       return reference_clock;
  }
  
  
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c

index 1a24fadd30e2da76d1c2a9cd3760f50edfdeb1ef..71f61afdc6551d6c04f1d051ff7643fd51d6f2eb 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c
@@ -1207,9 +1207,10 @@ static int vcn_v1_0_pause_dpg_mode(struct amdgpu_device *adev,
         struct amdgpu_ring *ring;
  
         /* pause/unpause if state is changed */
-       if (adev->vcn.pause_state.fw_based != new_state->fw_based) {
+       if (adev->vcn.inst[inst_idx].pause_state.fw_based != new_state->fw_based) {
                 DRM_DEBUG("dpg pause state changed %d:%d -> %d:%d",
-                       adev->vcn.pause_state.fw_based, adev->vcn.pause_state.jpeg,
+                       adev->vcn.inst[inst_idx].pause_state.fw_based,
+                       adev->vcn.inst[inst_idx].pause_state.jpeg,
                         new_state->fw_based, new_state->jpeg);
  
                 reg_data = RREG32_SOC15(UVD, 0, mmUVD_DPG_PAUSE) &
@@ -1258,13 +1259,14 @@ static int vcn_v1_0_pause_dpg_mode(struct amdgpu_device *adev,
                         reg_data &= ~UVD_DPG_PAUSE__NJ_PAUSE_DPG_REQ_MASK;
                         WREG32_SOC15(UVD, 0, mmUVD_DPG_PAUSE, reg_data);
                 }
-               adev->vcn.pause_state.fw_based = new_state->fw_based;
+               adev->vcn.inst[inst_idx].pause_state.fw_based = new_state->fw_based;
         }
  
         /* pause/unpause if state is changed */
-       if (adev->vcn.pause_state.jpeg != new_state->jpeg) {
+       if (adev->vcn.inst[inst_idx].pause_state.jpeg != new_state->jpeg) {
                 DRM_DEBUG("dpg pause state changed %d:%d -> %d:%d",
-                       adev->vcn.pause_state.fw_based, adev->vcn.pause_state.jpeg,
+                       adev->vcn.inst[inst_idx].pause_state.fw_based,
+                       adev->vcn.inst[inst_idx].pause_state.jpeg,
                         new_state->fw_based, new_state->jpeg);
  
                 reg_data = RREG32_SOC15(UVD, 0, mmUVD_DPG_PAUSE) &
@@ -1318,7 +1320,7 @@ static int vcn_v1_0_pause_dpg_mode(struct amdgpu_device *adev,
                         reg_data &= ~UVD_DPG_PAUSE__JPEG_PAUSE_DPG_REQ_MASK;
                         WREG32_SOC15(UVD, 0, mmUVD_DPG_PAUSE, reg_data);
                 }
-               adev->vcn.pause_state.jpeg = new_state->jpeg;
+               adev->vcn.inst[inst_idx].pause_state.jpeg = new_state->jpeg;
         }
  
         return 0;
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c

index 4f7216788f11341860260374bb955b8c4933557f..c387c81f869583290ff0ca2f5933fb6e2b0edada 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c
@@ -1137,9 +1137,9 @@ static int vcn_v2_0_pause_dpg_mode(struct amdgpu_device *adev,
         int ret_code;
  
         /* pause/unpause if state is changed */
-       if (adev->vcn.pause_state.fw_based != new_state->fw_based) {
+       if (adev->vcn.inst[inst_idx].pause_state.fw_based != new_state->fw_based) {
                 DRM_DEBUG("dpg pause state changed %d -> %d",
-                       adev->vcn.pause_state.fw_based, new_state->fw_based);
+                       adev->vcn.inst[inst_idx].pause_state.fw_based,  new_state->fw_based);
                 reg_data = RREG32_SOC15(UVD, 0, mmUVD_DPG_PAUSE) &
                         (~UVD_DPG_PAUSE__NJ_PAUSE_DPG_ACK_MASK);
  
@@ -1185,7 +1185,7 @@ static int vcn_v2_0_pause_dpg_mode(struct amdgpu_device *adev,
                         reg_data &= ~UVD_DPG_PAUSE__NJ_PAUSE_DPG_REQ_MASK;
                         WREG32_SOC15(UVD, 0, mmUVD_DPG_PAUSE, reg_data);
                 }
-               adev->vcn.pause_state.fw_based = new_state->fw_based;
+               adev->vcn.inst[inst_idx].pause_state.fw_based = new_state->fw_based;
         }
  
         return 0;
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c

index 70fae7977f8f4695e4e7a3afc240402fc7569904..2d64ba1adf992a241a1e62548188e71aa328c345 100644
--- a/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
+++ b/drivers/gpu/drm/amd/amdgpu/vcn_v2_5.c
@@ -1367,9 +1367,9 @@ static int vcn_v2_5_pause_dpg_mode(struct amdgpu_device *adev,
         int ret_code;
  
         /* pause/unpause if state is changed */
-       if (adev->vcn.pause_state.fw_based != new_state->fw_based) {
+       if (adev->vcn.inst[inst_idx].pause_state.fw_based != new_state->fw_based) {
                 DRM_DEBUG("dpg pause state changed %d -> %d",
-                       adev->vcn.pause_state.fw_based, new_state->fw_based);
+                       adev->vcn.inst[inst_idx].pause_state.fw_based,  new_state->fw_based);
                 reg_data = RREG32_SOC15(UVD, inst_idx, mmUVD_DPG_PAUSE) &
                         (~UVD_DPG_PAUSE__NJ_PAUSE_DPG_ACK_MASK);
  
@@ -1407,14 +1407,14 @@ static int vcn_v2_5_pause_dpg_mode(struct amdgpu_device *adev,
                                            RREG32_SOC15(UVD, inst_idx, mmUVD_SCRATCH2) & 0x7FFFFFFF);
  
                                 SOC15_WAIT_ON_RREG(UVD, inst_idx, mmUVD_POWER_STATUS,
-                                          0x0, UVD_POWER_STATUS__UVD_POWER_STATUS_MASK, ret_code);
+                                          UVD_PGFSM_CONFIG__UVDM_UVDU_PWR_ON, UVD_POWER_STATUS__UVD_POWER_STATUS_MASK, ret_code);
                         }
                 } else {
                         /* unpause dpg, no need to wait */
                         reg_data &= ~UVD_DPG_PAUSE__NJ_PAUSE_DPG_REQ_MASK;
                         WREG32_SOC15(UVD, inst_idx, mmUVD_DPG_PAUSE, reg_data);
                 }
-               adev->vcn.pause_state.fw_based = new_state->fw_based;
+               adev->vcn.inst[inst_idx].pause_state.fw_based = new_state->fw_based;
         }
  
         return 0;
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c

index 279541517a99a3a733722f72c9fff84fd8d13b2b..e8f66fbf399e54c130bfe61a3148a7bc254839fc 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -1911,7 +1911,7 @@ static void handle_hpd_irq(void *param)
         mutex_lock(&aconnector->hpd_lock);
  
  #ifdef CONFIG_DRM_AMD_DC_HDCP
-       if (adev->asic_type >= CHIP_RAVEN)
+       if (adev->dm.hdcp_workqueue)
                 hdcp_reset_display(adev->dm.hdcp_workqueue, aconnector->dc_link->link_index);
  #endif
         if (aconnector->fake_enable)
@@ -2088,8 +2088,10 @@ static void handle_hpd_rx_irq(void *param)
                 }
         }
  #ifdef CONFIG_DRM_AMD_DC_HDCP
-       if (hpd_irq_data.bytes.device_service_irq.bits.CP_IRQ)
-               hdcp_handle_cpirq(adev->dm.hdcp_workqueue,  aconnector->base.index);
+           if (hpd_irq_data.bytes.device_service_irq.bits.CP_IRQ) {
+                   if (adev->dm.hdcp_workqueue)
+                           hdcp_handle_cpirq(adev->dm.hdcp_workqueue,  aconnector->base.index);
+           }
  #endif
         if ((dc_link->cur_link_settings.lane_count != LANE_COUNT_UNKNOWN) ||
             (dc_link->type == dc_connection_mst_branch))
@@ -5702,7 +5704,7 @@ void amdgpu_dm_connector_init_helper(struct amdgpu_display_manager *dm,
                 drm_connector_attach_vrr_capable_property(
                         &aconnector->base);
  #ifdef CONFIG_DRM_AMD_DC_HDCP
-               if (adev->asic_type >= CHIP_RAVEN)
+               if (adev->dm.hdcp_workqueue)
                         drm_connector_attach_content_protection_property(&aconnector->base, true);
  #endif
         }
@@ -8408,7 +8410,6 @@ bool amdgpu_dm_psr_enable(struct dc_stream_state *stream)
         /* Calculate number of static frames before generating interrupt to
          * enter PSR.
          */
-       unsigned int frame_time_microsec = 1000000 / vsync_rate_hz;
         // Init fail safe of 2 frames static
         unsigned int num_frames_static = 2;
  
@@ -8423,8 +8424,10 @@ bool amdgpu_dm_psr_enable(struct dc_stream_state *stream)
          * Calculate number of frames such that at least 30 ms of time has
          * passed.
          */
-       if (vsync_rate_hz != 0)
+       if (vsync_rate_hz != 0) {
+               unsigned int frame_time_microsec = 1000000 / vsync_rate_hz;
                 num_frames_static = (30000 / frame_time_microsec) + 1;
+       }
  
         params.triggers.cursor_update = true;
         params.triggers.overlay_update = true;
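Note on the amdgpu_dm_psr_enable() hunks above: the division 1000000 / vsync_rate_hz used to run unconditionally at declaration time, before the vsync_rate_hz != 0 check, so a zero refresh rate would divide by zero. The change moves the division inside the guard. A minimal standalone sketch of the resulting computation; the function name psr_num_frames_static is hypothetical, and the sketch assumes vsync_rate_hz never exceeds 1000000.

```c
/*
 * Number of static frames before entering PSR, so that at least
 * 30 ms have passed. Falls back to 2 frames when the rate is unknown.
 * Assumes vsync_rate_hz <= 1000000 so frame_time_microsec is nonzero.
 */
static unsigned int psr_num_frames_static(unsigned int vsync_rate_hz)
{
	unsigned int num_frames_static = 2; /* fail-safe default */

	if (vsync_rate_hz != 0) {
		unsigned int frame_time_microsec = 1000000 / vsync_rate_hz;

		num_frames_static = (30000 / frame_time_microsec) + 1;
	}

	return num_frames_static;
}
```

With integer arithmetic, 60 Hz gives a 16666 us frame time and 2 frames, while 144 Hz gives 6944 us and 5 frames.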
diff --git a/drivers/gpu/drm/amd/display/dc/bios/command_table2.c b/drivers/gpu/drm/amd/display/dc/bios/command_table2.c

index 629a07a2719b2db7e3c657c0ed07aa669e0b2b28..c4ba6e84db6511a81eda89531c1de7d000aa1bd1 100644
--- a/drivers/gpu/drm/amd/display/dc/bios/command_table2.c
+++ b/drivers/gpu/drm/amd/display/dc/bios/command_table2.c
@@ -711,10 +711,6 @@ static void enable_disp_power_gating_dmcub(
         power_gating.header.sub_type = DMUB_CMD__VBIOS_ENABLE_DISP_POWER_GATING;
         power_gating.power_gating.pwr = *pwr;
  
-       /* ATOM_ENABLE is old API in DMUB */
-       if (power_gating.power_gating.pwr.enable == ATOM_ENABLE)
-               power_gating.power_gating.pwr.enable = ATOM_INIT;
-
         dc_dmub_srv_cmd_queue(dmcub, &power_gating.header);
         dc_dmub_srv_cmd_execute(dmcub);
         dc_dmub_srv_wait_idle(dmcub);
diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile

index 3cd2831950919e82bad51ad38797c3b7fa930c7e..c0f6a8c7de7de82c9d455474cd39da58937e112d 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/Makefile
@@ -87,6 +87,12 @@ AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN20)
  ###############################################################################
  CLK_MGR_DCN21 = rn_clk_mgr.o rn_clk_mgr_vbios_smu.o
  
+# prevent build errors regarding soft-float vs hard-float FP ABI tags
+# this code is currently unused on ppc64, as it applies to Renoir APUs only
+ifdef CONFIG_PPC64
+CFLAGS_$(AMDDALPATH)/dc/clk_mgr/dcn21/rn_clk_mgr.o := $(call cc-option,-mno-gnu-attribute)
+endif
+
  AMD_DAL_CLK_MGR_DCN21 = $(addprefix $(AMDDALPATH)/dc/clk_mgr/dcn21/,$(CLK_MGR_DCN21))
  
  AMD_DISPLAY_FILES += $(AMD_DAL_CLK_MGR_DCN21)
diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn20/dcn20_clk_mgr.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn20/dcn20_clk_mgr.c

index 495f01e9f2cac4f3da8e243f4ca0f79dadfac495..49ce46b543eaf5227d0a407504e7ed1872832484 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn20/dcn20_clk_mgr.c
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn20/dcn20_clk_mgr.c
@@ -117,7 +117,7 @@ void dcn20_update_clocks_update_dpp_dto(struct clk_mgr_internal *clk_mgr,
  
                 prev_dppclk_khz = clk_mgr->base.ctx->dc->current_state->res_ctx.pipe_ctx[i].plane_res.bw.dppclk_khz;
  
-               if (safe_to_lower || prev_dppclk_khz < dppclk_khz) {
+               if ((prev_dppclk_khz > dppclk_khz && safe_to_lower) || prev_dppclk_khz < dppclk_khz) {
                         clk_mgr->dccg->funcs->update_dpp_dto(
                                                         clk_mgr->dccg, dpp_inst, dppclk_khz);
                 }
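Note on the condition change above: the old test reprogrammed the DPP DTO whenever safe_to_lower was set, even if the clock had not changed. The new test only acts when the clock is being raised, or lowered while lowering is safe. A minimal sketch comparing the two predicates, assuming plain integer clocks in kHz; the function names are hypothetical.

```c
#include <stdbool.h>

/* Predicate before the change. */
static bool dto_update_old(bool safe_to_lower, int prev_khz, int new_khz)
{
	return safe_to_lower || prev_khz < new_khz;
}

/* Predicate after the change: equal clocks never trigger an update. */
static bool dto_update_new(bool safe_to_lower, int prev_khz, int new_khz)
{
	return (prev_khz > new_khz && safe_to_lower) || prev_khz < new_khz;
}
```

The two predicates differ only when prev_khz == new_khz and safe_to_lower is true; raising the clock is always applied, and lowering still requires safe_to_lower.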
diff --git a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c

index 7ae4c06232dd2bf53fdb89a0c92f24e34d2fe9c2..9ef3f7b91a1d08aed184038dd6ff82e566bb6a6b 100644
--- a/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c
+++ b/drivers/gpu/drm/amd/display/dc/clk_mgr/dcn21/rn_clk_mgr.c
@@ -151,6 +151,12 @@ void rn_update_clocks(struct clk_mgr *clk_mgr_base,
                 rn_vbios_smu_set_min_deep_sleep_dcfclk(clk_mgr, clk_mgr_base->clks.dcfclk_deep_sleep_khz);
         }
  
+       // workaround: Limit dppclk to 100Mhz to avoid lower eDP panel switch to plus 4K monitor underflow.
+       if (!IS_DIAG_DC(dc->ctx->dce_environment)) {
+               if (new_clocks->dppclk_khz < 100000)
+                       new_clocks->dppclk_khz = 100000;
+       }
+
         if (should_set_clock(safe_to_lower, new_clocks->dppclk_khz, clk_mgr->base.clks.dppclk_khz)) {
                 if (clk_mgr->base.clks.dppclk_khz > new_clocks->dppclk_khz)
                         dpp_clock_lowered = true;
@@ -412,19 +418,19 @@ void build_watermark_ranges(struct clk_bw_params *bw_params, struct pp_smu_wm_ra
  
                 ranges->reader_wm_sets[num_valid_sets].wm_inst = bw_params->wm_table.entries[i].wm_inst;
                 ranges->reader_wm_sets[num_valid_sets].wm_type = bw_params->wm_table.entries[i].wm_type;
-               /* We will not select WM based on dcfclk, so leave it as unconstrained */
-               ranges->reader_wm_sets[num_valid_sets].min_drain_clk_mhz = PP_SMU_WM_SET_RANGE_CLK_UNCONSTRAINED_MIN;
-               ranges->reader_wm_sets[num_valid_sets].max_drain_clk_mhz = PP_SMU_WM_SET_RANGE_CLK_UNCONSTRAINED_MAX;
-               /* fclk wil be used to select WM*/
+               /* We will not select WM based on fclk, so leave it as unconstrained */
+               ranges->reader_wm_sets[num_valid_sets].min_fill_clk_mhz = PP_SMU_WM_SET_RANGE_CLK_UNCONSTRAINED_MIN;
+               ranges->reader_wm_sets[num_valid_sets].max_fill_clk_mhz = PP_SMU_WM_SET_RANGE_CLK_UNCONSTRAINED_MAX;
+               /* dcfclk wil be used to select WM*/
  
                 if (ranges->reader_wm_sets[num_valid_sets].wm_type == WM_TYPE_PSTATE_CHG) {
                         if (i == 0)
-                               ranges->reader_wm_sets[num_valid_sets].min_fill_clk_mhz = 0;
+                               ranges->reader_wm_sets[num_valid_sets].min_drain_clk_mhz = 0;
                         else {
                                 /* add 1 to make it non-overlapping with next lvl */
-                               ranges->reader_wm_sets[num_valid_sets].min_fill_clk_mhz = bw_params->clk_table.entries[i - 1].fclk_mhz + 1;
+                               ranges->reader_wm_sets[num_valid_sets].min_drain_clk_mhz = bw_params->clk_table.entries[i - 1].dcfclk_mhz + 1;
                         }
-                       ranges->reader_wm_sets[num_valid_sets].max_fill_clk_mhz = bw_params->clk_table.entries[i].fclk_mhz;
+                       ranges->reader_wm_sets[num_valid_sets].max_drain_clk_mhz = bw_params->clk_table.entries[i].dcfclk_mhz;
  
                 } else {
                         /* unconstrained for memory retraining */
diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_aux.c b/drivers/gpu/drm/amd/display/dc/dce/dce_aux.c

index f1a5d2c6aa37874577fa6e804cad2c5a43b8c6b8..68c4049cbc2adaedd986f91e8951b8ad491208f3 100644
--- a/drivers/gpu/drm/amd/display/dc/dce/dce_aux.c
+++ b/drivers/gpu/drm/amd/display/dc/dce/dce_aux.c
@@ -400,7 +400,7 @@ static bool acquire(
  {
         enum gpio_result result;
  
-       if (!is_engine_available(engine))
+       if ((engine == NULL) || !is_engine_available(engine))
                 return false;
  
         result = dal_ddc_open(ddc, GPIO_MODE_HARDWARE,
diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c

index cfbbaffa865475933d069f647be2df3bb02b322a..a444fed94184919c51bded44ebb4eaafd3db117b 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c
@@ -572,7 +572,6 @@ void dcn20_plane_atomic_disable(struct dc *dc, struct pipe_ctx *pipe_ctx)
         dpp->funcs->dpp_dppclk_control(dpp, false, false);
  
         hubp->power_gated = true;
-       dc->optimized_required = false; /* We're powering off, no need to optimize */
  
         hws->funcs.plane_atomic_power_down(dc,
                         pipe_ctx->plane_res.dpp,
diff --git a/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c b/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c

index 0d506d30d6b6f455a197a053d693f38c060ec782..33d0a176841a57479db14b17c61d583c6e1161e5 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c
@@ -60,6 +60,7 @@
  #include "dcn20/dcn20_dccg.h"
  #include "dcn21_hubbub.h"
  #include "dcn10/dcn10_resource.h"
+#include "dce110/dce110_resource.h"
  
  #include "dcn20/dcn20_dwb.h"
  #include "dcn20/dcn20_mmhubbub.h"
@@ -856,6 +857,7 @@ static const struct dc_debug_options debug_defaults_diags = {
  enum dcn20_clk_src_array_id {
         DCN20_CLK_SRC_PLL0,
         DCN20_CLK_SRC_PLL1,
+       DCN20_CLK_SRC_PLL2,
         DCN20_CLK_SRC_TOTAL_DCN21
  };
  
@@ -1718,6 +1720,10 @@ static bool dcn21_resource_construct(
                         dcn21_clock_source_create(ctx, ctx->dc_bios,
                                 CLOCK_SOURCE_COMBO_PHY_PLL1,
                                 &clk_src_regs[1], false);
+       pool->base.clock_sources[DCN20_CLK_SRC_PLL2] =
+                       dcn21_clock_source_create(ctx, ctx->dc_bios,
+                               CLOCK_SOURCE_COMBO_PHY_PLL2,
+                               &clk_src_regs[2], false);
  
         pool->base.clk_src_count = DCN20_CLK_SRC_TOTAL_DCN21;
  
diff --git a/drivers/gpu/drm/amd/display/modules/hdcp/hdcp2_execution.c b/drivers/gpu/drm/amd/display/modules/hdcp/hdcp2_execution.c

index f730b94ac3c0633d584524296a1ca403028990b8..55246711700ba721f4ce02141fde6c1dae6cdefd 100644
--- a/drivers/gpu/drm/amd/display/modules/hdcp/hdcp2_execution.c
+++ b/drivers/gpu/drm/amd/display/modules/hdcp/hdcp2_execution.c
@@ -46,8 +46,8 @@ static inline enum mod_hdcp_status check_hdcp2_capable(struct mod_hdcp *hdcp)
         enum mod_hdcp_status status;
  
         if (is_dp_hdcp(hdcp))
-               status = (hdcp->auth.msg.hdcp2.rxcaps_dp[2] & HDCP_2_2_RX_CAPS_VERSION_VAL) &&
-                               HDCP_2_2_DP_HDCP_CAPABLE(hdcp->auth.msg.hdcp2.rxcaps_dp[0]) ?
+               status = (hdcp->auth.msg.hdcp2.rxcaps_dp[0] == HDCP_2_2_RX_CAPS_VERSION_VAL) &&
+                               HDCP_2_2_DP_HDCP_CAPABLE(hdcp->auth.msg.hdcp2.rxcaps_dp[2]) ?
                                 MOD_HDCP_STATUS_SUCCESS :
                                 MOD_HDCP_STATUS_HDCP2_NOT_CAPABLE;
         else
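Note on the check_hdcp2_capable() change above: the DP branch now compares byte 0 of the RxCaps array against the version value with == and tests the capability bit in byte 2, where the old code masked byte 2 with the version value and tested byte 0. A minimal sketch of the corrected check, assuming a 3-byte RxCaps layout (version in byte 0, capability bits in byte 2); the EX_* macros are hypothetical stand-ins, and their values (0x02 and bit 1) are assumptions about the kernel's HDCP_2_2_RX_CAPS_VERSION_VAL and HDCP_2_2_DP_HDCP_CAPABLE definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values; stand-ins for the kernel's HDCP 2.2 macros. */
#define EX_RX_CAPS_VERSION_VAL 0x02
#define EX_DP_HDCP_CAPABLE(x)  ((x) & (1u << 1))

/* True when the sink reports the expected version and the capable bit. */
static bool ex_dp_hdcp2_capable(const uint8_t rxcaps[3])
{
	return rxcaps[0] == EX_RX_CAPS_VERSION_VAL &&
	       EX_DP_HDCP_CAPABLE(rxcaps[2]);
}
```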
diff --git a/drivers/gpu/drm/amd/include/asic_reg/dce/dce_12_0_offset.h b/drivers/gpu/drm/amd/include/asic_reg/dce/dce_12_0_offset.h

index b6f74bf4af023fd53ee79ce44ecb29cf9f8cb43b..27bb8c1ab85876bed4ad23bfc6df7648eaa8a851 100644
--- a/drivers/gpu/drm/amd/include/asic_reg/dce/dce_12_0_offset.h
+++ b/drivers/gpu/drm/amd/include/asic_reg/dce/dce_12_0_offset.h
@@ -7376,6 +7376,8 @@
  #define mmCRTC4_CRTC_DRR_CONTROL                                                                       0x0f3e
  #define mmCRTC4_CRTC_DRR_CONTROL_BASE_IDX                                                              2
  
+#define mmDCHUBBUB_SDPIF_MMIO_CNTRL_0                                                                  0x395d
+#define mmDCHUBBUB_SDPIF_MMIO_CNTRL_0_BASE_IDX                                                         2
  
  // addressBlock: dce_dc_fmt4_dispdec
  // base address: 0x2000
diff --git a/drivers/gpu/drm/amd/powerplay/inc/smu_v11_0_pptable.h b/drivers/gpu/drm/amd/powerplay/inc/smu_v11_0_pptable.h

index b2f96a10112465f283add4d068ccab8fa630548a..7a63cf8e85ed9419fb96d10452a8280b164044d1 100644
--- a/drivers/gpu/drm/amd/powerplay/inc/smu_v11_0_pptable.h
+++ b/drivers/gpu/drm/amd/powerplay/inc/smu_v11_0_pptable.h
@@ -39,21 +39,39 @@
  #define SMU_11_0_PP_OVERDRIVE_VERSION                   0x0800
  #define SMU_11_0_PP_POWERSAVINGCLOCK_VERSION            0x0100
  
+enum SMU_11_0_ODFEATURE_CAP {
+    SMU_11_0_ODCAP_GFXCLK_LIMITS = 0,
+    SMU_11_0_ODCAP_GFXCLK_CURVE,
+    SMU_11_0_ODCAP_UCLK_MAX,
+    SMU_11_0_ODCAP_POWER_LIMIT,
+    SMU_11_0_ODCAP_FAN_ACOUSTIC_LIMIT,
+    SMU_11_0_ODCAP_FAN_SPEED_MIN,
+    SMU_11_0_ODCAP_TEMPERATURE_FAN,
+    SMU_11_0_ODCAP_TEMPERATURE_SYSTEM,
+    SMU_11_0_ODCAP_MEMORY_TIMING_TUNE,
+    SMU_11_0_ODCAP_FAN_ZERO_RPM_CONTROL,
+    SMU_11_0_ODCAP_AUTO_UV_ENGINE,
+    SMU_11_0_ODCAP_AUTO_OC_ENGINE,
+    SMU_11_0_ODCAP_AUTO_OC_MEMORY,
+    SMU_11_0_ODCAP_FAN_CURVE,
+    SMU_11_0_ODCAP_COUNT,
+};
+
  enum SMU_11_0_ODFEATURE_ID {
-    SMU_11_0_ODFEATURE_GFXCLK_LIMITS        = 1 << 0,         //GFXCLK Limit feature
-    SMU_11_0_ODFEATURE_GFXCLK_CURVE         = 1 << 1,         //GFXCLK Curve feature
-    SMU_11_0_ODFEATURE_UCLK_MAX             = 1 << 2,         //UCLK Limit feature
-    SMU_11_0_ODFEATURE_POWER_LIMIT          = 1 << 3,         //Power Limit feature
-    SMU_11_0_ODFEATURE_FAN_ACOUSTIC_LIMIT   = 1 << 4,         //Fan Acoustic RPM feature
-    SMU_11_0_ODFEATURE_FAN_SPEED_MIN        = 1 << 5,         //Minimum Fan Speed feature
-    SMU_11_0_ODFEATURE_TEMPERATURE_FAN      = 1 << 6,         //Fan Target Temperature Limit feature
-    SMU_11_0_ODFEATURE_TEMPERATURE_SYSTEM   = 1 << 7,         //Operating Temperature Limit feature
-    SMU_11_0_ODFEATURE_MEMORY_TIMING_TUNE   = 1 << 8,         //AC Timing Tuning feature
-    SMU_11_0_ODFEATURE_FAN_ZERO_RPM_CONTROL = 1 << 9,         //Zero RPM feature
-    SMU_11_0_ODFEATURE_AUTO_UV_ENGINE       = 1 << 10,        //Auto Under Volt GFXCLK feature
-    SMU_11_0_ODFEATURE_AUTO_OC_ENGINE       = 1 << 11,        //Auto Over Clock GFXCLK feature
-    SMU_11_0_ODFEATURE_AUTO_OC_MEMORY       = 1 << 12,        //Auto Over Clock MCLK feature
-    SMU_11_0_ODFEATURE_FAN_CURVE            = 1 << 13,        //VICTOR TODO
+    SMU_11_0_ODFEATURE_GFXCLK_LIMITS        = 1 << SMU_11_0_ODCAP_GFXCLK_LIMITS,            //GFXCLK Limit feature
+    SMU_11_0_ODFEATURE_GFXCLK_CURVE         = 1 << SMU_11_0_ODCAP_GFXCLK_CURVE,             //GFXCLK Curve feature
+    SMU_11_0_ODFEATURE_UCLK_MAX             = 1 << SMU_11_0_ODCAP_UCLK_MAX,                 //UCLK Limit feature
+    SMU_11_0_ODFEATURE_POWER_LIMIT          = 1 << SMU_11_0_ODCAP_POWER_LIMIT,              //Power Limit feature
+    SMU_11_0_ODFEATURE_FAN_ACOUSTIC_LIMIT   = 1 << SMU_11_0_ODCAP_FAN_ACOUSTIC_LIMIT,       //Fan Acoustic RPM feature
+    SMU_11_0_ODFEATURE_FAN_SPEED_MIN        = 1 << SMU_11_0_ODCAP_FAN_SPEED_MIN,            //Minimum Fan Speed feature
+    SMU_11_0_ODFEATURE_TEMPERATURE_FAN      = 1 << SMU_11_0_ODCAP_TEMPERATURE_FAN,          //Fan Target Temperature Limit feature
+    SMU_11_0_ODFEATURE_TEMPERATURE_SYSTEM   = 1 << SMU_11_0_ODCAP_TEMPERATURE_SYSTEM,       //Operating Temperature Limit feature
+    SMU_11_0_ODFEATURE_MEMORY_TIMING_TUNE   = 1 << SMU_11_0_ODCAP_MEMORY_TIMING_TUNE,       //AC Timing Tuning feature
+    SMU_11_0_ODFEATURE_FAN_ZERO_RPM_CONTROL = 1 << SMU_11_0_ODCAP_FAN_ZERO_RPM_CONTROL,     //Zero RPM feature
+    SMU_11_0_ODFEATURE_AUTO_UV_ENGINE       = 1 << SMU_11_0_ODCAP_AUTO_UV_ENGINE,           //Auto Under Volt GFXCLK feature
+    SMU_11_0_ODFEATURE_AUTO_OC_ENGINE       = 1 << SMU_11_0_ODCAP_AUTO_OC_ENGINE,           //Auto Over Clock GFXCLK feature
+    SMU_11_0_ODFEATURE_AUTO_OC_MEMORY       = 1 << SMU_11_0_ODCAP_AUTO_OC_MEMORY,           //Auto Over Clock MCLK feature
+    SMU_11_0_ODFEATURE_FAN_CURVE            = 1 << SMU_11_0_ODCAP_FAN_CURVE,                //Fan Curve feature
      SMU_11_0_ODFEATURE_COUNT                = 14,
  };
  #define SMU_11_0_MAX_ODFEATURE    32          //Maximum Number of OD Features
diff --git a/drivers/gpu/drm/amd/powerplay/navi10_ppt.c b/drivers/gpu/drm/amd/powerplay/navi10_ppt.c
index 19a9846b730e1b5e05769b2d4b3a3283620c0189..0d73a49166af3f12c6491d2982a81689716077da 100644
--- a/drivers/gpu/drm/amd/powerplay/navi10_ppt.c
+++ b/drivers/gpu/drm/amd/powerplay/navi10_ppt.c
@@ -736,9 +736,9 @@ static bool navi10_is_support_fine_grained_dpm(struct smu_context *smu, enum smu
         return dpm_desc->SnapToDiscrete == 0 ? true : false;
  }
  
-static inline bool navi10_od_feature_is_supported(struct smu_11_0_overdrive_table *od_table, enum SMU_11_0_ODFEATURE_ID feature)
+static inline bool navi10_od_feature_is_supported(struct smu_11_0_overdrive_table *od_table, enum SMU_11_0_ODFEATURE_CAP cap)
  {
-       return od_table->cap[feature];
+       return od_table->cap[cap];
  }
  
  static void navi10_od_setting_get_range(struct smu_11_0_overdrive_table *od_table,
@@ -846,7 +846,7 @@ static int navi10_print_clk_levels(struct smu_context *smu,
         case SMU_OD_SCLK:
                 if (!smu->od_enabled || !od_table || !od_settings)
                         break;
-               if (!navi10_od_feature_is_supported(od_settings, SMU_11_0_ODFEATURE_GFXCLK_LIMITS))
+               if (!navi10_od_feature_is_supported(od_settings, SMU_11_0_ODCAP_GFXCLK_LIMITS))
                         break;
                 size += sprintf(buf + size, "OD_SCLK:\n");
                 size += sprintf(buf + size, "0: %uMhz\n1: %uMhz\n", od_table->GfxclkFmin, od_table->GfxclkFmax);
@@ -854,7 +854,7 @@ static int navi10_print_clk_levels(struct smu_context *smu,
         case SMU_OD_MCLK:
                 if (!smu->od_enabled || !od_table || !od_settings)
                         break;
-               if (!navi10_od_feature_is_supported(od_settings, SMU_11_0_ODFEATURE_UCLK_MAX))
+               if (!navi10_od_feature_is_supported(od_settings, SMU_11_0_ODCAP_UCLK_MAX))
                         break;
                 size += sprintf(buf + size, "OD_MCLK:\n");
                 size += sprintf(buf + size, "1: %uMHz\n", od_table->UclkFmax);
@@ -862,7 +862,7 @@ static int navi10_print_clk_levels(struct smu_context *smu,
         case SMU_OD_VDDC_CURVE:
                 if (!smu->od_enabled || !od_table || !od_settings)
                         break;
-               if (!navi10_od_feature_is_supported(od_settings, SMU_11_0_ODFEATURE_GFXCLK_CURVE))
+               if (!navi10_od_feature_is_supported(od_settings, SMU_11_0_ODCAP_GFXCLK_CURVE))
                         break;
                 size += sprintf(buf + size, "OD_VDDC_CURVE:\n");
                 for (i = 0; i < 3; i++) {
@@ -887,7 +887,7 @@ static int navi10_print_clk_levels(struct smu_context *smu,
                         break;
                 size = sprintf(buf, "%s:\n", "OD_RANGE");
  
-               if (navi10_od_feature_is_supported(od_settings, SMU_11_0_ODFEATURE_GFXCLK_LIMITS)) {
+               if (navi10_od_feature_is_supported(od_settings, SMU_11_0_ODCAP_GFXCLK_LIMITS)) {
                         navi10_od_setting_get_range(od_settings, SMU_11_0_ODSETTING_GFXCLKFMIN,
                                                     &min_value, NULL);
                         navi10_od_setting_get_range(od_settings, SMU_11_0_ODSETTING_GFXCLKFMAX,
@@ -896,14 +896,14 @@ static int navi10_print_clk_levels(struct smu_context *smu,
                                         min_value, max_value);
                 }
  
-               if (navi10_od_feature_is_supported(od_settings, SMU_11_0_ODFEATURE_UCLK_MAX)) {
+               if (navi10_od_feature_is_supported(od_settings, SMU_11_0_ODCAP_UCLK_MAX)) {
                         navi10_od_setting_get_range(od_settings, SMU_11_0_ODSETTING_UCLKFMAX,
                                                     &min_value, &max_value);
                         size += sprintf(buf + size, "MCLK: %7uMhz %10uMhz\n",
                                         min_value, max_value);
                 }
  
-               if (navi10_od_feature_is_supported(od_settings, SMU_11_0_ODFEATURE_GFXCLK_CURVE)) {
+               if (navi10_od_feature_is_supported(od_settings, SMU_11_0_ODCAP_GFXCLK_CURVE)) {
                         navi10_od_setting_get_range(od_settings, SMU_11_0_ODSETTING_VDDGFXCURVEFREQ_P1,
                                                     &min_value, &max_value);
                         size += sprintf(buf + size, "VDDC_CURVE_SCLK[0]: %7uMhz %10uMhz\n",
@@ -2056,7 +2056,7 @@ static int navi10_od_edit_dpm_table(struct smu_context *smu, enum PP_OD_DPM_TABL
  
         switch (type) {
         case PP_OD_EDIT_SCLK_VDDC_TABLE:
-               if (!navi10_od_feature_is_supported(od_settings, SMU_11_0_ODFEATURE_GFXCLK_LIMITS)) {
+               if (!navi10_od_feature_is_supported(od_settings, SMU_11_0_ODCAP_GFXCLK_LIMITS)) {
                         pr_warn("GFXCLK_LIMITS not supported!\n");
                         return -ENOTSUPP;
                 }
@@ -2102,7 +2102,7 @@ static int navi10_od_edit_dpm_table(struct smu_context *smu, enum PP_OD_DPM_TABL
                 }
                 break;
         case PP_OD_EDIT_MCLK_VDDC_TABLE:
-               if (!navi10_od_feature_is_supported(od_settings, SMU_11_0_ODFEATURE_UCLK_MAX)) {
+               if (!navi10_od_feature_is_supported(od_settings, SMU_11_0_ODCAP_UCLK_MAX)) {
                         pr_warn("UCLK_MAX not supported!\n");
                         return -ENOTSUPP;
                 }
@@ -2143,7 +2143,7 @@ static int navi10_od_edit_dpm_table(struct smu_context *smu, enum PP_OD_DPM_TABL
                 }
                 break;
         case PP_OD_EDIT_VDDC_CURVE:
-               if (!navi10_od_feature_is_supported(od_settings, SMU_11_0_ODFEATURE_GFXCLK_CURVE)) {
+               if (!navi10_od_feature_is_supported(od_settings, SMU_11_0_ODCAP_GFXCLK_CURVE)) {
                         pr_warn("GFXCLK_CURVE not supported!\n");
                         return -ENOTSUPP;
                 }
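
The navi10 change above relies on the difference between a capability index and a feature bit flag: `cap[]` is indexed by position, so passing a `1 << n` flag as the index reads the wrong slot. A minimal standalone sketch of that distinction follows. The `demo_` names and the trimmed table layout are hypothetical and only mirror the shape of the enums in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down mirror of the two enums in the diff. */
enum demo_odcap {
	DEMO_ODCAP_GFXCLK_LIMITS = 0,
	DEMO_ODCAP_GFXCLK_CURVE,
	DEMO_ODCAP_UCLK_MAX,
	DEMO_ODCAP_COUNT,
};

enum demo_odfeature {
	DEMO_ODFEATURE_GFXCLK_LIMITS = 1 << DEMO_ODCAP_GFXCLK_LIMITS,
	DEMO_ODFEATURE_GFXCLK_CURVE  = 1 << DEMO_ODCAP_GFXCLK_CURVE,
	DEMO_ODFEATURE_UCLK_MAX      = 1 << DEMO_ODCAP_UCLK_MAX,
};

#define DEMO_MAX_ODFEATURE 32

/* Every capability index must fit inside the table. */
static_assert(DEMO_ODCAP_COUNT <= DEMO_MAX_ODFEATURE,
	      "capability indices must fit in cap[]");

struct demo_od_table {
	uint8_t cap[DEMO_MAX_ODFEATURE];	/* one byte per capability index */
};

/* Correct lookup: the array is indexed by capability position. */
static inline bool demo_cap_is_supported(const struct demo_od_table *t,
					 enum demo_odcap cap)
{
	return t->cap[cap] != 0;
}

/* The pre-fix pattern: a bit flag (1 << n) used as if it were an index. */
static inline bool demo_flag_used_as_index(const struct demo_od_table *t,
					   enum demo_odfeature feature)
{
	return t->cap[feature] != 0;
}
```

In this sketch, if only the `UCLK_MAX` slot is set, the flag-based lookup for `UCLK_MAX` reads `cap[4]` and reports it as unsupported, while the flag-based lookup for `GFXCLK_CURVE` reads `cap[2]` and wrongly reports it as supported.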
diff --git a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
index 0dc49479a7ebdabe0650d06b29db032c91843845..c9e5ce135fd42bd417fa64b1fe43bed8fee6cf45 100644
--- a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
+++ b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
@@ -898,6 +898,9 @@ int smu_v11_0_system_features_control(struct smu_context *smu,
         if (ret)
                 return ret;
  
+       bitmap_zero(feature->enabled, feature->feature_num);
+       bitmap_zero(feature->supported, feature->feature_num);
+
         if (en) {
                 ret = smu_feature_get_enabled_mask(smu, feature_mask, 2);
                 if (ret)
@@ -907,9 +910,6 @@ int smu_v11_0_system_features_control(struct smu_context *smu,
                             feature->feature_num);
                 bitmap_copy(feature->supported, (unsigned long *)&feature_mask,
                             feature->feature_num);
-       } else {
-               bitmap_zero(feature->enabled, feature->feature_num);
-               bitmap_zero(feature->supported, feature->feature_num);
         }
  
         return ret;
@@ -978,8 +978,12 @@ int smu_v11_0_init_max_sustainable_clocks(struct smu_context *smu)
         struct smu_11_0_max_sustainable_clocks *max_sustainable_clocks;
         int ret = 0;
  
-       max_sustainable_clocks = kzalloc(sizeof(struct smu_11_0_max_sustainable_clocks),
+       if (!smu->smu_table.max_sustainable_clocks)
+               max_sustainable_clocks = kzalloc(sizeof(struct smu_11_0_max_sustainable_clocks),
                                          GFP_KERNEL);
+       else
+               max_sustainable_clocks = smu->smu_table.max_sustainable_clocks;
+
         smu->smu_table.max_sustainable_clocks = (void *)max_sustainable_clocks;
  
         max_sustainable_clocks->uclock = smu->smu_table.boot_values.uclk / 100;
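
The last hunk above reuses an existing `max_sustainable_clocks` buffer instead of allocating a fresh one on every call, so a repeated init no longer drops the previous allocation. A userspace sketch of that allocate-once pattern is below. The `demo_` types are hypothetical, `calloc` stands in for `kzalloc`, and the NULL check on the allocation is an addition of this sketch that the hunk itself does not show.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver structures. */
struct demo_clocks {
	unsigned int uclock;
};

struct demo_table {
	struct demo_clocks *max_sustainable_clocks;
	unsigned int boot_uclk;
};

/* Allocate the clocks buffer on first use, reuse it on later calls. */
static inline int demo_init_clocks(struct demo_table *t)
{
	struct demo_clocks *c = t->max_sustainable_clocks;

	assert(t != NULL);

	if (!c) {
		c = calloc(1, sizeof(*c));
		if (!c)
			return -ENOMEM;	/* assumption: not part of the hunk */
		t->max_sustainable_clocks = c;
	}

	/* Refresh the contents in place, as the init path does. */
	c->uclock = t->boot_uclk / 100;
	return 0;
}
```

Calling `demo_init_clocks()` twice on the same table leaves the pointer unchanged and only refreshes the stored value.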
diff --git a/drivers/gpu/drm/bridge/tc358767.c b/drivers/gpu/drm/bridge/tc358767.c
index 3709e5ace7246086b3b4d172bdd5a7218cbf6306..fbdb42d4e772ecf87666fe28191559f06afd8cf1 100644
--- a/drivers/gpu/drm/bridge/tc358767.c
+++ b/drivers/gpu/drm/bridge/tc358767.c
@@ -297,7 +297,7 @@ static inline int tc_poll_timeout(struct tc_data *tc, unsigned int addr,
  
  static int tc_aux_wait_busy(struct tc_data *tc)
  {
-       return tc_poll_timeout(tc, DP0_AUXSTATUS, AUX_BUSY, 0, 1000, 100000);
+       return tc_poll_timeout(tc, DP0_AUXSTATUS, AUX_BUSY, 0, 100, 100000);
  }
  
  static int tc_aux_write_data(struct tc_data *tc, const void *data,
@@ -640,7 +640,7 @@ static int tc_aux_link_setup(struct tc_data *tc)
         if (ret)
                 goto err;
  
-       ret = tc_poll_timeout(tc, DP_PHY_CTRL, PHY_RDY, PHY_RDY, 1, 1000);
+       ret = tc_poll_timeout(tc, DP_PHY_CTRL, PHY_RDY, PHY_RDY, 100, 100000);
         if (ret == -ETIMEDOUT) {
                 dev_err(tc->dev, "Timeout waiting for PHY to become ready");
                 return ret;
@@ -876,7 +876,7 @@ static int tc_wait_link_training(struct tc_data *tc)
         int ret;
  
         ret = tc_poll_timeout(tc, DP0_LTSTAT, LT_LOOPDONE,
-                             LT_LOOPDONE, 1, 1000);
+                             LT_LOOPDONE, 500, 100000);
         if (ret) {
                 dev_err(tc->dev, "Link training timeout waiting for LT_LOOPDONE!\n");
                 return ret;
@@ -949,7 +949,7 @@ static int tc_main_link_enable(struct tc_data *tc)
         dp_phy_ctrl &= ~(DP_PHY_RST | PHY_M1_RST | PHY_M0_RST);
         ret = regmap_write(tc->regmap, DP_PHY_CTRL, dp_phy_ctrl);
  
-       ret = tc_poll_timeout(tc, DP_PHY_CTRL, PHY_RDY, PHY_RDY, 1, 1000);
+       ret = tc_poll_timeout(tc, DP_PHY_CTRL, PHY_RDY, PHY_RDY, 500, 100000);
         if (ret) {
                 dev_err(dev, "timeout waiting for phy become ready");
                 return ret;
diff --git a/drivers/gpu/drm/bridge/ti-tfp410.c b/drivers/gpu/drm/bridge/ti-tfp410.c
index 6f6d6d1e60ae9162d94c45e8e4e58cc6d389448b..f195a4732e0badac04f0864def9999bd1620b575 100644
--- a/drivers/gpu/drm/bridge/ti-tfp410.c
+++ b/drivers/gpu/drm/bridge/ti-tfp410.c
@@ -140,7 +140,8 @@ static int tfp410_attach(struct drm_bridge *bridge)
                                           dvi->connector_type,
                                           dvi->ddc);
         if (ret) {
-               dev_err(dvi->dev, "drm_connector_init() failed: %d\n", ret);
+               dev_err(dvi->dev, "drm_connector_init_with_ddc() failed: %d\n",
+                       ret);
                 return ret;
         }
  
diff --git a/drivers/gpu/drm/drm_client_modeset.c b/drivers/gpu/drm/drm_client_modeset.c
index 6d4a29e99ae264ea55985be4594a08319a53d6f1..3035584f6dc724ec81a5f7a7ffafc82745b043c6 100644
--- a/drivers/gpu/drm/drm_client_modeset.c
+++ b/drivers/gpu/drm/drm_client_modeset.c
@@ -951,7 +951,8 @@ bool drm_client_rotation(struct drm_mode_set *modeset, unsigned int *rotation)
          * depending on the hardware this may require the framebuffer
          * to be in a specific tiling format.
          */
-       if ((*rotation & DRM_MODE_ROTATE_MASK) != DRM_MODE_ROTATE_180 ||
+       if (((*rotation & DRM_MODE_ROTATE_MASK) != DRM_MODE_ROTATE_0 &&
+            (*rotation & DRM_MODE_ROTATE_MASK) != DRM_MODE_ROTATE_180) ||
             !plane->rotation_property)
                 return false;
  
diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c
index 20cdaf3146b844b7fc2cd7ad3f752fda681b831d..cce0b1bba591fea5df2cb55c3ed57e64ed5019bd 100644
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -3838,7 +3838,8 @@ drm_dp_mst_process_up_req(struct drm_dp_mst_topology_mgr *mgr,
                 else if (msg->req_type == DP_RESOURCE_STATUS_NOTIFY)
                         guid = msg->u.resource_stat.guid;
  
-               mstb = drm_dp_get_mst_branch_device_by_guid(mgr, guid);
+               if (guid)
+                       mstb = drm_dp_get_mst_branch_device_by_guid(mgr, guid);
         } else {
                 mstb = drm_dp_get_mst_branch_device(mgr, hdr->lct, hdr->rad);
         }
diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 99769d6c9f8462e25d189c78df3990b8519b7b79..805fb004c8eb922fca661d9dd1f56ded8115de1c 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -3211,7 +3211,7 @@ static u8 *drm_find_cea_extension(const struct edid *edid)
         return cea;
  }
  
-static const struct drm_display_mode *cea_mode_for_vic(u8 vic)
+static __always_inline const struct drm_display_mode *cea_mode_for_vic(u8 vic)
  {
         BUILD_BUG_ON(1 + ARRAY_SIZE(edid_cea_modes_1) - 1 != 127);
         BUILD_BUG_ON(193 + ARRAY_SIZE(edid_cea_modes_193) - 1 != 219);
diff --git a/drivers/gpu/drm/drm_modes.c b/drivers/gpu/drm/drm_modes.c
index 10336b144c722b5aa03bca91d3d28685c009bf72..d4d64518e11b8fc06f45eddf2673a8c6ba0bfe34 100644
--- a/drivers/gpu/drm/drm_modes.c
+++ b/drivers/gpu/drm/drm_modes.c
@@ -1698,6 +1698,13 @@ static int drm_mode_parse_cmdline_options(const char *str,
         if (rotation && freestanding)
                 return -EINVAL;
  
+       if (!(rotation & DRM_MODE_ROTATE_MASK))
+               rotation |= DRM_MODE_ROTATE_0;
+
+       /* Make sure there is exactly one rotation defined */
+       if (!is_power_of_2(rotation & DRM_MODE_ROTATE_MASK))
+               return -EINVAL;
+
         mode->rotation_reflection = rotation;
  
         return 0;
diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index ba9595960bbebf290170a2bfb6247320a5fb8a21..907c4471f5916d742c4d65610374b80657fa74c9 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -75,9 +75,8 @@ config DRM_I915_CAPTURE_ERROR
         help
           This option enables capturing the GPU state when a hang is detected.
           This information is vital for triaging hangs and assists in debugging.
-         Please report any hang to
-           https://bugs.freedesktop.org/enter_bug.cgi?product=DRI
-         for triaging.
+         Please report any hang for triaging according to:
+           https://gitlab.freedesktop.org/drm/intel/-/wikis/How-to-file-i915-bugs
  
           If in doubt, say "Y".
  
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index b8c5f8934dbdf61792997b1cb6b03650a6dca6bf..a1f2411aa21b26a034b27312077d5f3764bb9083 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -294,7 +294,7 @@ extra-$(CONFIG_DRM_I915_WERROR) += \
                 $(shell cd $(srctree)/$(src) && find * -name '*.h')))
  
  quiet_cmd_hdrtest = HDRTEST $(patsubst %.hdrtest,%.h,$@)
-      cmd_hdrtest = $(CC) $(c_flags) -S -o /dev/null -x c /dev/null -include $<; touch $@
+      cmd_hdrtest = $(CC) $(filter-out $(CFLAGS_GCOV), $(c_flags)) -S -o /dev/null -x c /dev/null -include $<; touch $@
  
  $(obj)/%.hdrtest: $(src)/%.h FORCE
         $(call if_changed_dep,hdrtest)
diff --git a/drivers/gpu/drm/i915/display/intel_bios.c b/drivers/gpu/drm/i915/display/intel_bios.c
index 8beac06e3f10f213ae736f7fde35b309c51341c9..ef4017a1babaa10e17ae4271ffa8b94e00c158d6 100644
--- a/drivers/gpu/drm/i915/display/intel_bios.c
+++ b/drivers/gpu/drm/i915/display/intel_bios.c
@@ -357,14 +357,16 @@ parse_generic_dtd(struct drm_i915_private *dev_priv,
                 panel_fixed_mode->hdisplay + dtd->hfront_porch;
         panel_fixed_mode->hsync_end =
                 panel_fixed_mode->hsync_start + dtd->hsync;
-       panel_fixed_mode->htotal = panel_fixed_mode->hsync_end;
+       panel_fixed_mode->htotal =
+               panel_fixed_mode->hdisplay + dtd->hblank;
  
         panel_fixed_mode->vdisplay = dtd->vactive;
         panel_fixed_mode->vsync_start =
                 panel_fixed_mode->vdisplay + dtd->vfront_porch;
         panel_fixed_mode->vsync_end =
                 panel_fixed_mode->vsync_start + dtd->vsync;
-       panel_fixed_mode->vtotal = panel_fixed_mode->vsync_end;
+       panel_fixed_mode->vtotal =
+               panel_fixed_mode->vdisplay + dtd->vblank;
  
         panel_fixed_mode->clock = dtd->pixel_clock;
         panel_fixed_mode->width_mm = dtd->width_mm;
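
The `intel_bios.c` hunk above stops treating the end of the sync pulse as the end of the line: the total is the active size plus the whole blanking interval, which also includes the back porch. A small worked sketch of the horizontal case is below. The `demo_` structures and the `hactive` field name are hypothetical simplifications, not the driver's types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, horizontal-only subset of a generic DTD entry. */
struct demo_dtd {
	unsigned int hactive;
	unsigned int hfront_porch;
	unsigned int hsync;
	unsigned int hblank;	/* front porch + sync + back porch */
};

struct demo_mode {
	unsigned int hdisplay;
	unsigned int hsync_start;
	unsigned int hsync_end;
	unsigned int htotal;
};

/* Derive the horizontal timings the way the fixed code does. */
static inline struct demo_mode demo_fill_horizontal(const struct demo_dtd *dtd)
{
	struct demo_mode m;

	assert(dtd != NULL);

	m.hdisplay = dtd->hactive;
	m.hsync_start = m.hdisplay + dtd->hfront_porch;
	m.hsync_end = m.hsync_start + dtd->hsync;
	/* Pre-fix code used hsync_end here, dropping the back porch. */
	m.htotal = m.hdisplay + dtd->hblank;

	return m;
}
```

For an entry with 1920 active pixels, an 88-pixel front porch, a 44-pixel sync pulse and 280 pixels of blanking, the sync ends at 2052 and the total is 2200, so the old assignment would have shortened each line by the 148-pixel back porch.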
diff --git a/drivers/gpu/drm/i915/display/intel_ddi.c b/drivers/gpu/drm/i915/display/intel_ddi.c
index 33f1dc3d7c1a6d896054a83bc32a200689395527..d9a61f341070bd3df7fd1fddc0d68abf911fe526 100644
--- a/drivers/gpu/drm/i915/display/intel_ddi.c
+++ b/drivers/gpu/drm/i915/display/intel_ddi.c
@@ -4251,7 +4251,9 @@ static bool intel_ddi_is_audio_enabled(struct drm_i915_private *dev_priv,
  void intel_ddi_compute_min_voltage_level(struct drm_i915_private *dev_priv,
                                          struct intel_crtc_state *crtc_state)
  {
-       if (INTEL_GEN(dev_priv) >= 11 && crtc_state->port_clock > 594000)
+       if (IS_ELKHARTLAKE(dev_priv) && crtc_state->port_clock > 594000)
+               crtc_state->min_voltage_level = 3;
+       else if (INTEL_GEN(dev_priv) >= 11 && crtc_state->port_clock > 594000)
                 crtc_state->min_voltage_level = 1;
         else if (IS_CANNONLAKE(dev_priv) && crtc_state->port_clock > 594000)
                 crtc_state->min_voltage_level = 2;
diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 19ea842cfd849fae402046ad0fa406096c0aad5a..aa453953908b54ff27fe2d2d60a12639b4d31fc0 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -11087,7 +11087,7 @@ static u32 intel_cursor_base(const struct intel_plane_state *plane_state)
         u32 base;
  
         if (INTEL_INFO(dev_priv)->display.cursor_needs_physical)
-               base = obj->phys_handle->busaddr;
+               base = sg_dma_address(obj->mm.pages->sgl);
         else
                 base = intel_plane_ggtt_offset(plane_state);
  
@@ -12366,6 +12366,7 @@ static int icl_check_nv12_planes(struct intel_crtc_state *crtc_state)
                 /* Copy parameters to slave plane */
                 linked_state->ctl = plane_state->ctl | PLANE_CTL_YUV420_Y_PLANE;
                 linked_state->color_ctl = plane_state->color_ctl;
+               linked_state->view = plane_state->view;
                 memcpy(linked_state->color_plane, plane_state->color_plane,
                        sizeof(linked_state->color_plane));
  
@@ -14476,37 +14477,23 @@ static int intel_atomic_check_crtcs(struct intel_atomic_state *state)
         return 0;
  }
  
-static bool intel_cpu_transcoder_needs_modeset(struct intel_atomic_state *state,
-                                              enum transcoder transcoder)
+static bool intel_cpu_transcoders_need_modeset(struct intel_atomic_state *state,
+                                              u8 transcoders)
  {
-       struct intel_crtc_state *new_crtc_state;
+       const struct intel_crtc_state *new_crtc_state;
         struct intel_crtc *crtc;
         int i;
  
-       for_each_new_intel_crtc_in_state(state, crtc, new_crtc_state, i)
-               if (new_crtc_state->cpu_transcoder == transcoder)
-                       return needs_modeset(new_crtc_state);
+       for_each_new_intel_crtc_in_state(state, crtc, new_crtc_state, i) {
+               if (new_crtc_state->hw.enable &&
+                   transcoders & BIT(new_crtc_state->cpu_transcoder) &&
+                   needs_modeset(new_crtc_state))
+                       return true;
+       }
  
         return false;
  }
  
-static void
-intel_modeset_synced_crtcs(struct intel_atomic_state *state,
-                          u8 transcoders)
-{
-       struct intel_crtc_state *new_crtc_state;
-       struct intel_crtc *crtc;
-       int i;
-
-       for_each_new_intel_crtc_in_state(state, crtc,
-                                        new_crtc_state, i) {
-               if (transcoders & BIT(new_crtc_state->cpu_transcoder)) {
-                       new_crtc_state->uapi.mode_changed = true;
-                       new_crtc_state->update_pipe = false;
-               }
-       }
-}
-
  static int
  intel_modeset_all_tiles(struct intel_atomic_state *state, int tile_grp_id)
  {
@@ -14662,15 +14649,20 @@ static int intel_atomic_check(struct drm_device *dev,
                 if (intel_dp_mst_is_slave_trans(new_crtc_state)) {
                         enum transcoder master = new_crtc_state->mst_master_transcoder;
  
-                       if (intel_cpu_transcoder_needs_modeset(state, master)) {
+                       if (intel_cpu_transcoders_need_modeset(state, BIT(master))) {
                                 new_crtc_state->uapi.mode_changed = true;
                                 new_crtc_state->update_pipe = false;
                         }
-               } else if (is_trans_port_sync_mode(new_crtc_state)) {
+               }
+
+               if (is_trans_port_sync_mode(new_crtc_state)) {
                         u8 trans = new_crtc_state->sync_mode_slaves_mask |
                                    BIT(new_crtc_state->master_transcoder);
  
-                       intel_modeset_synced_crtcs(state, trans);
+                       if (intel_cpu_transcoders_need_modeset(state, trans)) {
+                               new_crtc_state->uapi.mode_changed = true;
+                               new_crtc_state->update_pipe = false;
+                       }
                 }
         }
  
@@ -17441,6 +17433,24 @@ retry:
                          * have readout for pipe gamma enable.
                          */
                         crtc_state->uapi.color_mgmt_changed = true;
+
+                       /*
+                        * FIXME hack to force full modeset when DSC is being
+                        * used.
+                        *
+                        * As long as we do not have full state readout and
+                        * config comparison of crtc_state->dsc, we have no way
+                        * to ensure reliable fastset. Remove once we have
+                        * readout for DSC.
+                        */
+                       if (crtc_state->dsc.compression_enable) {
+                               ret = drm_atomic_add_affected_connectors(state,
+                                                                        &crtc->base);
+                               if (ret)
+                                       goto out;
+                               crtc_state->uapi.mode_changed = true;
+                               drm_dbg_kms(dev, "Force full modeset for DSC\n");
+                       }
                 }
         }
  
diff --git a/drivers/gpu/drm/i915/display/intel_dsi_vbt.c b/drivers/gpu/drm/i915/display/intel_dsi_vbt.c
index 89fb0d90b694ab17e038afb9b0ee14681c327977..04f953ba8f0027fb143b9327912cea19e0002c2a 100644
--- a/drivers/gpu/drm/i915/display/intel_dsi_vbt.c
+++ b/drivers/gpu/drm/i915/display/intel_dsi_vbt.c
@@ -384,6 +384,7 @@ static const u8 *mipi_exec_gpio(struct intel_dsi *intel_dsi, const u8 *data)
         return data;
  }
  
+#ifdef CONFIG_ACPI
  static int i2c_adapter_lookup(struct acpi_resource *ares, void *data)
  {
         struct i2c_adapter_lookup *lookup = data;
@@ -393,8 +394,7 @@ static int i2c_adapter_lookup(struct acpi_resource *ares, void *data)
         acpi_handle adapter_handle;
         acpi_status status;
  
-       if (intel_dsi->i2c_bus_num >= 0 ||
-           !i2c_acpi_get_i2c_resource(ares, &sb))
+       if (!i2c_acpi_get_i2c_resource(ares, &sb))
                 return 1;
  
         if (lookup->slave_addr != sb->slave_address)
@@ -413,14 +413,41 @@ static int i2c_adapter_lookup(struct acpi_resource *ares, void *data)
         return 1;
  }
  
-static const u8 *mipi_exec_i2c(struct intel_dsi *intel_dsi, const u8 *data)
+static void i2c_acpi_find_adapter(struct intel_dsi *intel_dsi,
+                                 const u16 slave_addr)
  {
         struct drm_device *drm_dev = intel_dsi->base.base.dev;
         struct device *dev = &drm_dev->pdev->dev;
-       struct i2c_adapter *adapter;
         struct acpi_device *acpi_dev;
         struct list_head resource_list;
         struct i2c_adapter_lookup lookup;
+
+       acpi_dev = ACPI_COMPANION(dev);
+       if (acpi_dev) {
+               memset(&lookup, 0, sizeof(lookup));
+               lookup.slave_addr = slave_addr;
+               lookup.intel_dsi = intel_dsi;
+               lookup.dev_handle = acpi_device_handle(acpi_dev);
+
+               INIT_LIST_HEAD(&resource_list);
+               acpi_dev_get_resources(acpi_dev, &resource_list,
+                                      i2c_adapter_lookup,
+                                      &lookup);
+               acpi_dev_free_resource_list(&resource_list);
+       }
+}
+#else
+static inline void i2c_acpi_find_adapter(struct intel_dsi *intel_dsi,
+                                        const u16 slave_addr)
+{
+}
+#endif
+
+static const u8 *mipi_exec_i2c(struct intel_dsi *intel_dsi, const u8 *data)
+{
+       struct drm_device *drm_dev = intel_dsi->base.base.dev;
+       struct device *dev = &drm_dev->pdev->dev;
+       struct i2c_adapter *adapter;
         struct i2c_msg msg;
         int ret;
         u8 vbt_i2c_bus_num = *(data + 2);
@@ -431,20 +458,7 @@ static const u8 *mipi_exec_i2c(struct intel_dsi *intel_dsi, const u8 *data)
  
         if (intel_dsi->i2c_bus_num < 0) {
                 intel_dsi->i2c_bus_num = vbt_i2c_bus_num;
-
-               acpi_dev = ACPI_COMPANION(dev);
-               if (acpi_dev) {
-                       memset(&lookup, 0, sizeof(lookup));
-                       lookup.slave_addr = slave_addr;
-                       lookup.intel_dsi = intel_dsi;
-                       lookup.dev_handle = acpi_device_handle(acpi_dev);
-
-                       INIT_LIST_HEAD(&resource_list);
-                       acpi_dev_get_resources(acpi_dev, &resource_list,
-                                              i2c_adapter_lookup,
-                                              &lookup);
-                       acpi_dev_free_resource_list(&resource_list);
-               }
+               i2c_acpi_find_adapter(intel_dsi, slave_addr);
         }
  
         adapter = i2c_get_adapter(intel_dsi->i2c_bus_num);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index a2e57e62af30adc05e562ca734f3891ceda33fe2..151a1e8ae36abbc7cafea3c9aa9c4b7cb88014c4 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -565,6 +565,22 @@ static int __context_set_persistence(struct i915_gem_context *ctx, bool state)
                 if (!(ctx->i915->caps.scheduler & I915_SCHEDULER_CAP_PREEMPTION))
                         return -ENODEV;
  
+               /*
+                * If the cancel fails, we then need to reset, cleanly!
+                *
+                * If the per-engine reset fails, all hope is lost! We resort
+                * to a full GPU reset in that unlikely case, but realistically
+                * if the engine could not reset, the full reset does not fare
+                * much better. The damage has been done.
+                *
+                * However, if we cannot reset an engine by itself, we cannot
+                * clean up a hanging persistent context without causing
+                * collateral damage, and we should not pretend we can by
+                * exposing the interface.
+                */
+               if (!intel_has_reset_engine(&ctx->i915->gt))
+                       return -ENODEV;
+
                 i915_gem_context_clear_persistence(ctx);
         }
  
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index d5a0f5ae4a8ba3765ecdcdfd8933754696a2c16b..60c984e10c4ae47e8ba2a965d923549cfc1e9835 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -1981,9 +1981,20 @@ static int __eb_parse(struct dma_fence_work *work)
                                        pw->trampoline);
  }
  
+static void __eb_parse_release(struct dma_fence_work *work)
+{
+       struct eb_parse_work *pw = container_of(work, typeof(*pw), base);
+
+       if (pw->trampoline)
+               i915_active_release(&pw->trampoline->active);
+       i915_active_release(&pw->shadow->active);
+       i915_active_release(&pw->batch->active);
+}
+
  static const struct dma_fence_work_ops eb_parse_ops = {
         .name = "eb_parse",
         .work = __eb_parse,
+       .release = __eb_parse_release,
  };
  
  static int eb_parse_pipeline(struct i915_execbuffer *eb,
@@ -1997,6 +2008,20 @@ static int eb_parse_pipeline(struct i915_execbuffer *eb,
         if (!pw)
                 return -ENOMEM;
  
+       err = i915_active_acquire(&eb->batch->active);
+       if (err)
+               goto err_free;
+
+       err = i915_active_acquire(&shadow->active);
+       if (err)
+               goto err_batch;
+
+       if (trampoline) {
+               err = i915_active_acquire(&trampoline->active);
+               if (err)
+                       goto err_shadow;
+       }
+
         dma_fence_work_init(&pw->base, &eb_parse_ops);
  
         pw->engine = eb->engine;
@@ -2006,7 +2031,9 @@ static int eb_parse_pipeline(struct i915_execbuffer *eb,
         pw->shadow = shadow;
         pw->trampoline = trampoline;
  
-       dma_resv_lock(pw->batch->resv, NULL);
+       err = dma_resv_lock_interruptible(pw->batch->resv, NULL);
+       if (err)
+               goto err_trampoline;
  
         err = dma_resv_reserve_shared(pw->batch->resv, 1);
         if (err)
@@ -2034,6 +2061,14 @@ static int eb_parse_pipeline(struct i915_execbuffer *eb,
  
  err_batch_unlock:
         dma_resv_unlock(pw->batch->resv);
+err_trampoline:
+       if (trampoline)
+               i915_active_release(&trampoline->active);
+err_shadow:
+       i915_active_release(&shadow->active);
+err_batch:
+       i915_active_release(&eb->batch->active);
+err_free:
         kfree(pw);
         return err;
  }
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index b9fdac2f900364b70897eba0179b1a450f9bda82..0b6a442108de0ee14805438aacc59c919ce4e26e 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -455,10 +455,11 @@ out:
  
  void i915_gem_object_release_mmap_offset(struct drm_i915_gem_object *obj)
  {
-       struct i915_mmap_offset *mmo;
+       struct i915_mmap_offset *mmo, *mn;
  
         spin_lock(&obj->mmo.lock);
-       list_for_each_entry(mmo, &obj->mmo.offsets, offset) {
+       rbtree_postorder_for_each_entry_safe(mmo, mn,
+                                            &obj->mmo.offsets, offset) {
                 /*
                  * vma_node_unmap for GTT mmaps handled already in
                  * __i915_gem_object_release_mmap_gtt
@@ -487,6 +488,67 @@ void i915_gem_object_release_mmap(struct drm_i915_gem_object *obj)
         i915_gem_object_release_mmap_offset(obj);
  }
  
+static struct i915_mmap_offset *
+lookup_mmo(struct drm_i915_gem_object *obj,
+          enum i915_mmap_type mmap_type)
+{
+       struct rb_node *rb;
+
+       spin_lock(&obj->mmo.lock);
+       rb = obj->mmo.offsets.rb_node;
+       while (rb) {
+               struct i915_mmap_offset *mmo =
+                       rb_entry(rb, typeof(*mmo), offset);
+
+               if (mmo->mmap_type == mmap_type) {
+                       spin_unlock(&obj->mmo.lock);
+                       return mmo;
+               }
+
+               if (mmo->mmap_type < mmap_type)
+                       rb = rb->rb_right;
+               else
+                       rb = rb->rb_left;
+       }
+       spin_unlock(&obj->mmo.lock);
+
+       return NULL;
+}
+
+static struct i915_mmap_offset *
+insert_mmo(struct drm_i915_gem_object *obj, struct i915_mmap_offset *mmo)
+{
+       struct rb_node *rb, **p;
+
+       spin_lock(&obj->mmo.lock);
+       rb = NULL;
+       p = &obj->mmo.offsets.rb_node;
+       while (*p) {
+               struct i915_mmap_offset *pos;
+
+               rb = *p;
+               pos = rb_entry(rb, typeof(*pos), offset);
+
+               if (pos->mmap_type == mmo->mmap_type) {
+                       spin_unlock(&obj->mmo.lock);
+                       drm_vma_offset_remove(obj->base.dev->vma_offset_manager,
+                                             &mmo->vma_node);
+                       kfree(mmo);
+                       return pos;
+               }
+
+               if (pos->mmap_type < mmo->mmap_type)
+                       p = &rb->rb_right;
+               else
+                       p = &rb->rb_left;
+       }
+       rb_link_node(&mmo->offset, rb, p);
+       rb_insert_color(&mmo->offset, &obj->mmo.offsets);
+       spin_unlock(&obj->mmo.lock);
+
+       return mmo;
+}
+
  static struct i915_mmap_offset *
  mmap_offset_attach(struct drm_i915_gem_object *obj,
                    enum i915_mmap_type mmap_type,
@@ -496,20 +558,22 @@ mmap_offset_attach(struct drm_i915_gem_object *obj,
         struct i915_mmap_offset *mmo;
         int err;
  
+       mmo = lookup_mmo(obj, mmap_type);
+       if (mmo)
+               goto out;
+
         mmo = kmalloc(sizeof(*mmo), GFP_KERNEL);
         if (!mmo)
                 return ERR_PTR(-ENOMEM);
  
         mmo->obj = obj;
-       mmo->dev = obj->base.dev;
-       mmo->file = file;
         mmo->mmap_type = mmap_type;
         drm_vma_node_reset(&mmo->vma_node);
  
-       err = drm_vma_offset_add(mmo->dev->vma_offset_manager, &mmo->vma_node,
-                                obj->base.size / PAGE_SIZE);
+       err = drm_vma_offset_add(obj->base.dev->vma_offset_manager,
+                                &mmo->vma_node, obj->base.size / PAGE_SIZE);
         if (likely(!err))
-               goto out;
+               goto insert;
  
         /* Attempt to reap some mmap space from dead objects */
         err = intel_gt_retire_requests_timeout(&i915->gt, MAX_SCHEDULE_TIMEOUT);
@@ -517,19 +581,17 @@ mmap_offset_attach(struct drm_i915_gem_object *obj,
                 goto err;
  
         i915_gem_drain_freed_objects(i915);
-       err = drm_vma_offset_add(mmo->dev->vma_offset_manager, &mmo->vma_node,
-                                obj->base.size / PAGE_SIZE);
+       err = drm_vma_offset_add(obj->base.dev->vma_offset_manager,
+                                &mmo->vma_node, obj->base.size / PAGE_SIZE);
         if (err)
                 goto err;
  
+insert:
+       mmo = insert_mmo(obj, mmo);
+       GEM_BUG_ON(lookup_mmo(obj, mmap_type) != mmo);
  out:
         if (file)
                 drm_vma_node_allow(&mmo->vma_node, file);
-
-       spin_lock(&obj->mmo.lock);
-       list_add(&mmo->offset, &obj->mmo.offsets);
-       spin_unlock(&obj->mmo.lock);
-
         return mmo;
  
  err:
@@ -745,60 +807,43 @@ int i915_gem_mmap(struct file *filp, struct vm_area_struct *vma)
         struct drm_vma_offset_node *node;
         struct drm_file *priv = filp->private_data;
         struct drm_device *dev = priv->minor->dev;
+       struct drm_i915_gem_object *obj = NULL;
         struct i915_mmap_offset *mmo = NULL;
-       struct drm_gem_object *obj = NULL;
         struct file *anon;
  
         if (drm_dev_is_unplugged(dev))
                 return -ENODEV;
  
+       rcu_read_lock();
         drm_vma_offset_lock_lookup(dev->vma_offset_manager);
         node = drm_vma_offset_exact_lookup_locked(dev->vma_offset_manager,
                                                   vma->vm_pgoff,
                                                   vma_pages(vma));
-       if (likely(node)) {
-               mmo = container_of(node, struct i915_mmap_offset,
-                                  vma_node);
-               /*
-                * In our dependency chain, the drm_vma_offset_node
-                * depends on the validity of the mmo, which depends on
-                * the gem object. However the only reference we have
-                * at this point is the mmo (as the parent of the node).
-                * Try to check if the gem object was at least cleared.
-                */
-               if (!mmo || !mmo->obj) {
-                       drm_vma_offset_unlock_lookup(dev->vma_offset_manager);
-                       return -EINVAL;
-               }
+       if (node && drm_vma_node_is_allowed(node, priv)) {
                 /*
                  * Skip 0-refcnted objects as it is in the process of being
                  * destroyed and will be invalid when the vma manager lock
                  * is released.
                  */
-               obj = &mmo->obj->base;
-               if (!kref_get_unless_zero(&obj->refcount))
-                       obj = NULL;
+               mmo = container_of(node, struct i915_mmap_offset, vma_node);
+               obj = i915_gem_object_get_rcu(mmo->obj);
         }
         drm_vma_offset_unlock_lookup(dev->vma_offset_manager);
+       rcu_read_unlock();
         if (!obj)
-               return -EINVAL;
-
-       if (!drm_vma_node_is_allowed(node, priv)) {
-               drm_gem_object_put_unlocked(obj);
-               return -EACCES;
-       }
+               return node ? -EACCES : -EINVAL;
  
-       if (i915_gem_object_is_readonly(to_intel_bo(obj))) {
+       if (i915_gem_object_is_readonly(obj)) {
                 if (vma->vm_flags & VM_WRITE) {
-                       drm_gem_object_put_unlocked(obj);
+                       i915_gem_object_put(obj);
                         return -EINVAL;
                 }
                 vma->vm_flags &= ~VM_MAYWRITE;
         }
  
-       anon = mmap_singleton(to_i915(obj->dev));
+       anon = mmap_singleton(to_i915(dev));
         if (IS_ERR(anon)) {
-               drm_gem_object_put_unlocked(obj);
+               i915_gem_object_put(obj);
                 return PTR_ERR(anon);
         }
  
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 46bacc82ddc406b686d65bc8d7260988f0770b2a..35985218bd8570dc7e6af8f08053f71feaf19a1c 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -63,7 +63,7 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
         INIT_LIST_HEAD(&obj->lut_list);
  
         spin_lock_init(&obj->mmo.lock);
-       INIT_LIST_HEAD(&obj->mmo.offsets);
+       obj->mmo.offsets = RB_ROOT;
  
         init_rcu_head(&obj->rcu);
  
@@ -100,8 +100,8 @@ void i915_gem_close_object(struct drm_gem_object *gem, struct drm_file *file)
  {
         struct drm_i915_gem_object *obj = to_intel_bo(gem);
         struct drm_i915_file_private *fpriv = file->driver_priv;
+       struct i915_mmap_offset *mmo, *mn;
         struct i915_lut_handle *lut, *ln;
-       struct i915_mmap_offset *mmo;
         LIST_HEAD(close);
  
         i915_gem_object_lock(obj);
@@ -117,14 +117,8 @@ void i915_gem_close_object(struct drm_gem_object *gem, struct drm_file *file)
         i915_gem_object_unlock(obj);
  
         spin_lock(&obj->mmo.lock);
-       list_for_each_entry(mmo, &obj->mmo.offsets, offset) {
-               if (mmo->file != file)
-                       continue;
-
-               spin_unlock(&obj->mmo.lock);
+       rbtree_postorder_for_each_entry_safe(mmo, mn, &obj->mmo.offsets, offset)
                 drm_vma_node_revoke(&mmo->vma_node, file);
-               spin_lock(&obj->mmo.lock);
-       }
         spin_unlock(&obj->mmo.lock);
  
         list_for_each_entry_safe(lut, ln, &close, obj_link) {
@@ -203,12 +197,14 @@ static void __i915_gem_free_objects(struct drm_i915_private *i915,
  
                 i915_gem_object_release_mmap(obj);
  
-               list_for_each_entry_safe(mmo, mn, &obj->mmo.offsets, offset) {
+               rbtree_postorder_for_each_entry_safe(mmo, mn,
+                                                    &obj->mmo.offsets,
+                                                    offset) {
                         drm_vma_offset_remove(obj->base.dev->vma_offset_manager,
                                               &mmo->vma_node);
                         kfree(mmo);
                 }
-               INIT_LIST_HEAD(&obj->mmo.offsets);
+               obj->mmo.offsets = RB_ROOT;
  
                 GEM_BUG_ON(atomic_read(&obj->bind_count));
                 GEM_BUG_ON(obj->userfault_count);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index db70a3306e5939e97907b3f2e3997fd82f760b14..9c86f2dea947b415de4524718c174adbc30f5e44 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -69,6 +69,15 @@ i915_gem_object_lookup_rcu(struct drm_file *file, u32 handle)
         return idr_find(&file->object_idr, handle);
  }
  
+static inline struct drm_i915_gem_object *
+i915_gem_object_get_rcu(struct drm_i915_gem_object *obj)
+{
+       if (obj && !kref_get_unless_zero(&obj->base.refcount))
+               obj = NULL;
+
+       return obj;
+}
+
  static inline struct drm_i915_gem_object *
  i915_gem_object_lookup(struct drm_file *file, u32 handle)
  {
@@ -76,8 +85,7 @@ i915_gem_object_lookup(struct drm_file *file, u32 handle)
  
         rcu_read_lock();
         obj = i915_gem_object_lookup_rcu(file, handle);
-       if (obj && !kref_get_unless_zero(&obj->base.refcount))
-               obj = NULL;
+       obj = i915_gem_object_get_rcu(obj);
         rcu_read_unlock();
  
         return obj;
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index 88e268633fdc742b559597073d954d89d33d2767..c2174da35bb0fdf3890af589440be127676df876 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -71,13 +71,11 @@ enum i915_mmap_type {
  };
  
  struct i915_mmap_offset {
-       struct drm_device *dev;
         struct drm_vma_offset_node vma_node;
         struct drm_i915_gem_object *obj;
-       struct drm_file *file;
         enum i915_mmap_type mmap_type;
  
-       struct list_head offset;
+       struct rb_node offset;
  };
  
  struct drm_i915_gem_object {
@@ -137,7 +135,7 @@ struct drm_i915_gem_object {
  
         struct {
                 spinlock_t lock; /* Protects access to mmo offsets */
-               struct list_head offsets;
+               struct rb_root offsets;
         } mmo;
  
         I915_SELFTEST_DECLARE(struct list_head st_link);
@@ -287,9 +285,6 @@ struct drm_i915_gem_object {
  
                 void *gvt_info;
         };
-
-       /** for phys allocated objects */
-       struct drm_dma_handle *phys_handle;
  };
  
  static inline struct drm_i915_gem_object *
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_phys.c b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
index b1b7c1b3038aaa41b8bbdc1c5c7cdfef13d4e2d0..b07bb40edd5a3d8ad83fe9ed0d2b93b96f6c9ecd 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_phys.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_phys.c
@@ -22,88 +22,87 @@
  static int i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
  {
         struct address_space *mapping = obj->base.filp->f_mapping;
-       struct drm_dma_handle *phys;
-       struct sg_table *st;
         struct scatterlist *sg;
-       char *vaddr;
+       struct sg_table *st;
+       dma_addr_t dma;
+       void *vaddr;
+       void *dst;
         int i;
-       int err;
  
         if (WARN_ON(i915_gem_object_needs_bit17_swizzle(obj)))
                 return -EINVAL;
  
-       /* Always aligning to the object size, allows a single allocation
+       /*
+        * Always aligning to the object size, allows a single allocation
          * to handle all possible callers, and given typical object sizes,
          * the alignment of the buddy allocation will naturally match.
          */
-       phys = drm_pci_alloc(obj->base.dev,
-                            roundup_pow_of_two(obj->base.size),
-                            roundup_pow_of_two(obj->base.size));
-       if (!phys)
+       vaddr = dma_alloc_coherent(&obj->base.dev->pdev->dev,
+                                  roundup_pow_of_two(obj->base.size),
+                                  &dma, GFP_KERNEL);
+       if (!vaddr)
                 return -ENOMEM;
  
-       vaddr = phys->vaddr;
+       st = kmalloc(sizeof(*st), GFP_KERNEL);
+       if (!st)
+               goto err_pci;
+
+       if (sg_alloc_table(st, 1, GFP_KERNEL))
+               goto err_st;
+
+       sg = st->sgl;
+       sg->offset = 0;
+       sg->length = obj->base.size;
+
+       sg_assign_page(sg, (struct page *)vaddr);
+       sg_dma_address(sg) = dma;
+       sg_dma_len(sg) = obj->base.size;
+
+       dst = vaddr;
         for (i = 0; i < obj->base.size / PAGE_SIZE; i++) {
                 struct page *page;
-               char *src;
+               void *src;
  
                 page = shmem_read_mapping_page(mapping, i);
-               if (IS_ERR(page)) {
-                       err = PTR_ERR(page);
-                       goto err_phys;
-               }
+               if (IS_ERR(page))
+                       goto err_st;
  
                 src = kmap_atomic(page);
-               memcpy(vaddr, src, PAGE_SIZE);
-               drm_clflush_virt_range(vaddr, PAGE_SIZE);
+               memcpy(dst, src, PAGE_SIZE);
+               drm_clflush_virt_range(dst, PAGE_SIZE);
                 kunmap_atomic(src);
  
                 put_page(page);
-               vaddr += PAGE_SIZE;
+               dst += PAGE_SIZE;
         }
  
         intel_gt_chipset_flush(&to_i915(obj->base.dev)->gt);
  
-       st = kmalloc(sizeof(*st), GFP_KERNEL);
-       if (!st) {
-               err = -ENOMEM;
-               goto err_phys;
-       }
-
-       if (sg_alloc_table(st, 1, GFP_KERNEL)) {
-               kfree(st);
-               err = -ENOMEM;
-               goto err_phys;
-       }
-
-       sg = st->sgl;
-       sg->offset = 0;
-       sg->length = obj->base.size;
-
-       sg_dma_address(sg) = phys->busaddr;
-       sg_dma_len(sg) = obj->base.size;
-
-       obj->phys_handle = phys;
-
         __i915_gem_object_set_pages(obj, st, sg->length);
  
         return 0;
  
-err_phys:
-       drm_pci_free(obj->base.dev, phys);
-
-       return err;
+err_st:
+       kfree(st);
+err_pci:
+       dma_free_coherent(&obj->base.dev->pdev->dev,
+                         roundup_pow_of_two(obj->base.size),
+                         vaddr, dma);
+       return -ENOMEM;
  }
  
  static void
  i915_gem_object_put_pages_phys(struct drm_i915_gem_object *obj,
                                struct sg_table *pages)
  {
+       dma_addr_t dma = sg_dma_address(pages->sgl);
+       void *vaddr = sg_page(pages->sgl);
+
         __i915_gem_object_release_shmem(obj, pages, false);
  
         if (obj->mm.dirty) {
                 struct address_space *mapping = obj->base.filp->f_mapping;
-               char *vaddr = obj->phys_handle->vaddr;
+               void *src = vaddr;
                 int i;
  
                 for (i = 0; i < obj->base.size / PAGE_SIZE; i++) {
@@ -115,15 +114,16 @@ i915_gem_object_put_pages_phys(struct drm_i915_gem_object *obj,
                                 continue;
  
                         dst = kmap_atomic(page);
-                       drm_clflush_virt_range(vaddr, PAGE_SIZE);
-                       memcpy(dst, vaddr, PAGE_SIZE);
+                       drm_clflush_virt_range(src, PAGE_SIZE);
+                       memcpy(dst, src, PAGE_SIZE);
                         kunmap_atomic(dst);
  
                         set_page_dirty(page);
                         if (obj->mm.madv == I915_MADV_WILLNEED)
                                 mark_page_accessed(page);
                         put_page(page);
-                       vaddr += PAGE_SIZE;
+
+                       src += PAGE_SIZE;
                 }
                 obj->mm.dirty = false;
         }
@@ -131,7 +131,9 @@ i915_gem_object_put_pages_phys(struct drm_i915_gem_object *obj,
         sg_free_table(pages);
         kfree(pages);
  
-       drm_pci_free(obj->base.dev, obj->phys_handle);
+       dma_free_coherent(&obj->base.dev->pdev->dev,
+                         roundup_pow_of_two(obj->base.size),
+                         vaddr, dma);
  }
  
  static void phys_release(struct drm_i915_gem_object *obj)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
index f7e4b39c734f36e76d79d0ae2c250d709fb7ba8d..59b387ade49c079a1c605cb8754a617186ed3407 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c
@@ -256,8 +256,7 @@ unsigned long i915_gem_shrink_all(struct drm_i915_private *i915)
         with_intel_runtime_pm(&i915->runtime_pm, wakeref) {
                 freed = i915_gem_shrink(i915, -1UL, NULL,
                                         I915_SHRINK_BOUND |
-                                       I915_SHRINK_UNBOUND |
-                                       I915_SHRINK_ACTIVE);
+                                       I915_SHRINK_UNBOUND);
         }
  
         return freed;
@@ -336,7 +335,6 @@ i915_gem_shrinker_oom(struct notifier_block *nb, unsigned long event, void *ptr)
         freed_pages = 0;
         with_intel_runtime_pm(&i915->runtime_pm, wakeref)
                 freed_pages += i915_gem_shrink(i915, -1UL, NULL,
-                                              I915_SHRINK_ACTIVE |
                                                I915_SHRINK_BOUND |
                                                I915_SHRINK_UNBOUND |
                                                I915_SHRINK_WRITEBACK);
diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
index 0ba524a414c68d66558afe24674613a2229451ee..cbad7fe722cebb840f50c46ad73f9f1c63c49c78 100644
--- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
@@ -136,6 +136,9 @@ static void add_retire(struct intel_breadcrumbs *b, struct intel_timeline *tl)
         struct intel_engine_cs *engine =
                 container_of(b, struct intel_engine_cs, breadcrumbs);
  
+       if (unlikely(intel_engine_is_virtual(engine)))
+               engine = intel_virtual_engine_get_sibling(engine, 0);
+
         intel_engine_add_retire(engine, tl);
  }
  
diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index 23137b2a8689741e64276f3a88eeb4102d0693bd..57e8a051ddc2abbee284e3efd2f0c80f1e3e40ff 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -67,21 +67,18 @@ static int intel_context_active_acquire(struct intel_context *ce)
  {
         int err;
  
-       err = i915_active_acquire(&ce->active);
-       if (err)
-               return err;
+       __i915_active_acquire(&ce->active);
+
+       if (intel_context_is_barrier(ce))
+               return 0;
  
         /* Preallocate tracking nodes */
-       if (!intel_context_is_barrier(ce)) {
-               err = i915_active_acquire_preallocate_barrier(&ce->active,
-                                                             ce->engine);
-               if (err) {
-                       i915_active_release(&ce->active);
-                       return err;
-               }
-       }
+       err = i915_active_acquire_preallocate_barrier(&ce->active,
+                                                     ce->engine);
+       if (err)
+               i915_active_release(&ce->active);
  
-       return 0;
+       return err;
  }
  
  static void intel_context_active_release(struct intel_context *ce)
@@ -101,13 +98,19 @@ int __intel_context_do_pin(struct intel_context *ce)
                         return err;
         }
  
-       if (mutex_lock_interruptible(&ce->pin_mutex))
-               return -EINTR;
+       err = i915_active_acquire(&ce->active);
+       if (err)
+               return err;
+
+       if (mutex_lock_interruptible(&ce->pin_mutex)) {
+               err = -EINTR;
+               goto out_release;
+       }
  
-       if (likely(!atomic_read(&ce->pin_count))) {
+       if (likely(!atomic_add_unless(&ce->pin_count, 1, 0))) {
                 err = intel_context_active_acquire(ce);
                 if (unlikely(err))
-                       goto err;
+                       goto out_unlock;
  
                 err = ce->ops->pin(ce);
                 if (unlikely(err))
@@ -117,18 +120,19 @@ int __intel_context_do_pin(struct intel_context *ce)
                          ce->ring->head, ce->ring->tail);
  
                 smp_mb__before_atomic(); /* flush pin before it is visible */
+               atomic_inc(&ce->pin_count);
         }
  
-       atomic_inc(&ce->pin_count);
         GEM_BUG_ON(!intel_context_is_pinned(ce)); /* no overflow! */
-
-       mutex_unlock(&ce->pin_mutex);
-       return 0;
+       GEM_BUG_ON(i915_active_is_idle(&ce->active));
+       goto out_unlock;
  
  err_active:
         intel_context_active_release(ce);
-err:
+out_unlock:
         mutex_unlock(&ce->pin_mutex);
+out_release:
+       i915_active_release(&ce->active);
         return err;
  }
  
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index f451ef376548e19207233333d4231fcab2364624..06ff7695fa290b8ad68c4a9a7f2667eb41aed8a5 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -671,6 +671,7 @@ void
  intel_engine_init_active(struct intel_engine_cs *engine, unsigned int subclass)
  {
         INIT_LIST_HEAD(&engine->active.requests);
+       INIT_LIST_HEAD(&engine->active.hold);
  
         spin_lock_init(&engine->active.lock);
         lockdep_set_subclass(&engine->active.lock, subclass);
@@ -1422,6 +1423,17 @@ static void print_request_ring(struct drm_printer *m, struct i915_request *rq)
         }
  }
  
+static unsigned long list_count(struct list_head *list)
+{
+       struct list_head *pos;
+       unsigned long count = 0;
+
+       list_for_each(pos, list)
+               count++;
+
+       return count;
+}
+
  void intel_engine_dump(struct intel_engine_cs *engine,
                        struct drm_printer *m,
                        const char *header, ...)
@@ -1491,6 +1503,7 @@ void intel_engine_dump(struct intel_engine_cs *engine,
                         hexdump(m, rq->context->lrc_reg_state, PAGE_SIZE);
                 }
         }
+       drm_printf(m, "\tOn hold?: %lu\n", list_count(&engine->active.hold));
         spin_unlock_irqrestore(&engine->active.lock, flags);
  
         drm_printf(m, "\tMMIO base:  0x%08x\n", engine->mmio_base);
diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h
index 350da59e605b768978008d80187baeb09d8d0f2c..92be41a6903c0c6991c3e9d43ff61d5ca94f84e3 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h
@@ -295,6 +295,7 @@ struct intel_engine_cs {
         struct {
                 spinlock_t lock;
                 struct list_head requests;
+               struct list_head hold; /* ready requests, but on hold */
         } active;
  
         struct llist_head barrier_tasks;
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_requests.c b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
index 7ef1d37970f6d4942d3ec2d1028f2e4d9467a8f1..8a5054f21bf880644a0bc3002971bf63491406ff 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_requests.c
+++ b/drivers/gpu/drm/i915/gt/intel_gt_requests.c
@@ -99,6 +99,9 @@ static bool add_retire(struct intel_engine_cs *engine,
  void intel_engine_add_retire(struct intel_engine_cs *engine,
                              struct intel_timeline *tl)
  {
+       /* We don't deal well with the engine disappearing beneath us */
+       GEM_BUG_ON(intel_engine_is_virtual(engine));
+
         if (add_retire(engine, tl))
                 schedule_work(&engine->retire_work);
  }
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 0cf0f6fae675ab1975639a0774417a393dc9313b..fe8a59aaa629a1bf8d01367c796307234302bdac 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -237,7 +237,8 @@ static void execlists_init_reg_state(u32 *reg_state,
                                      bool close);
  static void
  __execlists_update_reg_state(const struct intel_context *ce,
-                            const struct intel_engine_cs *engine);
+                            const struct intel_engine_cs *engine,
+                            u32 head);
  
  static void mark_eio(struct i915_request *rq)
  {
@@ -985,6 +986,8 @@ __unwind_incomplete_requests(struct intel_engine_cs *engine)
                         GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
  
                         list_move(&rq->sched.link, pl);
+                       set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
+
                         active = rq;
                 } else {
                         struct intel_engine_cs *owner = rq->context->engine;
@@ -1184,12 +1187,11 @@ static void reset_active(struct i915_request *rq,
                 head = rq->tail;
         else
                 head = active_request(ce->timeline, rq)->head;
-       ce->ring->head = intel_ring_wrap(ce->ring, head);
-       intel_ring_update_space(ce->ring);
+       head = intel_ring_wrap(ce->ring, head);
  
         /* Scrub the context image to prevent replaying the previous batch */
         restore_default_state(ce, engine);
-       __execlists_update_reg_state(ce, engine);
+       __execlists_update_reg_state(ce, engine, head);
  
         /* We've switched away, so this should be a no-op, but intent matters */
         ce->lrc_desc |= CTX_DESC_FORCE_RESTORE;
@@ -1319,7 +1321,7 @@ static u64 execlists_update_context(struct i915_request *rq)
  {
         struct intel_context *ce = rq->context;
         u64 desc = ce->lrc_desc;
-       u32 tail;
+       u32 tail, prev;
  
         /*
          * WaIdleLiteRestore:bdw,skl
@@ -1332,9 +1334,15 @@ static u64 execlists_update_context(struct i915_request *rq)
          * subsequent resubmissions (for lite restore). Should that fail us,
          * and we try and submit the same tail again, force the context
          * reload.
+        *
+        * If we need to return to a preempted context, we need to skip the
+        * lite-restore and force it to reload the RING_TAIL. Otherwise, the
+        * HW has a tendency to ignore us rewinding the TAIL to the end of
+        * an earlier request.
          */
         tail = intel_ring_set_tail(rq->ring, rq->tail);
-       if (unlikely(ce->lrc_reg_state[CTX_RING_TAIL] == tail))
+       prev = ce->lrc_reg_state[CTX_RING_TAIL];
+       if (unlikely(intel_ring_direction(rq->ring, tail, prev) <= 0))
                 desc |= CTX_DESC_FORCE_RESTORE;
         ce->lrc_reg_state[CTX_RING_TAIL] = tail;
         rq->tail = rq->wa_tail;
@@ -1535,7 +1543,8 @@ static bool can_merge_rq(const struct i915_request *prev,
                 return true;
  
         if (unlikely((prev->fence.flags ^ next->fence.flags) &
-                    (I915_FENCE_FLAG_NOPREEMPT | I915_FENCE_FLAG_SENTINEL)))
+                    (BIT(I915_FENCE_FLAG_NOPREEMPT) |
+                     BIT(I915_FENCE_FLAG_SENTINEL))))
                 return false;
  
         if (!can_merge_ctx(prev->context, next->context))
@@ -1602,6 +1611,11 @@ last_active(const struct intel_engine_execlists *execlists)
         return *last;
  }
  
+#define for_each_waiter(p__, rq__) \
+       list_for_each_entry_lockless(p__, \
+                                    &(rq__)->sched.waiters_list, \
+                                    wait_link)
+
  static void defer_request(struct i915_request *rq, struct list_head * const pl)
  {
         LIST_HEAD(list);
@@ -1619,7 +1633,7 @@ static void defer_request(struct i915_request *rq, struct list_head * const pl)
                 GEM_BUG_ON(i915_request_is_active(rq));
                 list_move_tail(&rq->sched.link, pl);
  
-               list_for_each_entry(p, &rq->sched.waiters_list, wait_link) {
+               for_each_waiter(p, rq) {
                         struct i915_request *w =
                                 container_of(p->waiter, typeof(*w), sched);
  
@@ -1632,8 +1646,8 @@ static void defer_request(struct i915_request *rq, struct list_head * const pl)
                                    !i915_request_completed(rq));
  
                         GEM_BUG_ON(i915_request_is_active(w));
-                       if (list_empty(&w->sched.link))
-                               continue; /* Not yet submitted; unready */
+                       if (!i915_request_is_ready(w))
+                               continue;
  
                         if (rq_prio(w) < rq_prio(rq))
                                 continue;
@@ -1831,14 +1845,6 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
                          */
                         __unwind_incomplete_requests(engine);
  
-                       /*
-                        * If we need to return to the preempted context, we
-                        * need to skip the lite-restore and force it to
-                        * reload the RING_TAIL. Otherwise, the HW has a
-                        * tendency to ignore us rewinding the TAIL to the
-                        * end of an earlier request.
-                        */
-                       last->context->lrc_desc |= CTX_DESC_FORCE_RESTORE;
                         last = NULL;
                 } else if (need_timeslice(engine, last) &&
                            timer_expired(&engine->execlists.timer)) {
@@ -2351,6 +2357,310 @@ static void __execlists_submission_tasklet(struct intel_engine_cs *const engine)
         }
  }
  
+static void __execlists_hold(struct i915_request *rq)
+{
+       LIST_HEAD(list);
+
+       do {
+               struct i915_dependency *p;
+
+               if (i915_request_is_active(rq))
+                       __i915_request_unsubmit(rq);
+
+               RQ_TRACE(rq, "on hold\n");
+               clear_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
+               list_move_tail(&rq->sched.link, &rq->engine->active.hold);
+               i915_request_set_hold(rq);
+
+               list_for_each_entry(p, &rq->sched.waiters_list, wait_link) {
+                       struct i915_request *w =
+                               container_of(p->waiter, typeof(*w), sched);
+
+                       /* Leave semaphores spinning on the other engines */
+                       if (w->engine != rq->engine)
+                               continue;
+
+                       if (!i915_request_is_ready(w))
+                               continue;
+
+                       if (i915_request_completed(w))
+                               continue;
+
+                       if (i915_request_on_hold(w))
+                               continue;
+
+                       list_move_tail(&w->sched.link, &list);
+               }
+
+               rq = list_first_entry_or_null(&list, typeof(*rq), sched.link);
+       } while (rq);
+}
+
+static bool execlists_hold(struct intel_engine_cs *engine,
+                          struct i915_request *rq)
+{
+       spin_lock_irq(&engine->active.lock);
+
+       if (i915_request_completed(rq)) { /* too late! */
+               rq = NULL;
+               goto unlock;
+       }
+
+       if (rq->engine != engine) { /* preempted virtual engine */
+               struct virtual_engine *ve = to_virtual_engine(rq->engine);
+
+               /*
+                * intel_context_inflight() is only protected by virtue
+                * of process_csb() being called only by the tasklet (or
+                * directly from inside reset while the tasklet is suspended).
+                * Assert that neither of those are allowed to run while we
+                * poke at the request queues.
+                */
+               GEM_BUG_ON(!reset_in_progress(&engine->execlists));
+
+               /*
+                * An unsubmitted request along a virtual engine will
+                * remain on the active (this) engine until we are able
+                * to process the context switch away (and so mark the
+                * context as no longer in flight). That cannot have happened
+                * yet, otherwise we would not be hanging!
+                */
+               spin_lock(&ve->base.active.lock);
+               GEM_BUG_ON(intel_context_inflight(rq->context) != engine);
+               GEM_BUG_ON(ve->request != rq);
+               ve->request = NULL;
+               spin_unlock(&ve->base.active.lock);
+               i915_request_put(rq);
+
+               rq->engine = engine;
+       }
+
+       /*
+        * Transfer this request onto the hold queue to prevent it
+        * being resubmitted to HW (and potentially completed) before we have
+        * released it. Since we may have already submitted following
+        * requests, we need to remove those as well.
+        */
+       GEM_BUG_ON(i915_request_on_hold(rq));
+       GEM_BUG_ON(rq->engine != engine);
+       __execlists_hold(rq);
+
+unlock:
+       spin_unlock_irq(&engine->active.lock);
+       return rq;
+}
+
+static bool hold_request(const struct i915_request *rq)
+{
+       struct i915_dependency *p;
+
+       /*
+        * If one of our ancestors is on hold, we must also be on hold,
+        * otherwise we will bypass it and execute before it.
+        */
+       list_for_each_entry(p, &rq->sched.signalers_list, signal_link) {
+               const struct i915_request *s =
+                       container_of(p->signaler, typeof(*s), sched);
+
+               if (s->engine != rq->engine)
+                       continue;
+
+               if (i915_request_on_hold(s))
+                       return true;
+       }
+
+       return false;
+}
+
+static void __execlists_unhold(struct i915_request *rq)
+{
+       LIST_HEAD(list);
+
+       do {
+               struct i915_dependency *p;
+
+               GEM_BUG_ON(!i915_request_on_hold(rq));
+               GEM_BUG_ON(!i915_sw_fence_signaled(&rq->submit));
+
+               i915_request_clear_hold(rq);
+               list_move_tail(&rq->sched.link,
+                              i915_sched_lookup_priolist(rq->engine,
+                                                         rq_prio(rq)));
+               set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
+               RQ_TRACE(rq, "hold release\n");
+
+               /* Also release any children on this engine that are ready */
+               list_for_each_entry(p, &rq->sched.waiters_list, wait_link) {
+                       struct i915_request *w =
+                               container_of(p->waiter, typeof(*w), sched);
+
+                       if (w->engine != rq->engine)
+                               continue;
+
+                       if (!i915_request_on_hold(w))
+                               continue;
+
+                       /* Check that no other parents are also on hold */
+                       if (hold_request(w))
+                               continue;
+
+                       list_move_tail(&w->sched.link, &list);
+               }
+
+               rq = list_first_entry_or_null(&list, typeof(*rq), sched.link);
+       } while (rq);
+}
+
+static void execlists_unhold(struct intel_engine_cs *engine,
+                            struct i915_request *rq)
+{
+       spin_lock_irq(&engine->active.lock);
+
+       /*
+        * Move this request back to the priority queue, and all of its
+        * children and grandchildren that were suspended along with it.
+        */
+       __execlists_unhold(rq);
+
+       if (rq_prio(rq) > engine->execlists.queue_priority_hint) {
+               engine->execlists.queue_priority_hint = rq_prio(rq);
+               tasklet_hi_schedule(&engine->execlists.tasklet);
+       }
+
+       spin_unlock_irq(&engine->active.lock);
+}
+
+struct execlists_capture {
+       struct work_struct work;
+       struct i915_request *rq;
+       struct i915_gpu_coredump *error;
+};
+
+static void execlists_capture_work(struct work_struct *work)
+{
+       struct execlists_capture *cap = container_of(work, typeof(*cap), work);
+       const gfp_t gfp = GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN;
+       struct intel_engine_cs *engine = cap->rq->engine;
+       struct intel_gt_coredump *gt = cap->error->gt;
+       struct intel_engine_capture_vma *vma;
+
+       /* Compress all the objects attached to the request, slow! */
+       vma = intel_engine_coredump_add_request(gt->engine, cap->rq, gfp);
+       if (vma) {
+               struct i915_vma_compress *compress =
+                       i915_vma_capture_prepare(gt);
+
+               intel_engine_coredump_add_vma(gt->engine, vma, compress);
+               i915_vma_capture_finish(gt, compress);
+       }
+
+       gt->simulated = gt->engine->simulated;
+       cap->error->simulated = gt->simulated;
+
+       /* Publish the error state, and announce it to the world */
+       i915_error_state_store(cap->error);
+       i915_gpu_coredump_put(cap->error);
+
+       /* Return this request and all that depend upon it for signaling */
+       execlists_unhold(engine, cap->rq);
+       i915_request_put(cap->rq);
+
+       kfree(cap);
+}
+
+static struct execlists_capture *capture_regs(struct intel_engine_cs *engine)
+{
+       const gfp_t gfp = GFP_ATOMIC | __GFP_NOWARN;
+       struct execlists_capture *cap;
+
+       cap = kmalloc(sizeof(*cap), gfp);
+       if (!cap)
+               return NULL;
+
+       cap->error = i915_gpu_coredump_alloc(engine->i915, gfp);
+       if (!cap->error)
+               goto err_cap;
+
+       cap->error->gt = intel_gt_coredump_alloc(engine->gt, gfp);
+       if (!cap->error->gt)
+               goto err_gpu;
+
+       cap->error->gt->engine = intel_engine_coredump_alloc(engine, gfp);
+       if (!cap->error->gt->engine)
+               goto err_gt;
+
+       return cap;
+
+err_gt:
+       kfree(cap->error->gt);
+err_gpu:
+       kfree(cap->error);
+err_cap:
+       kfree(cap);
+       return NULL;
+}
+
+static bool execlists_capture(struct intel_engine_cs *engine)
+{
+       struct execlists_capture *cap;
+
+       if (!IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR))
+               return true;
+
+       /*
+        * We need to _quickly_ capture the engine state before we reset.
+        * We are inside an atomic section (softirq) here and we are delaying
+        * the forced preemption event.
+        */
+       cap = capture_regs(engine);
+       if (!cap)
+               return true;
+
+       cap->rq = execlists_active(&engine->execlists);
+       GEM_BUG_ON(!cap->rq);
+
+       rcu_read_lock();
+       cap->rq = active_request(cap->rq->context->timeline, cap->rq);
+       cap->rq = i915_request_get_rcu(cap->rq);
+       rcu_read_unlock();
+       if (!cap->rq)
+               goto err_free;
+
+       /*
+        * Remove the request from the execlists queue, and take ownership
+        * of the request. We pass it to our worker who will _slowly_ compress
+        * all the pages the _user_ requested for debugging their batch, after
+        * which we return it to the queue for signaling.
+        *
+        * By removing them from the execlists queue, we also remove the
+        * requests from being processed by __unwind_incomplete_requests()
+        * during the intel_engine_reset(), and so they will *not* be replayed
+        * afterwards.
+        *
+        * Note that because we have not yet reset the engine at this point,
+        * it is possible that the request we have identified as being
+        * guilty did in fact complete, and we will then hit an arbitration
+        * point allowing the outstanding preemption to succeed. The likelihood
+        * of that is very low (as capturing of the engine registers should be
+        * fast enough to run inside an irq-off atomic section!), so we will
+        * simply hold that request accountable for being non-preemptible
+        * long enough to force the reset.
+        */
+       if (!execlists_hold(engine, cap->rq))
+               goto err_rq;
+
+       INIT_WORK(&cap->work, execlists_capture_work);
+       schedule_work(&cap->work);
+       return true;
+
+err_rq:
+       i915_request_put(cap->rq);
+err_free:
+       i915_gpu_coredump_put(cap->error);
+       kfree(cap);
+       return false;
+}
+
  static noinline void preempt_reset(struct intel_engine_cs *engine)
  {
         const unsigned int bit = I915_RESET_ENGINE + engine->id;
@@ -2368,7 +2678,12 @@ static noinline void preempt_reset(struct intel_engine_cs *engine)
         ENGINE_TRACE(engine, "preempt timeout %lu+%ums\n",
                      READ_ONCE(engine->props.preempt_timeout_ms),
                      jiffies_to_msecs(jiffies - engine->execlists.preempt.expires));
-       intel_engine_reset(engine, "preemption time out");
+
+       ring_set_paused(engine, 1); /* Freeze the current request in place */
+       if (execlists_capture(engine))
+               intel_engine_reset(engine, "preemption time out");
+       else
+               ring_set_paused(engine, 0);
  
         tasklet_enable(&engine->execlists.tasklet);
         clear_and_wake_up_bit(bit, lock);
@@ -2430,11 +2745,12 @@ static void execlists_preempt(struct timer_list *timer)
  }
  
  static void queue_request(struct intel_engine_cs *engine,
-                         struct i915_sched_node *node,
-                         int prio)
+                         struct i915_request *rq)
  {
-       GEM_BUG_ON(!list_empty(&node->link));
-       list_add_tail(&node->link, i915_sched_lookup_priolist(engine, prio));
+       GEM_BUG_ON(!list_empty(&rq->sched.link));
+       list_add_tail(&rq->sched.link,
+                     i915_sched_lookup_priolist(engine, rq_prio(rq)));
+       set_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
  }
  
  static void __submit_queue_imm(struct intel_engine_cs *engine)
@@ -2462,6 +2778,13 @@ static void submit_queue(struct intel_engine_cs *engine,
         __submit_queue_imm(engine);
  }
  
+static bool ancestor_on_hold(const struct intel_engine_cs *engine,
+                            const struct i915_request *rq)
+{
+       GEM_BUG_ON(i915_request_on_hold(rq));
+       return !list_empty(&engine->active.hold) && hold_request(rq);
+}
+
  static void execlists_submit_request(struct i915_request *request)
  {
         struct intel_engine_cs *engine = request->engine;
@@ -2470,12 +2793,17 @@ static void execlists_submit_request(struct i915_request *request)
         /* Will be called from irq-context when using foreign fences. */
         spin_lock_irqsave(&engine->active.lock, flags);
  
-       queue_request(engine, &request->sched, rq_prio(request));
+       if (unlikely(ancestor_on_hold(engine, request))) {
+               list_add_tail(&request->sched.link, &engine->active.hold);
+               i915_request_set_hold(request);
+       } else {
+               queue_request(engine, request);
  
-       GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
-       GEM_BUG_ON(list_empty(&request->sched.link));
+               GEM_BUG_ON(RB_EMPTY_ROOT(&engine->execlists.queue.rb_root));
+               GEM_BUG_ON(list_empty(&request->sched.link));
  
-       submit_queue(engine, request);
+               submit_queue(engine, request);
+       }
  
         spin_unlock_irqrestore(&engine->active.lock, flags);
  }
@@ -2531,21 +2859,21 @@ static void execlists_context_unpin(struct intel_context *ce)
                       ce->engine);
  
         i915_gem_object_unpin_map(ce->state->obj);
-       intel_ring_reset(ce->ring, ce->ring->tail);
  }
  
  static void
  __execlists_update_reg_state(const struct intel_context *ce,
-                            const struct intel_engine_cs *engine)
+                            const struct intel_engine_cs *engine,
+                            u32 head)
  {
         struct intel_ring *ring = ce->ring;
         u32 *regs = ce->lrc_reg_state;
  
-       GEM_BUG_ON(!intel_ring_offset_valid(ring, ring->head));
+       GEM_BUG_ON(!intel_ring_offset_valid(ring, head));
         GEM_BUG_ON(!intel_ring_offset_valid(ring, ring->tail));
  
         regs[CTX_RING_START] = i915_ggtt_offset(ring->vma);
-       regs[CTX_RING_HEAD] = ring->head;
+       regs[CTX_RING_HEAD] = head;
         regs[CTX_RING_TAIL] = ring->tail;
  
         /* RPCS */
@@ -2574,7 +2902,7 @@ __execlists_context_pin(struct intel_context *ce,
  
         ce->lrc_desc = lrc_descriptor(ce, engine) | CTX_DESC_FORCE_RESTORE;
         ce->lrc_reg_state = vaddr + LRC_STATE_PN * PAGE_SIZE;
-       __execlists_update_reg_state(ce, engine);
+       __execlists_update_reg_state(ce, engine, ce->ring->tail);
  
         return 0;
  }
@@ -2615,7 +2943,7 @@ static void execlists_context_reset(struct intel_context *ce)
         /* Scrub away the garbage */
         execlists_init_reg_state(ce->lrc_reg_state,
                                  ce, ce->engine, ce->ring, true);
-       __execlists_update_reg_state(ce, ce->engine);
+       __execlists_update_reg_state(ce, ce->engine, ce->ring->tail);
  
         ce->lrc_desc |= CTX_DESC_FORCE_RESTORE;
  }
@@ -3170,6 +3498,7 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled)
         struct intel_engine_execlists * const execlists = &engine->execlists;
         struct intel_context *ce;
         struct i915_request *rq;
+       u32 head;
  
         mb(); /* paranoia: read the CSB pointers from after the reset */
         clflush(execlists->csb_write);
@@ -3197,15 +3526,15 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled)
  
         if (i915_request_completed(rq)) {
                 /* Idle context; tidy up the ring so we can restart afresh */
-               ce->ring->head = intel_ring_wrap(ce->ring, rq->tail);
+               head = intel_ring_wrap(ce->ring, rq->tail);
                 goto out_replay;
         }
  
         /* Context has requests still in-flight; it should not be idle! */
         GEM_BUG_ON(i915_active_is_idle(&ce->active));
         rq = active_request(ce->timeline, rq);
-       ce->ring->head = intel_ring_wrap(ce->ring, rq->head);
-       GEM_BUG_ON(ce->ring->head == ce->ring->tail);
+       head = intel_ring_wrap(ce->ring, rq->head);
+       GEM_BUG_ON(head == ce->ring->tail);
  
         /*
          * If this request hasn't started yet, e.g. it is waiting on a
@@ -3250,10 +3579,9 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled)
  
  out_replay:
         ENGINE_TRACE(engine, "replay {head:%04x, tail:%04x}\n",
-                    ce->ring->head, ce->ring->tail);
-       intel_ring_update_space(ce->ring);
+                    head, ce->ring->tail);
         __execlists_reset_reg_state(ce, engine);
-       __execlists_update_reg_state(ce, engine);
+       __execlists_update_reg_state(ce, engine, head);
         ce->lrc_desc |= CTX_DESC_FORCE_RESTORE; /* paranoid: GPU was reset! */
  
  unwind:
@@ -3325,6 +3653,10 @@ static void execlists_reset_cancel(struct intel_engine_cs *engine)
                 i915_priolist_free(p);
         }
  
+       /* On-hold requests will be flushed to timeline upon their release */
+       list_for_each_entry(rq, &engine->active.hold, sched.link)
+               mark_eio(rq);
+
         /* Cancel all attached virtual engines */
         while ((rb = rb_first_cached(&execlists->virtual))) {
                 struct virtual_engine *ve =
@@ -4892,10 +5224,7 @@ void intel_lr_context_reset(struct intel_engine_cs *engine,
                 restore_default_state(ce, engine);
  
         /* Rerun the request; its payload has been neutered (if guilty). */
-       ce->ring->head = head;
-       intel_ring_update_space(ce->ring);
-
-       __execlists_update_reg_state(ce, engine);
+       __execlists_update_reg_state(ce, engine, head);
  }
  
  bool
diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c b/drivers/gpu/drm/i915/gt/intel_ring.c

index 374b28f13ca0b4565adc32cd621ce0d69ad81d2c..6ff803f397c4d7015b6bfdb84785f5c6b355b0ea 100644 (file)
--- a/drivers/gpu/drm/i915/gt/intel_ring.c
+++ b/drivers/gpu/drm/i915/gt/intel_ring.c
@@ -145,6 +145,7 @@ intel_engine_create_ring(struct intel_engine_cs *engine, int size)
  
         kref_init(&ring->ref);
         ring->size = size;
+       ring->wrap = BITS_PER_TYPE(ring->size) - ilog2(size);
  
         /*
          * Workaround an erratum on the i830 which causes a hang if
diff --git a/drivers/gpu/drm/i915/gt/intel_ring.h b/drivers/gpu/drm/i915/gt/intel_ring.h

index ea2839d9e044546957ea19133139c025375d9055..5bdce24994aa04ef80306a43e180c3599d81ee7c 100644 (file)
--- a/drivers/gpu/drm/i915/gt/intel_ring.h
+++ b/drivers/gpu/drm/i915/gt/intel_ring.h
@@ -56,6 +56,14 @@ static inline u32 intel_ring_wrap(const struct intel_ring *ring, u32 pos)
         return pos & (ring->size - 1);
  }
  
+static inline int intel_ring_direction(const struct intel_ring *ring,
+                                      u32 next, u32 prev)
+{
+       typecheck(typeof(ring->size), next);
+       typecheck(typeof(ring->size), prev);
+       return (next - prev) << ring->wrap;
+}
+
  static inline bool
  intel_ring_offset_valid(const struct intel_ring *ring,
                         unsigned int pos)
diff --git a/drivers/gpu/drm/i915/gt/intel_ring_types.h b/drivers/gpu/drm/i915/gt/intel_ring_types.h

index d9f17f38e0cce9e2a05adf2fe7a1efc00b3c7709..1a189ea00fd8240bbdf4fcdd45822f7e6240d787 100644 (file)
--- a/drivers/gpu/drm/i915/gt/intel_ring_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_ring_types.h
@@ -39,12 +39,13 @@ struct intel_ring {
          */
         atomic_t pin_count;
  
-       u32 head;
-       u32 tail;
-       u32 emit;
+       u32 head; /* updated during retire, loosely tracks RING_HEAD */
+       u32 tail; /* updated on submission, used for RING_TAIL */
+       u32 emit; /* updated during request construction */
  
         u32 space;
         u32 size;
+       u32 wrap;
         u32 effective_size;
  };
  
diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c b/drivers/gpu/drm/i915/gt/mock_engine.c

index a560b7eee2cd087b45e0e023f84eb971f594b902..f2806381733f55eacef5e5402711838c4682dfae 100644 (file)
--- a/drivers/gpu/drm/i915/gt/mock_engine.c
+++ b/drivers/gpu/drm/i915/gt/mock_engine.c
@@ -59,11 +59,26 @@ static struct intel_ring *mock_ring(struct intel_engine_cs *engine)
         ring->vaddr = (void *)(ring + 1);
         atomic_set(&ring->pin_count, 1);
  
+       ring->vma = i915_vma_alloc();
+       if (!ring->vma) {
+               kfree(ring);
+               return NULL;
+       }
+       i915_active_init(&ring->vma->active, NULL, NULL);
+
         intel_ring_update_space(ring);
  
         return ring;
  }
  
+static void mock_ring_free(struct intel_ring *ring)
+{
+       i915_active_fini(&ring->vma->active);
+       i915_vma_free(ring->vma);
+
+       kfree(ring);
+}
+
  static struct i915_request *first_request(struct mock_engine *engine)
  {
         return list_first_entry_or_null(&engine->hw_queue,
@@ -121,7 +136,7 @@ static void mock_context_destroy(struct kref *ref)
         GEM_BUG_ON(intel_context_is_pinned(ce));
  
         if (test_bit(CONTEXT_ALLOC_BIT, &ce->flags)) {
-               kfree(ce->ring);
+               mock_ring_free(ce->ring);
                 mock_timeline_unpin(ce->timeline);
         }
  
diff --git a/drivers/gpu/drm/i915/gt/selftest_lrc.c b/drivers/gpu/drm/i915/gt/selftest_lrc.c

index 15cda024e3e4577424a4c8580b03bfee2f3027d3..b292f8cbd0bf15eaa8a2f47b8b159069b852c40a 100644 (file)
--- a/drivers/gpu/drm/i915/gt/selftest_lrc.c
+++ b/drivers/gpu/drm/i915/gt/selftest_lrc.c
@@ -186,7 +186,7 @@ static int live_unlite_restore(struct intel_gt *gt, int prio)
                 }
                 GEM_BUG_ON(!ce[1]->ring->size);
                 intel_ring_reset(ce[1]->ring, ce[1]->ring->size / 2);
-               __execlists_update_reg_state(ce[1], engine);
+               __execlists_update_reg_state(ce[1], engine, ce[1]->ring->head);
  
                 rq[0] = igt_spinner_create_request(&spin, ce[0], MI_ARB_CHECK);
                 if (IS_ERR(rq[0])) {
@@ -285,6 +285,107 @@ static int live_unlite_preempt(void *arg)
         return live_unlite_restore(arg, I915_USER_PRIORITY(I915_PRIORITY_MAX));
  }
  
+static int live_hold_reset(void *arg)
+{
+       struct intel_gt *gt = arg;
+       struct intel_engine_cs *engine;
+       enum intel_engine_id id;
+       struct igt_spinner spin;
+       int err = 0;
+
+       /*
+        * In order to support offline error capture for fast preempt reset,
+        * we need to decouple the guilty request and ensure that it and its
+        * descendants are not executed while the capture is in progress.
+        */
+
+       if (!intel_has_reset_engine(gt))
+               return 0;
+
+       if (igt_spinner_init(&spin, gt))
+               return -ENOMEM;
+
+       for_each_engine(engine, gt, id) {
+               struct intel_context *ce;
+               unsigned long heartbeat;
+               struct i915_request *rq;
+
+               ce = intel_context_create(engine);
+               if (IS_ERR(ce)) {
+                       err = PTR_ERR(ce);
+                       break;
+               }
+
+               engine_heartbeat_disable(engine, &heartbeat);
+
+               rq = igt_spinner_create_request(&spin, ce, MI_ARB_CHECK);
+               if (IS_ERR(rq)) {
+                       err = PTR_ERR(rq);
+                       goto out;
+               }
+               i915_request_add(rq);
+
+               if (!igt_wait_for_spinner(&spin, rq)) {
+                       intel_gt_set_wedged(gt);
+                       err = -ETIME;
+                       goto out;
+               }
+
+               /* We have our request executing, now remove it and reset */
+
+               if (test_and_set_bit(I915_RESET_ENGINE + id,
+                                    &gt->reset.flags)) {
+                       intel_gt_set_wedged(gt);
+                       err = -EBUSY;
+                       goto out;
+               }
+               tasklet_disable(&engine->execlists.tasklet);
+
+               engine->execlists.tasklet.func(engine->execlists.tasklet.data);
+               GEM_BUG_ON(execlists_active(&engine->execlists) != rq);
+
+               i915_request_get(rq);
+               execlists_hold(engine, rq);
+               GEM_BUG_ON(!i915_request_on_hold(rq));
+
+               intel_engine_reset(engine, NULL);
+               GEM_BUG_ON(rq->fence.error != -EIO);
+
+               tasklet_enable(&engine->execlists.tasklet);
+               clear_and_wake_up_bit(I915_RESET_ENGINE + id,
+                                     &gt->reset.flags);
+
+               /* Check that we do not resubmit the held request */
+               if (!i915_request_wait(rq, 0, HZ / 5)) {
+                       pr_err("%s: on hold request completed!\n",
+                              engine->name);
+                       i915_request_put(rq);
+                       err = -EIO;
+                       goto out;
+               }
+               GEM_BUG_ON(!i915_request_on_hold(rq));
+
+               /* But is resubmitted on release */
+               execlists_unhold(engine, rq);
+               if (i915_request_wait(rq, 0, HZ / 5) < 0) {
+                       pr_err("%s: held request did not complete!\n",
+                              engine->name);
+                       intel_gt_set_wedged(gt);
+                       err = -ETIME;
+               }
+               i915_request_put(rq);
+
+out:
+               engine_heartbeat_enable(engine, heartbeat);
+               intel_context_put(ce);
+               if (err)
+                       break;
+       }
+
+       igt_spinner_fini(&spin);
+       return err;
+}
+
  static int
  emit_semaphore_chain(struct i915_request *rq, struct i915_vma *vma, int idx)
  {
@@ -3309,12 +3410,168 @@ static int live_virtual_bond(void *arg)
         return 0;
  }
  
+static int reset_virtual_engine(struct intel_gt *gt,
+                               struct intel_engine_cs **siblings,
+                               unsigned int nsibling)
+{
+       struct intel_engine_cs *engine;
+       struct intel_context *ve;
+       unsigned long *heartbeat;
+       struct igt_spinner spin;
+       struct i915_request *rq;
+       unsigned int n;
+       int err = 0;
+
+       /*
+        * In order to support offline error capture for fast preempt reset,
+        * we need to decouple the guilty request and ensure that it and its
+        * descendants are not executed while the capture is in progress.
+        */
+
+       heartbeat = kmalloc_array(nsibling, sizeof(*heartbeat), GFP_KERNEL);
+       if (!heartbeat)
+               return -ENOMEM;
+
+       if (igt_spinner_init(&spin, gt)) {
+               err = -ENOMEM;
+               goto out_free;
+       }
+
+       ve = intel_execlists_create_virtual(siblings, nsibling);
+       if (IS_ERR(ve)) {
+               err = PTR_ERR(ve);
+               goto out_spin;
+       }
+
+       for (n = 0; n < nsibling; n++)
+               engine_heartbeat_disable(siblings[n], &heartbeat[n]);
+
+       rq = igt_spinner_create_request(&spin, ve, MI_ARB_CHECK);
+       if (IS_ERR(rq)) {
+               err = PTR_ERR(rq);
+               goto out_heartbeat;
+       }
+       i915_request_add(rq);
+
+       if (!igt_wait_for_spinner(&spin, rq)) {
+               intel_gt_set_wedged(gt);
+               err = -ETIME;
+               goto out_heartbeat;
+       }
+
+       engine = rq->engine;
+       GEM_BUG_ON(engine == ve->engine);
+
+       /* Take ownership of the reset and tasklet */
+       if (test_and_set_bit(I915_RESET_ENGINE + engine->id,
+                            &gt->reset.flags)) {
+               intel_gt_set_wedged(gt);
+               err = -EBUSY;
+               goto out_heartbeat;
+       }
+       tasklet_disable(&engine->execlists.tasklet);
+
+       engine->execlists.tasklet.func(engine->execlists.tasklet.data);
+       GEM_BUG_ON(execlists_active(&engine->execlists) != rq);
+
+       /* Fake a preemption event; failed of course */
+       spin_lock_irq(&engine->active.lock);
+       __unwind_incomplete_requests(engine);
+       spin_unlock_irq(&engine->active.lock);
+       GEM_BUG_ON(rq->engine != ve->engine);
+
+       /* Reset the engine while keeping our active request on hold */
+       execlists_hold(engine, rq);
+       GEM_BUG_ON(!i915_request_on_hold(rq));
+
+       intel_engine_reset(engine, NULL);
+       GEM_BUG_ON(rq->fence.error != -EIO);
+
+       /* Release our grasp on the engine, letting CS flow again */
+       tasklet_enable(&engine->execlists.tasklet);
+       clear_and_wake_up_bit(I915_RESET_ENGINE + engine->id, &gt->reset.flags);
+
+       /* Check that we do not resubmit the held request */
+       i915_request_get(rq);
+       if (!i915_request_wait(rq, 0, HZ / 5)) {
+               pr_err("%s: on hold request completed!\n",
+                      engine->name);
+               intel_gt_set_wedged(gt);
+               err = -EIO;
+               goto out_rq;
+       }
+       GEM_BUG_ON(!i915_request_on_hold(rq));
+
+       /* But is resubmitted on release */
+       execlists_unhold(engine, rq);
+       if (i915_request_wait(rq, 0, HZ / 5) < 0) {
+               pr_err("%s: held request did not complete!\n",
+                      engine->name);
+               intel_gt_set_wedged(gt);
+               err = -ETIME;
+       }
+
+out_rq:
+       i915_request_put(rq);
+out_heartbeat:
+       for (n = 0; n < nsibling; n++)
+               engine_heartbeat_enable(siblings[n], heartbeat[n]);
+
+       intel_context_put(ve);
+out_spin:
+       igt_spinner_fini(&spin);
+out_free:
+       kfree(heartbeat);
+       return err;
+}
+
+static int live_virtual_reset(void *arg)
+{
+       struct intel_gt *gt = arg;
+       struct intel_engine_cs *siblings[MAX_ENGINE_INSTANCE + 1];
+       unsigned int class, inst;
+
+       /*
+        * Check that we handle a reset event within a virtual engine.
+        * Only the physical engine is reset, but we have to check the flow
+        * of the virtual requests around the reset, and make sure it is not
+        * forgotten.
+        */
+
+       if (USES_GUC_SUBMISSION(gt->i915))
+               return 0;
+
+       if (!intel_has_reset_engine(gt))
+               return 0;
+
+       for (class = 0; class <= MAX_ENGINE_CLASS; class++) {
+               int nsibling, err;
+
+               nsibling = 0;
+               for (inst = 0; inst <= MAX_ENGINE_INSTANCE; inst++) {
+                       if (!gt->engine_class[class][inst])
+                               continue;
+
+                       siblings[nsibling++] = gt->engine_class[class][inst];
+               }
+               if (nsibling < 2)
+                       continue;
+
+               err = reset_virtual_engine(gt, siblings, nsibling);
+               if (err)
+                       return err;
+       }
+
+       return 0;
+}
+
  int intel_execlists_live_selftests(struct drm_i915_private *i915)
  {
         static const struct i915_subtest tests[] = {
                 SUBTEST(live_sanitycheck),
                 SUBTEST(live_unlite_switch),
                 SUBTEST(live_unlite_preempt),
+               SUBTEST(live_hold_reset),
                 SUBTEST(live_timeslice_preempt),
                 SUBTEST(live_timeslice_queue),
                 SUBTEST(live_busywait_preempt),
@@ -3333,6 +3590,7 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
                 SUBTEST(live_virtual_mask),
                 SUBTEST(live_virtual_preserved),
                 SUBTEST(live_virtual_bond),
+               SUBTEST(live_virtual_reset),
         };
  
         if (!HAS_EXECLISTS(i915))
diff --git a/drivers/gpu/drm/i915/gvt/dmabuf.c b/drivers/gpu/drm/i915/gvt/dmabuf.c

index 2477a1e5a1669cf220ccb85b062885714c100d3d..ae139f0877aeb72a098657ea296f876f6b8194a0 100644 (file)
--- a/drivers/gpu/drm/i915/gvt/dmabuf.c
+++ b/drivers/gpu/drm/i915/gvt/dmabuf.c
@@ -151,12 +151,12 @@ static void dmabuf_gem_object_free(struct kref *kref)
                         dmabuf_obj = container_of(pos,
                                         struct intel_vgpu_dmabuf_obj, list);
                         if (dmabuf_obj == obj) {
+                               list_del(pos);
                                 intel_gvt_hypervisor_put_vfio_device(vgpu);
                                 idr_remove(&vgpu->object_idr,
                                            dmabuf_obj->dmabuf_id);
                                 kfree(dmabuf_obj->info);
                                 kfree(dmabuf_obj);
-                               list_del(pos);
                                 break;
                         }
                 }
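
Note on the dmabuf.c hunk above: it moves list_del(pos) ahead of the kfree() calls. Assuming pos points at a list node embedded in dmabuf_obj (as the container_of() call suggests), unlinking after the free would write through freed memory. The following is a minimal userspace sketch of the "unlink first, free second" ordering. The names list_node, tracked_obj, add_obj, remove_obj and list_count are hypothetical stand-ins, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the kernel's struct list_head. */
struct list_node {
	struct list_node *prev, *next;
};

/* The list node is embedded, so it shares the object's allocation. */
struct tracked_obj {
	int id;
	struct list_node link;
};

static void list_init(struct list_node *head)
{
	assert(head);
	head->prev = head->next = head;
}

static void list_unlink(struct list_node *node)
{
	assert(node);
	node->prev->next = node->next;
	node->next->prev = node->prev;
}

/* Allocate an object and append it to the list; returns 0 on success. */
static int add_obj(struct list_node *head, int id)
{
	struct tracked_obj *obj = malloc(sizeof(*obj));

	if (!obj)
		return -1;

	obj->id = id;
	obj->link.prev = head->prev;
	obj->link.next = head;
	head->prev->next = &obj->link;
	head->prev = &obj->link;
	return 0;
}

/* Remove the object with the given id; returns 1 if found, 0 otherwise. */
static int remove_obj(struct list_node *head, int id)
{
	struct list_node *pos;

	for (pos = head->next; pos != head; pos = pos->next) {
		struct tracked_obj *obj = (struct tracked_obj *)
			((char *)pos - offsetof(struct tracked_obj, link));

		if (obj->id != id)
			continue;

		list_unlink(pos);	/* reads and writes via pos: do it before free() */
		free(obj);		/* pos is dangling from here on */
		return 1;
	}

	return 0;
}

/* Count the entries currently linked on the list. */
static size_t list_count(const struct list_node *head)
{
	const struct list_node *pos;
	size_t n = 0;

	for (pos = head->next; pos != head; pos = pos->next)
		n++;
	return n;
}
```

Returning straight after free() also keeps the loop from advancing through the dangling pos, which mirrors the break in the hunk.
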
diff --git a/drivers/gpu/drm/i915/gvt/firmware.c b/drivers/gpu/drm/i915/gvt/firmware.c

index 049775e8e350eaf4179892bd8fe989dc790a79c0..b0c1fda32977ce26240954935767e4edfc51b7b9 100644 (file)
--- a/drivers/gpu/drm/i915/gvt/firmware.c
+++ b/drivers/gpu/drm/i915/gvt/firmware.c
@@ -146,7 +146,7 @@ void intel_gvt_free_firmware(struct intel_gvt *gvt)
                 clean_firmware_sysfs(gvt);
  
         kfree(gvt->firmware.cfg_space);
-       kfree(gvt->firmware.mmio);
+       vfree(gvt->firmware.mmio);
  }
  
  static int verify_firmware(struct intel_gvt *gvt,
@@ -229,7 +229,7 @@ int intel_gvt_load_firmware(struct intel_gvt *gvt)
  
         firmware->cfg_space = mem;
  
-       mem = kmalloc(info->mmio_size, GFP_KERNEL);
+       mem = vmalloc(info->mmio_size);
         if (!mem) {
                 kfree(path);
                 kfree(firmware->cfg_space);
diff --git a/drivers/gpu/drm/i915/gvt/gtt.c b/drivers/gpu/drm/i915/gvt/gtt.c

index 34cb404ba4b789ca97c86794d90c24edf3cf95ef..4a4828074cb708b09b764f295f28f80fa137d14c 100644 (file)
--- a/drivers/gpu/drm/i915/gvt/gtt.c
+++ b/drivers/gpu/drm/i915/gvt/gtt.c
@@ -1956,7 +1956,11 @@ void _intel_vgpu_mm_release(struct kref *mm_ref)
  
         if (mm->type == INTEL_GVT_MM_PPGTT) {
                 list_del(&mm->ppgtt_mm.list);
+
+               mutex_lock(&mm->vgpu->gvt->gtt.ppgtt_mm_lock);
                 list_del(&mm->ppgtt_mm.lru_list);
+               mutex_unlock(&mm->vgpu->gvt->gtt.ppgtt_mm_lock);
+
                 invalidate_ppgtt_mm(mm);
         } else {
                 vfree(mm->ggtt_mm.virtual_ggtt);
diff --git a/drivers/gpu/drm/i915/gvt/vgpu.c b/drivers/gpu/drm/i915/gvt/vgpu.c

index 85bd9bf4f6eee58b3c00e567a7c5286a017dccb9..487af6ea9972c44855fb1259935869ee541f2458 100644 (file)
--- a/drivers/gpu/drm/i915/gvt/vgpu.c
+++ b/drivers/gpu/drm/i915/gvt/vgpu.c
@@ -560,9 +560,9 @@ void intel_gvt_reset_vgpu_locked(struct intel_vgpu *vgpu, bool dmlr,
  
                 intel_vgpu_reset_mmio(vgpu, dmlr);
                 populate_pvinfo_page(vgpu);
-               intel_vgpu_reset_display(vgpu);
  
                 if (dmlr) {
+                       intel_vgpu_reset_display(vgpu);
                         intel_vgpu_reset_cfg_space(vgpu);
                         /* only reset the failsafe mode when dmlr reset */
                         vgpu->failsafe = false;
diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c

index f3da5c06f331a5cf7df35ac3c59a7a587ccbc81c..b0a499753526615c9175d6b276ecb7089df87c38 100644 (file)
--- a/drivers/gpu/drm/i915/i915_active.c
+++ b/drivers/gpu/drm/i915/i915_active.c
@@ -416,13 +416,15 @@ int i915_active_acquire(struct i915_active *ref)
         if (err)
                 return err;
  
-       if (!atomic_read(&ref->count) && ref->active)
-               err = ref->active(ref);
-       if (!err) {
-               spin_lock_irq(&ref->tree_lock); /* vs __active_retire() */
-               debug_active_activate(ref);
-               atomic_inc(&ref->count);
-               spin_unlock_irq(&ref->tree_lock);
+       if (likely(!i915_active_acquire_if_busy(ref))) {
+               if (ref->active)
+                       err = ref->active(ref);
+               if (!err) {
+                       spin_lock_irq(&ref->tree_lock); /* __active_retire() */
+                       debug_active_activate(ref);
+                       atomic_inc(&ref->count);
+                       spin_unlock_irq(&ref->tree_lock);
+               }
         }
  
         mutex_unlock(&ref->mutex);
@@ -605,7 +607,7 @@ int i915_active_acquire_preallocate_barrier(struct i915_active *ref,
                                             struct intel_engine_cs *engine)
  {
         intel_engine_mask_t tmp, mask = engine->mask;
-       struct llist_node *pos = NULL, *next;
+       struct llist_node *first = NULL, *last = NULL;
         struct intel_gt *gt = engine->gt;
         int err;
  
@@ -623,6 +625,7 @@ int i915_active_acquire_preallocate_barrier(struct i915_active *ref,
          */
         for_each_engine_masked(engine, gt, mask, tmp) {
                 u64 idx = engine->kernel_context->timeline->fence_context;
+               struct llist_node *prev = first;
                 struct active_node *node;
  
                 node = reuse_idle_barrier(ref, idx);
@@ -656,23 +659,23 @@ int i915_active_acquire_preallocate_barrier(struct i915_active *ref,
                 GEM_BUG_ON(rcu_access_pointer(node->base.fence) != ERR_PTR(-EAGAIN));
  
                 GEM_BUG_ON(barrier_to_engine(node) != engine);
-               next = barrier_to_ll(node);
-               next->next = pos;
-               if (!pos)
-                       pos = next;
+               first = barrier_to_ll(node);
+               first->next = prev;
+               if (!last)
+                       last = first;
                 intel_engine_pm_get(engine);
         }
  
         GEM_BUG_ON(!llist_empty(&ref->preallocated_barriers));
-       llist_add_batch(next, pos, &ref->preallocated_barriers);
+       llist_add_batch(first, last, &ref->preallocated_barriers);
  
         return 0;
  
  unwind:
-       while (pos) {
-               struct active_node *node = barrier_from_ll(pos);
+       while (first) {
+               struct active_node *node = barrier_from_ll(first);
  
-               pos = pos->next;
+               first = first->next;
  
                 atomic_dec(&ref->count);
                 intel_engine_pm_put(barrier_to_engine(node));
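
Note on the i915_active.c hunks above: the barrier chain is now built by prepending each node while tracking both ends (first, last), and the pair is then handed to llist_add_batch(). In the removed lines pos was assigned only on the first iteration, so every later node was linked to that first node rather than to the node added just before it. The sketch below shows the prepend-and-track pattern in plain userspace C. The names snode, chain, chain_prepend and chain_splice are hypothetical, and the splice is not atomic, unlike the kernel's lock-less list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical singly linked node, standing in for struct llist_node. */
struct snode {
	struct snode *next;
	int value;
};

/* A chain under construction: first is the newest node, last the oldest. */
struct chain {
	struct snode *first, *last;
};

/* Prepend one node, keeping both ends of the chain up to date. */
static void chain_prepend(struct chain *c, struct snode *node)
{
	assert(c && node);

	node->next = c->first;	/* new node points at the previous head */
	c->first = node;
	if (!c->last)		/* the first node ever added stays the tail */
		c->last = node;
}

/*
 * Splice the whole chain onto the front of *head in one step.
 * This mirrors the shape of a batch add (new_first, new_last, head),
 * but without the atomic compare-and-exchange a lock-less list needs.
 */
static void chain_splice(struct chain *c, struct snode **head)
{
	assert(c && head);

	if (!c->first)
		return;

	c->last->next = *head;
	*head = c->first;
	c->first = c->last = NULL;
}
```

Because first always names the most recently added node, an error path can walk from first through ->next and visit every node added so far, which matches the shape of the unwind loop in the hunk.
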
diff --git a/drivers/gpu/drm/i915/i915_active.h b/drivers/gpu/drm/i915/i915_active.h

index b571f675c7956a2938bd16760fe9352fd4944024..51e1e854ca5575157d0b84fc6e9077642712a9e1 100644 (file)
--- a/drivers/gpu/drm/i915/i915_active.h
+++ b/drivers/gpu/drm/i915/i915_active.h
@@ -188,6 +188,12 @@ int i915_active_acquire(struct i915_active *ref);
  bool i915_active_acquire_if_busy(struct i915_active *ref);
  void i915_active_release(struct i915_active *ref);
  
+static inline void __i915_active_acquire(struct i915_active *ref)
+{
+       GEM_BUG_ON(!atomic_read(&ref->count));
+       atomic_inc(&ref->count);
+}
+
  static inline bool
  i915_active_is_idle(const struct i915_active *ref)
  {
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c

index 94f993e4c12f5cb687513efee0a2822afd20d734..5f6e639528219378a7a1346c8b14ce7df9bf2da6 100644 (file)
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -180,7 +180,7 @@ i915_gem_phys_pwrite(struct drm_i915_gem_object *obj,
                      struct drm_i915_gem_pwrite *args,
                      struct drm_file *file)
  {
-       void *vaddr = obj->phys_handle->vaddr + args->offset;
+       void *vaddr = sg_page(obj->mm.pages->sgl) + args->offset;
         char __user *user_data = u64_to_user_ptr(args->data_ptr);
  
         /*
@@ -265,7 +265,10 @@ i915_gem_dumb_create(struct drm_file *file,
                                                     DRM_FORMAT_MOD_LINEAR))
                 args->pitch = ALIGN(args->pitch, 4096);
  
-       args->size = args->pitch * args->height;
+       if (args->pitch < args->width)
+               return -EINVAL;
+
+       args->size = mul_u32_u32(args->pitch, args->height);
  
         mem_type = INTEL_MEMORY_SYSTEM;
         if (HAS_LMEM(to_i915(dev)))
@@ -841,10 +844,10 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
                 ret = i915_gem_gtt_pwrite_fast(obj, args);
  
         if (ret == -EFAULT || ret == -ENOSPC) {
-               if (obj->phys_handle)
-                       ret = i915_gem_phys_pwrite(obj, args, file);
-               else
+               if (i915_gem_object_has_struct_page(obj))
                         ret = i915_gem_shmem_pwrite(obj, args);
+               else
+                       ret = i915_gem_phys_pwrite(obj, args, file);
         }
  
         i915_gem_object_unpin_pages(obj);
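
Note on the i915_gem_dumb_create() hunk above: the size is now computed with mul_u32_u32(), and a pitch smaller than the width is rejected. The sketch below illustrates why a widening multiply matters when both operands are 32 bits wide. It assumes a platform where int is 32 bits; wide_mul, narrow_mul and dumb_size are hypothetical names, and wide_mul only approximates what mul_u32_u32() is understood to do.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Product evaluated in 32 bits: wraps modulo 2^32 before being widened. */
static uint64_t narrow_mul(uint32_t a, uint32_t b)
{
	return a * b;
}

/* Widen one operand first, so the product is evaluated in 64 bits. */
static uint64_t wide_mul(uint32_t a, uint32_t b)
{
	return (uint64_t)a * b;
}

/*
 * Hypothetical stand-in for the size calculation: reject a pitch smaller
 * than the width, then compute pitch * height without 32-bit wraparound.
 */
static int dumb_size(uint32_t width, uint32_t height, uint32_t pitch,
		     uint64_t *size)
{
	assert(size);

	if (pitch < width)
		return -EINVAL;

	*size = wide_mul(pitch, height);
	return 0;
}
```

With pitch = height = 0x10000 the narrow product wraps to zero while the wide one yields 0x100000000, so a truncated size could otherwise pass later checks.
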
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c

index 4c1836f0a9911bedb73ee91c583ddb2aaccea705..9e401a5fcae8c9e64f7c244d71df6a261de83e21 100644 (file)
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1681,7 +1681,7 @@ static const char *error_msg(struct i915_gpu_coredump *error)
                         "GPU HANG: ecode %d:%x:%08x",
                         INTEL_GEN(error->i915), engines,
                         generate_ecode(first));
-       if (first) {
+       if (first && first->context.pid) {
                 /* Just show the first executing process, more is confusing */
                 len += scnprintf(error->error_msg + len,
                                  sizeof(error->error_msg) - len,
@@ -1852,7 +1852,8 @@ void i915_error_state_store(struct i915_gpu_coredump *error)
         if (!xchg(&warned, true) &&
             ktime_get_real_seconds() - DRIVER_TIMESTAMP < DAY_AS_SECONDS(180)) {
                 pr_info("GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.\n");
-               pr_info("Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel\n");
+               pr_info("Please file a _new_ bug report at https://gitlab.freedesktop.org/drm/intel/issues/new.\n");
+               pr_info("Please see https://gitlab.freedesktop.org/drm/intel/-/wikis/How-to-file-i915-bugs for details.\n");
                 pr_info("drm/i915 developers can then reassign to the right component if it's not a kernel issue.\n");
                 pr_info("The GPU crash dump is required to analyze GPU hangs, so please always attach it.\n");
                 pr_info("GPU crash dump saved to /sys/class/drm/card%d/error\n",
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.h b/drivers/gpu/drm/i915/i915_gpu_error.h

index 9109004956bd1e6b8c773e91f47cd4764bff47c3..e4a6afed3bbf6bd4fde09afd9bf5617101f9be6c 100644 (file)
--- a/drivers/gpu/drm/i915/i915_gpu_error.h
+++ b/drivers/gpu/drm/i915/i915_gpu_error.h
@@ -314,8 +314,11 @@ i915_vma_capture_finish(struct intel_gt_coredump *gt,
  }
  
  static inline void
-i915_error_state_store(struct drm_i915_private *i915,
-                      struct i915_gpu_coredump *error)
+i915_error_state_store(struct i915_gpu_coredump *error)
+{
+}
+
+static inline void i915_gpu_coredump_put(struct i915_gpu_coredump *gpu)
  {
  }
  
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c

index 83f01401b8b5c5758dea58f7c8ed5dcf6c22c21c..f631f6d2112746d408af0945e4eedad3e03a037b 100644 (file)
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -437,7 +437,7 @@ static const struct intel_device_info snb_m_gt2_info = {
         .has_rc6 = 1, \
         .has_rc6p = 1, \
         .has_rps = true, \
-       .ppgtt_type = INTEL_PPGTT_FULL, \
+       .ppgtt_type = INTEL_PPGTT_ALIASING, \
         .ppgtt_size = 31, \
         IVB_PIPE_OFFSETS, \
         IVB_CURSOR_OFFSETS, \
@@ -494,7 +494,7 @@ static const struct intel_device_info vlv_info = {
         .has_rps = true,
         .display.has_gmch = 1,
         .display.has_hotplug = 1,
-       .ppgtt_type = INTEL_PPGTT_FULL,
+       .ppgtt_type = INTEL_PPGTT_ALIASING,
         .ppgtt_size = 31,
         .has_snoop = true,
         .has_coherent_ggtt = false,
diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c

index 28a82c849bacbc2a43fc1fff4236b8981d349d1e..aa729d04abe2ecba8b74b10600a850d020dfbcd6 100644 (file)
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -637,8 +637,10 @@ static void i915_pmu_enable(struct perf_event *event)
                 container_of(event->pmu, typeof(*i915), pmu.base);
         unsigned int bit = event_enabled_bit(event);
         struct i915_pmu *pmu = &i915->pmu;
+       intel_wakeref_t wakeref;
         unsigned long flags;
  
+       wakeref = intel_runtime_pm_get(&i915->runtime_pm);
         spin_lock_irqsave(&pmu->lock, flags);
  
         /*
@@ -648,6 +650,14 @@ static void i915_pmu_enable(struct perf_event *event)
         BUILD_BUG_ON(ARRAY_SIZE(pmu->enable_count) != I915_PMU_MASK_BITS);
         GEM_BUG_ON(bit >= ARRAY_SIZE(pmu->enable_count));
         GEM_BUG_ON(pmu->enable_count[bit] == ~0);
+
+       if (pmu->enable_count[bit] == 0 &&
+           config_enabled_mask(I915_PMU_RC6_RESIDENCY) & BIT_ULL(bit)) {
+               pmu->sample[__I915_SAMPLE_RC6_LAST_REPORTED].cur = 0;
+               pmu->sample[__I915_SAMPLE_RC6].cur = __get_rc6(&i915->gt);
+               pmu->sleep_last = ktime_get();
+       }
+
         pmu->enable |= BIT_ULL(bit);
         pmu->enable_count[bit]++;
  
@@ -688,6 +698,8 @@ static void i915_pmu_enable(struct perf_event *event)
          * an existing non-zero value.
          */
         local64_set(&event->hw.prev_count, __i915_pmu_event_read(event));
+
+       intel_runtime_pm_put(&i915->runtime_pm, wakeref);
  }
  
  static void i915_pmu_disable(struct perf_event *event)
@@ -810,11 +822,6 @@ static ssize_t i915_pmu_event_show(struct device *dev,
         return sprintf(buf, "config=0x%lx\n", eattr->val);
  }
  
-static struct attribute_group i915_pmu_events_attr_group = {
-       .name = "events",
-       /* Patch in attrs at runtime. */
-};
-
  static ssize_t
  i915_pmu_get_attr_cpumask(struct device *dev,
                           struct device_attribute *attr,
@@ -834,13 +841,6 @@ static const struct attribute_group i915_pmu_cpumask_attr_group = {
         .attrs = i915_cpumask_attrs,
  };
  
-static const struct attribute_group *i915_pmu_attr_groups[] = {
-       &i915_pmu_format_attr_group,
-       &i915_pmu_events_attr_group,
-       &i915_pmu_cpumask_attr_group,
-       NULL
-};
-
  #define __event(__config, __name, __unit) \
  { \
         .config = (__config), \
@@ -1014,23 +1014,23 @@ err_alloc:
  
  static void free_event_attributes(struct i915_pmu *pmu)
  {
-       struct attribute **attr_iter = i915_pmu_events_attr_group.attrs;
+       struct attribute **attr_iter = pmu->events_attr_group.attrs;
  
         for (; *attr_iter; attr_iter++)
                 kfree((*attr_iter)->name);
  
-       kfree(i915_pmu_events_attr_group.attrs);
+       kfree(pmu->events_attr_group.attrs);
         kfree(pmu->i915_attr);
         kfree(pmu->pmu_attr);
  
-       i915_pmu_events_attr_group.attrs = NULL;
+       pmu->events_attr_group.attrs = NULL;
         pmu->i915_attr = NULL;
         pmu->pmu_attr = NULL;
  }
  
  static int i915_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
  {
-       struct i915_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), node);
+       struct i915_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), cpuhp.node);
  
         GEM_BUG_ON(!pmu->base.event_init);
  
@@ -1043,7 +1043,7 @@ static int i915_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
  
  static int i915_pmu_cpu_offline(unsigned int cpu, struct hlist_node *node)
  {
-       struct i915_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), node);
+       struct i915_pmu *pmu = hlist_entry_safe(node, typeof(*pmu), cpuhp.node);
         unsigned int target;
  
         GEM_BUG_ON(!pmu->base.event_init);
@@ -1060,8 +1060,6 @@ static int i915_pmu_cpu_offline(unsigned int cpu, struct hlist_node *node)
         return 0;
  }
  
-static enum cpuhp_state cpuhp_slot = CPUHP_INVALID;
-
  static int i915_pmu_register_cpuhp_state(struct i915_pmu *pmu)
  {
         enum cpuhp_state slot;
@@ -1075,21 +1073,22 @@ static int i915_pmu_register_cpuhp_state(struct i915_pmu *pmu)
                 return ret;
  
         slot = ret;
-       ret = cpuhp_state_add_instance(slot, &pmu->node);
+       ret = cpuhp_state_add_instance(slot, &pmu->cpuhp.node);
         if (ret) {
                 cpuhp_remove_multi_state(slot);
                 return ret;
         }
  
-       cpuhp_slot = slot;
+       pmu->cpuhp.slot = slot;
         return 0;
  }
  
  static void i915_pmu_unregister_cpuhp_state(struct i915_pmu *pmu)
  {
-       WARN_ON(cpuhp_slot == CPUHP_INVALID);
-       WARN_ON(cpuhp_state_remove_instance(cpuhp_slot, &pmu->node));
-       cpuhp_remove_multi_state(cpuhp_slot);
+       WARN_ON(pmu->cpuhp.slot == CPUHP_INVALID);
+       WARN_ON(cpuhp_state_remove_instance(pmu->cpuhp.slot, &pmu->cpuhp.node));
+       cpuhp_remove_multi_state(pmu->cpuhp.slot);
+       pmu->cpuhp.slot = CPUHP_INVALID;
  }
  
  static bool is_igp(struct drm_i915_private *i915)
@@ -1106,6 +1105,13 @@ static bool is_igp(struct drm_i915_private *i915)
  void i915_pmu_register(struct drm_i915_private *i915)
  {
         struct i915_pmu *pmu = &i915->pmu;
+       const struct attribute_group *attr_groups[] = {
+               &i915_pmu_format_attr_group,
+               &pmu->events_attr_group,
+               &i915_pmu_cpumask_attr_group,
+               NULL
+       };
+
         int ret = -ENOMEM;
  
         if (INTEL_GEN(i915) <= 2) {
@@ -1116,6 +1122,7 @@ void i915_pmu_register(struct drm_i915_private *i915)
         spin_lock_init(&pmu->lock);
         hrtimer_init(&pmu->timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
         pmu->timer.function = i915_sample;
+       pmu->cpuhp.slot = CPUHP_INVALID;
  
         if (!is_igp(i915)) {
                 pmu->name = kasprintf(GFP_KERNEL,
@@ -1131,11 +1138,16 @@ void i915_pmu_register(struct drm_i915_private *i915)
         if (!pmu->name)
                 goto err;
  
-       i915_pmu_events_attr_group.attrs = create_event_attributes(pmu);
-       if (!i915_pmu_events_attr_group.attrs)
+       pmu->events_attr_group.name = "events";
+       pmu->events_attr_group.attrs = create_event_attributes(pmu);
+       if (!pmu->events_attr_group.attrs)
                 goto err_name;
  
-       pmu->base.attr_groups   = i915_pmu_attr_groups;
+       pmu->base.attr_groups = kmemdup(attr_groups, sizeof(attr_groups),
+                                       GFP_KERNEL);
+       if (!pmu->base.attr_groups)
+               goto err_attr;
+
         pmu->base.task_ctx_nr   = perf_invalid_context;
         pmu->base.event_init    = i915_pmu_event_init;
         pmu->base.add           = i915_pmu_event_add;
@@ -1147,7 +1159,7 @@ void i915_pmu_register(struct drm_i915_private *i915)
  
         ret = perf_pmu_register(&pmu->base, pmu->name, -1);
         if (ret)
-               goto err_attr;
+               goto err_groups;
  
         ret = i915_pmu_register_cpuhp_state(pmu);
         if (ret)
@@ -1157,6 +1169,8 @@ void i915_pmu_register(struct drm_i915_private *i915)
  
  err_unreg:
         perf_pmu_unregister(&pmu->base);
+err_groups:
+       kfree(pmu->base.attr_groups);
  err_attr:
         pmu->base.event_init = NULL;
         free_event_attributes(pmu);
@@ -1182,6 +1196,7 @@ void i915_pmu_unregister(struct drm_i915_private *i915)
  
         perf_pmu_unregister(&pmu->base);
         pmu->base.event_init = NULL;
+       kfree(pmu->base.attr_groups);
         if (!is_igp(i915))
                 kfree(pmu->name);
         free_event_attributes(pmu);
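
Note on the i915_pmu.c hunks above: the events attribute group and the cpuhp slot move from file-scope statics into struct i915_pmu, and the attr_groups table is assembled on the stack and then copied with kmemdup(). The sketch below shows the same idea in userspace C: a table that refers to per-instance data has to be built per instance, and has to be copied to the heap because the stack array goes away on return. The names group, device_pmu, pmu_register and pmu_unregister are hypothetical, and malloc() plus memcpy() stand in for kmemdup().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct attribute_group. */
struct group {
	const char *name;
};

struct device_pmu {
	struct group events;		/* per-device, filled at registration */
	const struct group **groups;	/* heap copy of the NULL-terminated table */
};

/* Groups that are identical for every device can stay shared. */
static const struct group format_group = { "format" };
static const struct group cpumask_group = { "cpumask" };

/* Returns 0 on success, -1 if the table could not be allocated. */
static int pmu_register(struct device_pmu *pmu)
{
	/* Built on the stack because one entry depends on this instance. */
	const struct group *table[] = {
		&format_group,
		&pmu->events,
		&cpumask_group,
		NULL
	};

	assert(pmu);
	pmu->events.name = "events";

	/* The stack array dies on return, so keep a heap copy instead. */
	pmu->groups = malloc(sizeof(table));
	if (!pmu->groups)
		return -1;
	memcpy(pmu->groups, table, sizeof(table));
	return 0;
}

static void pmu_unregister(struct device_pmu *pmu)
{
	assert(pmu);
	free(pmu->groups);
	pmu->groups = NULL;
}
```

Two registered instances then share the format and cpumask entries but each point at their own events group, which a single file-scope table cannot express.
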
diff --git a/drivers/gpu/drm/i915/i915_pmu.h b/drivers/gpu/drm/i915/i915_pmu.h

index 6c1647c5daf255ff4d27fb538dd2ec358ed782e2..f1d6cad0d7d576caa3999bf0b62fdfe9b9e656d2 100644 (file)
--- a/drivers/gpu/drm/i915/i915_pmu.h
+++ b/drivers/gpu/drm/i915/i915_pmu.h
@@ -39,9 +39,12 @@ struct i915_pmu_sample {
  
  struct i915_pmu {
         /**
-        * @node: List node for CPU hotplug handling.
+        * @cpuhp: Struct used for CPU hotplug handling.
          */
-       struct hlist_node node;
+       struct {
+               struct hlist_node node;
+               enum cpuhp_state slot;
+       } cpuhp;
         /**
          * @base: PMU base.
          */
@@ -104,6 +107,10 @@ struct i915_pmu {
          * @sleep_last: Last time GT parked for RC6 estimation.
          */
         ktime_t sleep_last;
+       /**
+        * @events_attr_group: Device events attribute group.
+        */
+       struct attribute_group events_attr_group;
         /**
          * @i915_attr: Memory block holding device attributes.
          */
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c

index be185886e4fcc56dcc9b62f3ebe61e08ee543554..f56b046a32de19e255b69e9baaeff6fd56d7d29a 100644 (file)
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -221,6 +221,8 @@ static void remove_from_engine(struct i915_request *rq)
                 locked = engine;
         }
         list_del_init(&rq->sched.link);
+       clear_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
+       clear_bit(I915_FENCE_FLAG_HOLD, &rq->fence.flags);
         spin_unlock_irq(&locked->active.lock);
  }
  
@@ -408,8 +410,10 @@ bool __i915_request_submit(struct i915_request *request)
  xfer:  /* We may be recursing from the signal callback of another i915 fence */
         spin_lock_nested(&request->lock, SINGLE_DEPTH_NESTING);
  
-       if (!test_and_set_bit(I915_FENCE_FLAG_ACTIVE, &request->fence.flags))
+       if (!test_and_set_bit(I915_FENCE_FLAG_ACTIVE, &request->fence.flags)) {
                 list_move_tail(&request->sched.link, &engine->active.requests);
+               clear_bit(I915_FENCE_FLAG_PQUEUE, &request->fence.flags);
+       }
  
         if (test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &request->fence.flags) &&
             !test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &request->fence.flags) &&
@@ -591,6 +595,8 @@ static void __i915_request_ctor(void *arg)
         i915_sw_fence_init(&rq->submit, submit_notify);
         i915_sw_fence_init(&rq->semaphore, semaphore_notify);
  
+       dma_fence_init(&rq->fence, &i915_fence_ops, &rq->lock, 0, 0);
+
         rq->file_priv = NULL;
         rq->capture_list = NULL;
  
@@ -649,25 +655,30 @@ __i915_request_create(struct intel_context *ce, gfp_t gfp)
                 }
         }
  
-       ret = intel_timeline_get_seqno(tl, rq, &seqno);
-       if (ret)
-               goto err_free;
-
         rq->i915 = ce->engine->i915;
         rq->context = ce;
         rq->engine = ce->engine;
         rq->ring = ce->ring;
         rq->execution_mask = ce->engine->mask;
  
+       kref_init(&rq->fence.refcount);
+       rq->fence.flags = 0;
+       rq->fence.error = 0;
+       INIT_LIST_HEAD(&rq->fence.cb_list);
+
+       ret = intel_timeline_get_seqno(tl, rq, &seqno);
+       if (ret)
+               goto err_free;
+
+       rq->fence.context = tl->fence_context;
+       rq->fence.seqno = seqno;
+
         RCU_INIT_POINTER(rq->timeline, tl);
         RCU_INIT_POINTER(rq->hwsp_cacheline, tl->hwsp_cacheline);
         rq->hwsp_seqno = tl->hwsp_seqno;
  
         rq->rcustate = get_state_synchronize_rcu(); /* acts as smp_mb() */
  
-       dma_fence_init(&rq->fence, &i915_fence_ops, &rq->lock,
-                      tl->fence_context, seqno);
-
         /* We bump the ref for the fence chain */
         i915_sw_fence_reinit(&i915_request_get(rq)->submit);
         i915_sw_fence_reinit(&i915_request_get(rq)->semaphore);
diff --git a/drivers/gpu/drm/i915/i915_request.h b/drivers/gpu/drm/i915/i915_request.h

index 031433691a06f2b8b74ea286e1afd0a400b21d40..f57eadcf3583a4888c61bbcc0b419442658af392 100644 (file)
--- a/drivers/gpu/drm/i915/i915_request.h
+++ b/drivers/gpu/drm/i915/i915_request.h
@@ -70,6 +70,18 @@ enum {
          */
         I915_FENCE_FLAG_ACTIVE = DMA_FENCE_FLAG_USER_BITS,
  
+       /*
+        * I915_FENCE_FLAG_PQUEUE - this request is ready for execution
+        *
+        * Using the scheduler, when a request is ready for execution it is put
+        * into the priority queue, and removed from that queue when transferred
+        * to the HW runlists. We want to track its membership within the
+        * priority queue so that we can easily check before rescheduling.
+        *
+        * See i915_request_in_priority_queue()
+        */
+       I915_FENCE_FLAG_PQUEUE,
+
         /*
          * I915_FENCE_FLAG_SIGNAL - this request is currently on signal_list
          *
@@ -78,6 +90,13 @@ enum {
          */
         I915_FENCE_FLAG_SIGNAL,
  
+       /*
+        * I915_FENCE_FLAG_HOLD - this request is currently on hold
+        *
+        * This request has been suspended, pending an ongoing investigation.
+        */
+       I915_FENCE_FLAG_HOLD,
+
         /*
          * I915_FENCE_FLAG_NOPREEMPT - this request should not be preempted
          *
@@ -361,6 +380,11 @@ static inline bool i915_request_is_active(const struct i915_request *rq)
         return test_bit(I915_FENCE_FLAG_ACTIVE, &rq->fence.flags);
  }
  
+static inline bool i915_request_in_priority_queue(const struct i915_request *rq)
+{
+       return test_bit(I915_FENCE_FLAG_PQUEUE, &rq->fence.flags);
+}
+
  /**
   * Returns true if seq1 is later than seq2.
   */
@@ -454,6 +478,27 @@ static inline bool i915_request_is_running(const struct i915_request *rq)
         return __i915_request_has_started(rq);
  }
  
+/**
+ * i915_request_is_ready - check if the request is ready for execution
+ * @rq: the request
+ *
+ * Upon construction, the request is instructed to wait upon various
+ * signals before it is ready to be executed by the HW. That is, we do
+ * not want to start execution and read data before it is written. In practice,
+ * this is controlled with a mixture of interrupts and semaphores. Once
+ * the submit fence is completed, the backend scheduler will place the
+ * request into its queue and from there submit it for execution. So we
+ * can detect when a request is eligible for execution (and is under control
+ * of the scheduler) by querying where it is in any of the scheduler's lists.
+ *
+ * Returns true if the request is ready for execution (it may be inflight),
+ * false otherwise.
+ */
+static inline bool i915_request_is_ready(const struct i915_request *rq)
+{
+       return !list_empty(&rq->sched.link);
+}
+
  static inline bool i915_request_completed(const struct i915_request *rq)
  {
         if (i915_request_signaled(rq))
@@ -483,6 +528,21 @@ static inline bool i915_request_has_sentinel(const struct i915_request *rq)
         return unlikely(test_bit(I915_FENCE_FLAG_SENTINEL, &rq->fence.flags));
  }
  
+static inline bool i915_request_on_hold(const struct i915_request *rq)
+{
+       return unlikely(test_bit(I915_FENCE_FLAG_HOLD, &rq->fence.flags));
+}
+
+static inline void i915_request_set_hold(struct i915_request *rq)
+{
+       set_bit(I915_FENCE_FLAG_HOLD, &rq->fence.flags);
+}
+
+static inline void i915_request_clear_hold(struct i915_request *rq)
+{
+       clear_bit(I915_FENCE_FLAG_HOLD, &rq->fence.flags);
+}
+
  static inline struct intel_timeline *
  i915_request_timeline(struct i915_request *rq)
  {
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c

index bf87c70bfdd9630b50af237503e5389c478e2715..34b654b4e58af2b7fb3eb22b4bcecd65c74c9e2f 100644 (file)
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -326,20 +326,18 @@ static void __i915_schedule(struct i915_sched_node *node,
  
                 node->attr.priority = prio;
  
-               if (list_empty(&node->link)) {
-                       /*
-                        * If the request is not in the priolist queue because
-                        * it is not yet runnable, then it doesn't contribute
-                        * to our preemption decisions. On the other hand,
-                        * if the request is on the HW, it too is not in the
-                        * queue; but in that case we may still need to reorder
-                        * the inflight requests.
-                        */
+               /*
+                * Once the request is ready, it will be placed into the
+                * priority lists and then onto the HW runlist. Before the
+                * request is ready, it does not contribute to our preemption
+                * decisions and we can safely ignore it: both it and any
+                * preemption it requires will be dealt with upon submission.
+                * See engine->submit_request()
+                */
+               if (list_empty(&node->link))
                         continue;
-               }
  
-               if (!intel_engine_is_virtual(engine) &&
-                   !i915_request_is_active(node_to_request(node))) {
+               if (i915_request_in_priority_queue(node_to_request(node))) {
                         if (!cache.priolist)
                                 cache.priolist =
                                         i915_sched_lookup_priolist(engine,
@@ -425,8 +423,6 @@ bool __i915_sched_node_add_dependency(struct i915_sched_node *node,
  
         if (!node_signaled(signal)) {
                 INIT_LIST_HEAD(&dep->dfs_link);
-               list_add(&dep->wait_link, &signal->waiters_list);
-               list_add(&dep->signal_link, &node->signalers_list);
                 dep->signaler = signal;
                 dep->waiter = node;
                 dep->flags = flags;
@@ -436,6 +432,10 @@ bool __i915_sched_node_add_dependency(struct i915_sched_node *node,
                     !node_started(signal))
                         node->flags |= I915_SCHED_HAS_SEMAPHORE_CHAIN;
  
+               /* All set, now publish. Beware the lockless walkers. */
+               list_add(&dep->signal_link, &node->signalers_list);
+               list_add_rcu(&dep->wait_link, &signal->waiters_list);
+
                 /*
                  * As we do not allow WAIT to preempt inflight requests,
                  * once we have executed a request, along with triggering
diff --git a/drivers/gpu/drm/i915/i915_utils.c b/drivers/gpu/drm/i915/i915_utils.c

index c47261ae86eab09b1341059a3ec9cb36db6767a8..632d6953c78da92bfe4edbcf58407984a0c39c47 100644 (file)
--- a/drivers/gpu/drm/i915/i915_utils.c
+++ b/drivers/gpu/drm/i915/i915_utils.c
@@ -8,9 +8,8 @@
  #include "i915_drv.h"
  #include "i915_utils.h"
  
-#define FDO_BUG_URL "https://bugs.freedesktop.org/enter_bug.cgi?product=DRI"
-#define FDO_BUG_MSG "Please file a bug at " FDO_BUG_URL " against DRM/Intel " \
-                   "providing the dmesg log by booting with drm.debug=0xf"
+#define FDO_BUG_URL "https://gitlab.freedesktop.org/drm/intel/-/wikis/How-to-file-i915-bugs"
+#define FDO_BUG_MSG "Please file a bug on drm/i915; see " FDO_BUG_URL " for details."
  
  void
  __i915_printk(struct drm_i915_private *dev_priv, const char *level,
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c

index 17d7c525ea5cdd104ec7364a7347f76abdbdd700..4ff380770b32936dea3c8eaf9c3e7f31cb342d35 100644 (file)
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -1202,16 +1202,26 @@ int __i915_vma_unbind(struct i915_vma *vma)
         if (ret)
                 return ret;
  
-       GEM_BUG_ON(i915_vma_is_active(vma));
         if (i915_vma_is_pinned(vma)) {
                 vma_print_allocator(vma, "is pinned");
                 return -EAGAIN;
         }
  
-       GEM_BUG_ON(i915_vma_is_active(vma));
+       /*
+        * After confirming that no one else is pinning this vma, wait for
+        * any laggards who may have crept in during the wait (through
+        * a residual pin skipping the vm->mutex) to complete.
+        */
+       ret = i915_vma_sync(vma);
+       if (ret)
+               return ret;
+
         if (!drm_mm_node_allocated(&vma->node))
                 return 0;
  
+       GEM_BUG_ON(i915_vma_is_pinned(vma));
+       GEM_BUG_ON(i915_vma_is_active(vma));
+
         if (i915_vma_is_map_and_fenceable(vma)) {
                 /*
                  * Check that we have flushed all writes through the GGTT
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 983afeaee737ea27593f988ae5eb00a68e0a1c76..748cd379065f1103240ecf4dd4af72d635f63318 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -796,12 +796,41 @@ bool a6xx_gmu_isidle(struct a6xx_gmu *gmu)
         return true;
  }
  
+#define GBIF_CLIENT_HALT_MASK             BIT(0)
+#define GBIF_ARB_HALT_MASK                BIT(1)
+
+static void a6xx_bus_clear_pending_transactions(struct adreno_gpu *adreno_gpu)
+{
+       struct msm_gpu *gpu = &adreno_gpu->base;
+
+       if (!a6xx_has_gbif(adreno_gpu)) {
+               gpu_write(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL0, 0xf);
+               spin_until((gpu_read(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL1) &
+                                                               0xf) == 0xf);
+               gpu_write(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL0, 0);
+
+               return;
+       }
+
+       /* Halt new client requests on GBIF */
+       gpu_write(gpu, REG_A6XX_GBIF_HALT, GBIF_CLIENT_HALT_MASK);
+       spin_until((gpu_read(gpu, REG_A6XX_GBIF_HALT_ACK) &
+                       (GBIF_CLIENT_HALT_MASK)) == GBIF_CLIENT_HALT_MASK);
+
+       /* Halt all AXI requests on GBIF */
+       gpu_write(gpu, REG_A6XX_GBIF_HALT, GBIF_ARB_HALT_MASK);
+       spin_until((gpu_read(gpu,  REG_A6XX_GBIF_HALT_ACK) &
+                       (GBIF_ARB_HALT_MASK)) == GBIF_ARB_HALT_MASK);
+
+       /* The GBIF halt needs to be explicitly cleared */
+       gpu_write(gpu, REG_A6XX_GBIF_HALT, 0x0);
+}
+
  /* Gracefully try to shut down the GMU and by extension the GPU */
  static void a6xx_gmu_shutdown(struct a6xx_gmu *gmu)
  {
         struct a6xx_gpu *a6xx_gpu = container_of(gmu, struct a6xx_gpu, gmu);
         struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
-       struct msm_gpu *gpu = &adreno_gpu->base;
         u32 val;
  
         /*
@@ -819,11 +848,7 @@ static void a6xx_gmu_shutdown(struct a6xx_gmu *gmu)
                         return;
                 }
  
-               /* Clear the VBIF pipe before shutting down */
-               gpu_write(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL0, 0xf);
-               spin_until((gpu_read(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL1) & 0xf)
-                       == 0xf);
-               gpu_write(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL0, 0);
+               a6xx_bus_clear_pending_transactions(adreno_gpu);
  
                 /* tell the GMU we want to slumber */
                 a6xx_gmu_notify_slumber(gmu);
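
Note on the a6xx_gmu.c hunks above: the GBIF path of a6xx_bus_clear_pending_transactions() is a three-step handshake (halt client requests and wait for the ack, halt the arbiter and wait for the ack, then explicitly clear the halt). The userspace sketch below only illustrates that ordering. The fake_* names, the two-entry register file, the "ack mirrors the request" model and the bounded poll count are all assumptions made for illustration; none of them is driver code, and the mask values are copied from the hunk.

```c
#include <assert.h>
#include <stdint.h>

#define GBIF_CLIENT_HALT_MASK (1u << 0)   /* BIT(0) in the hunk above */
#define GBIF_ARB_HALT_MASK    (1u << 1)   /* BIT(1) in the hunk above */

/* Hypothetical stand-ins for the GBIF halt registers; not real MMIO. */
enum { FAKE_GBIF_HALT, FAKE_GBIF_HALT_ACK, FAKE_REG_COUNT };

struct fake_gpu {
	uint32_t regs[FAKE_REG_COUNT];
	unsigned int halt_writes;         /* number of writes to the halt register */
};

static void fake_write(struct fake_gpu *gpu, int reg, uint32_t val)
{
	gpu->regs[reg] = val;
	if (reg == FAKE_GBIF_HALT) {
		/* Assumed model: the "hardware" acknowledges immediately. */
		gpu->regs[FAKE_GBIF_HALT_ACK] = val;
		gpu->halt_writes++;
	}
}

static uint32_t fake_read(const struct fake_gpu *gpu, int reg)
{
	return gpu->regs[reg];
}

/* Poll the ack register; 0 on success, -1 if the ack never shows up. */
static int fake_wait_ack(const struct fake_gpu *gpu, uint32_t mask,
			 unsigned int max_polls)
{
	while (max_polls--) {
		if ((fake_read(gpu, FAKE_GBIF_HALT_ACK) & mask) == mask)
			return 0;
	}
	return -1;
}

/* Same ordering as the GBIF branch in the hunk: client, arbiter, clear. */
static int fake_gbif_clear_pending(struct fake_gpu *gpu)
{
	assert(gpu);

	fake_write(gpu, FAKE_GBIF_HALT, GBIF_CLIENT_HALT_MASK);
	if (fake_wait_ack(gpu, GBIF_CLIENT_HALT_MASK, 1000))
		return -1;

	fake_write(gpu, FAKE_GBIF_HALT, GBIF_ARB_HALT_MASK);
	if (fake_wait_ack(gpu, GBIF_ARB_HALT_MASK, 1000))
		return -1;

	/* The halt has to be cleared explicitly once both acks are seen. */
	fake_write(gpu, FAKE_GBIF_HALT, 0);
	return 0;
}
```

Calling fake_gbif_clear_pending() on a zero-initialised struct fake_gpu performs three writes to the halt register and leaves it at 0, which mirrors the final gpu_write(gpu, REG_A6XX_GBIF_HALT, 0x0) in the patch.
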
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index daf07800cde02a1ecc4797508dc1a631f38892da..68af24150de57c626f31231a06ab364b844da1b9 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -378,18 +378,6 @@ static int a6xx_hw_init(struct msm_gpu *gpu)
         struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
         int ret;
  
-       /*
-        * During a previous slumber, GBIF halt is asserted to ensure
-        * no further transaction can go through GPU before GPU
-        * headswitch is turned off.
-        *
-        * This halt is deasserted once headswitch goes off but
-        * incase headswitch doesn't goes off clear GBIF halt
-        * here to ensure GPU wake-up doesn't fail because of
-        * halted GPU transactions.
-        */
-       gpu_write(gpu, REG_A6XX_GBIF_HALT, 0x0);
-
         /* Make sure the GMU keeps the GPU on while we set it up */
         a6xx_gmu_set_oob(&a6xx_gpu->gmu, GMU_OOB_GPU_SET);
  
@@ -470,10 +458,12 @@ static int a6xx_hw_init(struct msm_gpu *gpu)
         /* Select CP0 to always count cycles */
         gpu_write(gpu, REG_A6XX_CP_PERFCTR_CP_SEL_0, PERF_CP_ALWAYS_COUNT);
  
-       gpu_write(gpu, REG_A6XX_RB_NC_MODE_CNTL, 2 << 1);
-       gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, 2 << 1);
-       gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL, 2 << 1);
-       gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, 2 << 21);
+       if (adreno_is_a630(adreno_gpu)) {
+               gpu_write(gpu, REG_A6XX_RB_NC_MODE_CNTL, 2 << 1);
+               gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, 2 << 1);
+               gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL, 2 << 1);
+               gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, 2 << 21);
+       }
  
         /* Enable fault detection */
         gpu_write(gpu, REG_A6XX_RBBM_INTERFACE_HANG_INT_CNTL,
@@ -748,39 +738,6 @@ static const u32 a6xx_register_offsets[REG_ADRENO_REGISTER_MAX] = {
         REG_ADRENO_DEFINE(REG_ADRENO_CP_RB_CNTL, REG_A6XX_CP_RB_CNTL),
  };
  
-#define GBIF_CLIENT_HALT_MASK             BIT(0)
-#define GBIF_ARB_HALT_MASK                BIT(1)
-
-static void a6xx_bus_clear_pending_transactions(struct adreno_gpu *adreno_gpu)
-{
-       struct msm_gpu *gpu = &adreno_gpu->base;
-
-       if(!a6xx_has_gbif(adreno_gpu)){
-               gpu_write(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL0, 0xf);
-               spin_until((gpu_read(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL1) &
-                                                               0xf) == 0xf);
-               gpu_write(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL0, 0);
-
-               return;
-       }
-
-       /* Halt new client requests on GBIF */
-       gpu_write(gpu, REG_A6XX_GBIF_HALT, GBIF_CLIENT_HALT_MASK);
-       spin_until((gpu_read(gpu, REG_A6XX_GBIF_HALT_ACK) &
-                       (GBIF_CLIENT_HALT_MASK)) == GBIF_CLIENT_HALT_MASK);
-
-       /* Halt all AXI requests on GBIF */
-       gpu_write(gpu, REG_A6XX_GBIF_HALT, GBIF_ARB_HALT_MASK);
-       spin_until((gpu_read(gpu,  REG_A6XX_GBIF_HALT_ACK) &
-                       (GBIF_ARB_HALT_MASK)) == GBIF_ARB_HALT_MASK);
-
-       /*
-        * GMU needs DDR access in slumber path. Deassert GBIF halt now
-        * to allow for GMU to access system memory.
-        */
-       gpu_write(gpu, REG_A6XX_GBIF_HALT, 0x0);
-}
-
  static int a6xx_pm_resume(struct msm_gpu *gpu)
  {
         struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
@@ -805,16 +762,6 @@ static int a6xx_pm_suspend(struct msm_gpu *gpu)
  
         devfreq_suspend_device(gpu->devfreq.devfreq);
  
-       /*
-        * Make sure the GMU is idle before continuing (because some transitions
-        * may use VBIF
-        */
-       a6xx_gmu_wait_for_idle(&a6xx_gpu->gmu);
-
-       /* Clear the VBIF pipe before shutting down */
-       /* FIXME: This accesses the GPU - do we need to make sure it is on? */
-       a6xx_bus_clear_pending_transactions(adreno_gpu);
-
         return a6xx_gmu_stop(a6xx_gpu);
  }
  
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_hfi.c b/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
index eda11abc5f011f1a8ef8bab3feb19274d6c0277f..e450e0b97211533160878013cb72ac0a358d927b 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
@@ -7,6 +7,7 @@
  
  #include "a6xx_gmu.h"
  #include "a6xx_gmu.xml.h"
+#include "a6xx_gpu.h"
  
  #define HFI_MSG_ID(val) [val] = #val
  
@@ -216,48 +217,82 @@ static int a6xx_hfi_send_perf_table(struct a6xx_gmu *gmu)
                 NULL, 0);
  }
  
-static int a6xx_hfi_send_bw_table(struct a6xx_gmu *gmu)
+static void a618_build_bw_table(struct a6xx_hfi_msg_bw_table *msg)
  {
-       struct a6xx_hfi_msg_bw_table msg = { 0 };
+       /* Send a single "off" entry since the 618 GMU doesn't do bus scaling */
+       msg->bw_level_num = 1;
+
+       msg->ddr_cmds_num = 3;
+       msg->ddr_wait_bitmask = 0x01;
+
+       msg->ddr_cmds_addrs[0] = 0x50000;
+       msg->ddr_cmds_addrs[1] = 0x5003c;
+       msg->ddr_cmds_addrs[2] = 0x5000c;
+
+       msg->ddr_cmds_data[0][0] =  0x40000000;
+       msg->ddr_cmds_data[0][1] =  0x40000000;
+       msg->ddr_cmds_data[0][2] =  0x40000000;
  
         /*
-        * The sdm845 GMU doesn't do bus frequency scaling on its own but it
-        * does need at least one entry in the list because it might be accessed
-        * when the GMU is shutting down. Send a single "off" entry.
+        * These are the CX (CNOC) votes - these are used by the GMU but the
+        * votes are known and fixed for the target
          */
+       msg->cnoc_cmds_num = 1;
+       msg->cnoc_wait_bitmask = 0x01;
+
+       msg->cnoc_cmds_addrs[0] = 0x5007c;
+       msg->cnoc_cmds_data[0][0] =  0x40000000;
+       msg->cnoc_cmds_data[1][0] =  0x60000001;
+}
  
-       msg.bw_level_num = 1;
+static void a6xx_build_bw_table(struct a6xx_hfi_msg_bw_table *msg)
+{
+       /* Send a single "off" entry since the 630 GMU doesn't do bus scaling */
+       msg->bw_level_num = 1;
  
-       msg.ddr_cmds_num = 3;
-       msg.ddr_wait_bitmask = 0x07;
+       msg->ddr_cmds_num = 3;
+       msg->ddr_wait_bitmask = 0x07;
  
-       msg.ddr_cmds_addrs[0] = 0x50000;
-       msg.ddr_cmds_addrs[1] = 0x5005c;
-       msg.ddr_cmds_addrs[2] = 0x5000c;
+       msg->ddr_cmds_addrs[0] = 0x50000;
+       msg->ddr_cmds_addrs[1] = 0x5005c;
+       msg->ddr_cmds_addrs[2] = 0x5000c;
  
-       msg.ddr_cmds_data[0][0] =  0x40000000;
-       msg.ddr_cmds_data[0][1] =  0x40000000;
-       msg.ddr_cmds_data[0][2] =  0x40000000;
+       msg->ddr_cmds_data[0][0] =  0x40000000;
+       msg->ddr_cmds_data[0][1] =  0x40000000;
+       msg->ddr_cmds_data[0][2] =  0x40000000;
  
         /*
          * These are the CX (CNOC) votes.  This is used but the values for the
          * sdm845 GMU are known and fixed so we can hard code them.
          */
  
-       msg.cnoc_cmds_num = 3;
-       msg.cnoc_wait_bitmask = 0x05;
+       msg->cnoc_cmds_num = 3;
+       msg->cnoc_wait_bitmask = 0x05;
  
-       msg.cnoc_cmds_addrs[0] = 0x50034;
-       msg.cnoc_cmds_addrs[1] = 0x5007c;
-       msg.cnoc_cmds_addrs[2] = 0x5004c;
+       msg->cnoc_cmds_addrs[0] = 0x50034;
+       msg->cnoc_cmds_addrs[1] = 0x5007c;
+       msg->cnoc_cmds_addrs[2] = 0x5004c;
  
-       msg.cnoc_cmds_data[0][0] =  0x40000000;
-       msg.cnoc_cmds_data[0][1] =  0x00000000;
-       msg.cnoc_cmds_data[0][2] =  0x40000000;
+       msg->cnoc_cmds_data[0][0] =  0x40000000;
+       msg->cnoc_cmds_data[0][1] =  0x00000000;
+       msg->cnoc_cmds_data[0][2] =  0x40000000;
+
+       msg->cnoc_cmds_data[1][0] =  0x60000001;
+       msg->cnoc_cmds_data[1][1] =  0x20000001;
+       msg->cnoc_cmds_data[1][2] =  0x60000001;
+}
+
+
+static int a6xx_hfi_send_bw_table(struct a6xx_gmu *gmu)
+{
+       struct a6xx_hfi_msg_bw_table msg = { 0 };
+       struct a6xx_gpu *a6xx_gpu = container_of(gmu, struct a6xx_gpu, gmu);
+       struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
  
-       msg.cnoc_cmds_data[1][0] =  0x60000001;
-       msg.cnoc_cmds_data[1][1] =  0x20000001;
-       msg.cnoc_cmds_data[1][2] =  0x60000001;
+       if (adreno_is_a618(adreno_gpu))
+               a618_build_bw_table(&msg);
+       else
+               a6xx_build_bw_table(&msg);
  
         return a6xx_hfi_send_msg(gmu, HFI_H2F_MSG_BW_TABLE, &msg, sizeof(msg),
                 NULL, 0);
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c
index 528632690f1ef4f6d0ab476f10d3c26e68540d1c..a05282dede91b6530d4a5d01a1df88dd90ee84a1 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c
@@ -255,13 +255,13 @@ static const struct dpu_format dpu_format_map[] = {
  
         INTERLEAVED_RGB_FMT(RGB565,
                 0, COLOR_5BIT, COLOR_6BIT, COLOR_5BIT,
-               C2_R_Cr, C0_G_Y, C1_B_Cb, 0, 3,
+               C1_B_Cb, C0_G_Y, C2_R_Cr, 0, 3,
                 false, 2, 0,
                 DPU_FETCH_LINEAR, 1),
  
         INTERLEAVED_RGB_FMT(BGR565,
                 0, COLOR_5BIT, COLOR_6BIT, COLOR_5BIT,
-               C1_B_Cb, C0_G_Y, C2_R_Cr, 0, 3,
+               C2_R_Cr, C0_G_Y, C1_B_Cb, 0, 3,
                 false, 2, 0,
                 DPU_FETCH_LINEAR, 1),
  
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c
index 29705e773a4b7035d6f9981a6a846deaf60231a8..80d3cfc140070d0cd3006981fb473a3cc453710e 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_mdss.c
@@ -12,6 +12,7 @@
  
  #define to_dpu_mdss(x) container_of(x, struct dpu_mdss, base)
  
+#define HW_REV                         0x0
  #define HW_INTR_STATUS                 0x0010
  
  /* Max BW defined in KBps */
@@ -22,6 +23,17 @@ struct dpu_irq_controller {
         struct irq_domain *domain;
  };
  
+struct dpu_hw_cfg {
+       u32 val;
+       u32 offset;
+};
+
+struct dpu_mdss_hw_init_handler {
+       u32 hw_rev;
+       u32 hw_reg_count;
+       struct dpu_hw_cfg* hw_cfg;
+};
+
  struct dpu_mdss {
         struct msm_mdss base;
         void __iomem *mmio;
@@ -32,6 +44,44 @@ struct dpu_mdss {
         u32 num_paths;
  };
  
+static struct dpu_hw_cfg hw_cfg[] = {
+    {
+       /* UBWC global settings */
+       .val = 0x1E,
+       .offset = 0x144,
+    }
+};
+
+static struct dpu_mdss_hw_init_handler cfg_handler[] = {
+    { .hw_rev = DPU_HW_VER_620,
+      .hw_reg_count = ARRAY_SIZE(hw_cfg),
+      .hw_cfg = hw_cfg
+    },
+};
+
+static void dpu_mdss_hw_init(struct dpu_mdss *dpu_mdss, u32 hw_rev)
+{
+       int i;
+       u32 count = 0;
+       struct dpu_hw_cfg *hw_cfg = NULL;
+
+       for (i = 0; i < ARRAY_SIZE(cfg_handler); i++) {
+               if (cfg_handler[i].hw_rev == hw_rev) {
+                       hw_cfg = cfg_handler[i].hw_cfg;
+                       count = cfg_handler[i].hw_reg_count;
+                       break;
+           }
+       }
+
+       for (i = 0; i < count; i++ ) {
+               writel_relaxed(hw_cfg->val,
+                       dpu_mdss->mmio + hw_cfg->offset);
+               hw_cfg++;
+       }
+
+    return;
+}
+
  static int dpu_mdss_parse_data_bus_icc_path(struct drm_device *dev,
                                                 struct dpu_mdss *dpu_mdss)
  {
@@ -174,12 +224,18 @@ static int dpu_mdss_enable(struct msm_mdss *mdss)
         struct dpu_mdss *dpu_mdss = to_dpu_mdss(mdss);
         struct dss_module_power *mp = &dpu_mdss->mp;
         int ret;
+       u32 mdss_rev;
  
         dpu_mdss_icc_request_bw(mdss);
  
         ret = msm_dss_enable_clk(mp->clk_config, mp->num_clk, true);
-       if (ret)
+       if (ret) {
                 DPU_ERROR("clock enable failed, ret:%d\n", ret);
+               return ret;
+       }
+
+       mdss_rev = readl_relaxed(dpu_mdss->mmio + HW_REV);
+       dpu_mdss_hw_init(dpu_mdss, mdss_rev);
  
         return ret;
  }
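
Note on the dpu_mdss.c hunks above: dpu_mdss_hw_init() looks up a per-revision table of (value, offset) pairs and writes each pair to the MDSS register space after the clocks are enabled. The userspace sketch below shows the same table-driven lookup against a plain array instead of MMIO. FAKE_HW_REV_A, the fake_* names and the array-backed register space are assumptions for illustration only; the 0x1E / 0x144 pair is the one from the hunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a)   (sizeof(a) / sizeof((a)[0]))
#define FAKE_HW_REV_A   0x1234u   /* hypothetical revision id, not DPU_HW_VER_620 */
#define FAKE_MMIO_WORDS 128       /* size of the simulated register space */

struct fake_hw_cfg {
	uint32_t val;
	uint32_t offset;              /* byte offset into the register space */
};

struct fake_hw_init_handler {
	uint32_t hw_rev;
	size_t count;
	const struct fake_hw_cfg *cfg;
};

static const struct fake_hw_cfg fake_rev_a_cfg[] = {
	{ .val = 0x1E, .offset = 0x144 },   /* UBWC global settings, per the hunk */
};

static const struct fake_hw_init_handler fake_handlers[] = {
	{ FAKE_HW_REV_A, ARRAY_SIZE(fake_rev_a_cfg), fake_rev_a_cfg },
};

/*
 * Apply the register list that matches hw_rev, if any.
 * Returns the number of registers written (0 for an unknown revision).
 */
static size_t fake_hw_init(uint32_t *mmio, uint32_t hw_rev)
{
	size_t i, j;

	assert(mmio);

	for (i = 0; i < ARRAY_SIZE(fake_handlers); i++) {
		const struct fake_hw_init_handler *h = &fake_handlers[i];

		if (h->hw_rev != hw_rev)
			continue;

		for (j = 0; j < h->count; j++)
			mmio[h->cfg[j].offset / sizeof(uint32_t)] = h->cfg[j].val;
		return h->count;
	}

	return 0;
}
```

As in the patch, an unrecognised revision falls through without writing anything, so adding support for another revision is a matter of adding a table entry rather than new control flow.
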
diff --git a/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c b/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c
index 05cc04f729d638963325b03a288ad502cccae822..e1cc541e0ef2e37a44d20378c1e3b37a493578d6 100644
--- a/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c
+++ b/drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c
@@ -1109,8 +1109,8 @@ static void mdp5_crtc_wait_for_pp_done(struct drm_crtc *crtc)
         ret = wait_for_completion_timeout(&mdp5_crtc->pp_completion,
                                                 msecs_to_jiffies(50));
         if (ret == 0)
-               dev_warn(dev->dev, "pp done time out, lm=%d\n",
-                        mdp5_cstate->pipeline.mixer->lm);
+               dev_warn_ratelimited(dev->dev, "pp done time out, lm=%d\n",
+                                    mdp5_cstate->pipeline.mixer->lm);
  }
  
  static void mdp5_crtc_wait_for_flush_done(struct drm_crtc *crtc)
diff --git a/drivers/gpu/drm/msm/dsi/dsi_manager.c b/drivers/gpu/drm/msm/dsi/dsi_manager.c
index 104115d112eba6aebc87cb464c9b89a3c513b945..4864b9558f65ab6674a2df30444262afcca09405 100644
--- a/drivers/gpu/drm/msm/dsi/dsi_manager.c
+++ b/drivers/gpu/drm/msm/dsi/dsi_manager.c
@@ -336,7 +336,7 @@ static int dsi_mgr_connector_get_modes(struct drm_connector *connector)
         return num;
  }
  
-static int dsi_mgr_connector_mode_valid(struct drm_connector *connector,
+static enum drm_mode_status dsi_mgr_connector_mode_valid(struct drm_connector *connector,
                                 struct drm_display_mode *mode)
  {
         int id = dsi_mgr_connector_get_id(connector);
@@ -506,6 +506,7 @@ static void dsi_mgr_bridge_post_disable(struct drm_bridge *bridge)
         struct msm_dsi *msm_dsi1 = dsi_mgr_get_dsi(DSI_1);
         struct mipi_dsi_host *host = msm_dsi->host;
         struct drm_panel *panel = msm_dsi->panel;
+       struct msm_dsi_pll *src_pll;
         bool is_dual_dsi = IS_DUAL_DSI();
         int ret;
  
@@ -539,6 +540,10 @@ static void dsi_mgr_bridge_post_disable(struct drm_bridge *bridge)
                                                                 id, ret);
         }
  
+       /* Save PLL status if it is a clock source */
+       src_pll = msm_dsi_phy_get_pll(msm_dsi->phy);
+       msm_dsi_pll_save_state(src_pll);
+
         ret = msm_dsi_host_power_off(host);
         if (ret)
                 pr_err("%s: host %d power off failed,%d\n", __func__, id, ret);
diff --git a/drivers/gpu/drm/msm/dsi/phy/dsi_phy.c b/drivers/gpu/drm/msm/dsi/phy/dsi_phy.c
index b0cfa67d2a57806bc1e9cd7cb4846ebcfe8c08bd..f509ebd77500f43bbcd846de9c1ff44e5ec2eb2e 100644
--- a/drivers/gpu/drm/msm/dsi/phy/dsi_phy.c
+++ b/drivers/gpu/drm/msm/dsi/phy/dsi_phy.c
@@ -724,10 +724,6 @@ void msm_dsi_phy_disable(struct msm_dsi_phy *phy)
         if (!phy || !phy->cfg->ops.disable)
                 return;
  
-       /* Save PLL status if it is a clock source */
-       if (phy->usecase != MSM_DSI_PHY_SLAVE)
-               msm_dsi_pll_save_state(phy->pll);
-
         phy->cfg->ops.disable(phy);
  
         dsi_phy_regulator_disable(phy);
diff --git a/drivers/gpu/drm/msm/dsi/pll/dsi_pll_10nm.c b/drivers/gpu/drm/msm/dsi/pll/dsi_pll_10nm.c
index 1c894548dd725c8365bc4eae9b98446ea36739e1..6ac04fc303f5699ba29b356e714c35ea96b6f12e 100644
--- a/drivers/gpu/drm/msm/dsi/pll/dsi_pll_10nm.c
+++ b/drivers/gpu/drm/msm/dsi/pll/dsi_pll_10nm.c
@@ -411,6 +411,12 @@ static int dsi_pll_10nm_vco_prepare(struct clk_hw *hw)
         if (pll_10nm->slave)
                 dsi_pll_enable_pll_bias(pll_10nm->slave);
  
+       rc = dsi_pll_10nm_vco_set_rate(hw,pll_10nm->vco_current_rate, 0);
+       if (rc) {
+               pr_err("vco_set_rate failed, rc=%d\n", rc);
+               return rc;
+       }
+
         /* Start PLL */
         pll_write(pll_10nm->phy_cmn_mmio + REG_DSI_10nm_PHY_CMN_PLL_CNTRL,
                   0x01);
diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index c26219c7a49fd18b29648a929a92453b4c894788..e4b750b0c2d3fcdad40761fe6cae1597b2844164 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -441,6 +441,14 @@ static int msm_drm_init(struct device *dev, struct drm_driver *drv)
         if (ret)
                 goto err_msm_uninit;
  
+       if (!dev->dma_parms) {
+               dev->dma_parms = devm_kzalloc(dev, sizeof(*dev->dma_parms),
+                                             GFP_KERNEL);
+               if (!dev->dma_parms)
+                       return -ENOMEM;
+       }
+       dma_set_max_seg_size(dev, DMA_BIT_MASK(32));
+
         msm_gem_shrinker_init(ddev);
  
         switch (get_mdp_ver(pdev)) {
diff --git a/drivers/gpu/drm/nouveau/dispnv50/wndw.c b/drivers/gpu/drm/nouveau/dispnv50/wndw.c
index 890315291b01eff70a13ca6a3232517f970f4e65..bb737f9281e692f2adf4a3c8e93d7081b022b7ea 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/wndw.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/wndw.c
@@ -458,6 +458,8 @@ nv50_wndw_atomic_check(struct drm_plane *plane, struct drm_plane_state *state)
                 asyw->clr.ntfy = armw->ntfy.handle != 0;
                 asyw->clr.sema = armw->sema.handle != 0;
                 asyw->clr.xlut = armw->xlut.handle != 0;
+               if (asyw->clr.xlut && asyw->visible)
+                       asyw->set.xlut = asyw->xlut.handle != 0;
                 asyw->clr.csc  = armw->csc.valid;
                 if (wndw->func->image_clr)
                         asyw->clr.image = armw->image.handle[0] != 0;
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
index c7d700916eae7356fa8f9e7e501c9a1efe255967..8ebbe16560083dc573b41dc0336d217264d0909d 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
@@ -2579,6 +2579,7 @@ nv166_chipset = {
  static const struct nvkm_device_chip
  nv167_chipset = {
         .name = "TU117",
+       .acr = tu102_acr_new,
         .bar = tu102_bar_new,
         .bios = nvkm_bios_new,
         .bus = gf100_bus_new,
@@ -2607,6 +2608,7 @@ nv167_chipset = {
         .disp = tu102_disp_new,
         .dma = gv100_dma_new,
         .fifo = tu102_fifo_new,
+       .gr = tu102_gr_new,
         .nvdec[0] = gm107_nvdec_new,
         .nvenc[0] = gm107_nvenc_new,
         .sec2 = tu102_sec2_new,
@@ -2615,6 +2617,7 @@ nv167_chipset = {
  static const struct nvkm_device_chip
  nv168_chipset = {
         .name = "TU116",
+       .acr = tu102_acr_new,
         .bar = tu102_bar_new,
         .bios = nvkm_bios_new,
         .bus = gf100_bus_new,
@@ -2643,6 +2646,7 @@ nv168_chipset = {
         .disp = tu102_disp_new,
         .dma = gv100_dma_new,
         .fifo = tu102_fifo_new,
+       .gr = tu102_gr_new,
         .nvdec[0] = gm107_nvdec_new,
         .nvenc[0] = gm107_nvenc_new,
         .sec2 = tu102_sec2_new,
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/gr/tu102.c b/drivers/gpu/drm/nouveau/nvkm/engine/gr/tu102.c
index 454668b1cf54d5975f59a6694c5fd474845fb0f5..a9efa4d78be92a808f41e9a9b9329ec0d99cb5f6 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/gr/tu102.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/gr/tu102.c
@@ -164,6 +164,32 @@ MODULE_FIRMWARE("nvidia/tu106/gr/sw_nonctx.bin");
  MODULE_FIRMWARE("nvidia/tu106/gr/sw_bundle_init.bin");
  MODULE_FIRMWARE("nvidia/tu106/gr/sw_method_init.bin");
  
+MODULE_FIRMWARE("nvidia/tu117/gr/fecs_bl.bin");
+MODULE_FIRMWARE("nvidia/tu117/gr/fecs_inst.bin");
+MODULE_FIRMWARE("nvidia/tu117/gr/fecs_data.bin");
+MODULE_FIRMWARE("nvidia/tu117/gr/fecs_sig.bin");
+MODULE_FIRMWARE("nvidia/tu117/gr/gpccs_bl.bin");
+MODULE_FIRMWARE("nvidia/tu117/gr/gpccs_inst.bin");
+MODULE_FIRMWARE("nvidia/tu117/gr/gpccs_data.bin");
+MODULE_FIRMWARE("nvidia/tu117/gr/gpccs_sig.bin");
+MODULE_FIRMWARE("nvidia/tu117/gr/sw_ctx.bin");
+MODULE_FIRMWARE("nvidia/tu117/gr/sw_nonctx.bin");
+MODULE_FIRMWARE("nvidia/tu117/gr/sw_bundle_init.bin");
+MODULE_FIRMWARE("nvidia/tu117/gr/sw_method_init.bin");
+
+MODULE_FIRMWARE("nvidia/tu116/gr/fecs_bl.bin");
+MODULE_FIRMWARE("nvidia/tu116/gr/fecs_inst.bin");
+MODULE_FIRMWARE("nvidia/tu116/gr/fecs_data.bin");
+MODULE_FIRMWARE("nvidia/tu116/gr/fecs_sig.bin");
+MODULE_FIRMWARE("nvidia/tu116/gr/gpccs_bl.bin");
+MODULE_FIRMWARE("nvidia/tu116/gr/gpccs_inst.bin");
+MODULE_FIRMWARE("nvidia/tu116/gr/gpccs_data.bin");
+MODULE_FIRMWARE("nvidia/tu116/gr/gpccs_sig.bin");
+MODULE_FIRMWARE("nvidia/tu116/gr/sw_ctx.bin");
+MODULE_FIRMWARE("nvidia/tu116/gr/sw_nonctx.bin");
+MODULE_FIRMWARE("nvidia/tu116/gr/sw_bundle_init.bin");
+MODULE_FIRMWARE("nvidia/tu116/gr/sw_method_init.bin");
+
  static const struct gf100_gr_fwif
  tu102_gr_fwif[] = {
         { 0, gm200_gr_load, &tu102_gr, &gp108_gr_fecs_acr, &gp108_gr_gpccs_acr },
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/acr/tu102.c b/drivers/gpu/drm/nouveau/nvkm/subdev/acr/tu102.c
index 7f4b89d82d320a0b7ec6e0e2fe4b981b569ff2b0..d28d8f36ae2484ac4d237ad0fb3750b2d52b53b7 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/acr/tu102.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/acr/tu102.c
@@ -107,6 +107,12 @@ MODULE_FIRMWARE("nvidia/tu104/acr/ucode_unload.bin");
  MODULE_FIRMWARE("nvidia/tu106/acr/unload_bl.bin");
  MODULE_FIRMWARE("nvidia/tu106/acr/ucode_unload.bin");
  
+MODULE_FIRMWARE("nvidia/tu116/acr/unload_bl.bin");
+MODULE_FIRMWARE("nvidia/tu116/acr/ucode_unload.bin");
+
+MODULE_FIRMWARE("nvidia/tu117/acr/unload_bl.bin");
+MODULE_FIRMWARE("nvidia/tu117/acr/ucode_unload.bin");
+
  static const struct nvkm_acr_hsf_fwif
  tu102_acr_unload_fwif[] = {
         {  0, nvkm_acr_hsfw_load, &gp108_acr_unload_0 },
@@ -130,6 +136,8 @@ tu102_acr_asb_0 = {
  MODULE_FIRMWARE("nvidia/tu102/acr/ucode_asb.bin");
  MODULE_FIRMWARE("nvidia/tu104/acr/ucode_asb.bin");
  MODULE_FIRMWARE("nvidia/tu106/acr/ucode_asb.bin");
+MODULE_FIRMWARE("nvidia/tu116/acr/ucode_asb.bin");
+MODULE_FIRMWARE("nvidia/tu117/acr/ucode_asb.bin");
  
  static const struct nvkm_acr_hsf_fwif
  tu102_acr_asb_fwif[] = {
@@ -154,6 +162,12 @@ MODULE_FIRMWARE("nvidia/tu104/acr/ucode_ahesasc.bin");
  MODULE_FIRMWARE("nvidia/tu106/acr/bl.bin");
  MODULE_FIRMWARE("nvidia/tu106/acr/ucode_ahesasc.bin");
  
+MODULE_FIRMWARE("nvidia/tu116/acr/bl.bin");
+MODULE_FIRMWARE("nvidia/tu116/acr/ucode_ahesasc.bin");
+
+MODULE_FIRMWARE("nvidia/tu117/acr/bl.bin");
+MODULE_FIRMWARE("nvidia/tu117/acr/ucode_ahesasc.bin");
+
  static const struct nvkm_acr_hsf_fwif
  tu102_acr_ahesasc_fwif[] = {
         {  0, nvkm_acr_hsfw_load, &tu102_acr_ahesasc_0 },
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gv100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gv100.c
index 389bad312bf279bd15102d121728e2a11607473e..10ff5d053f7ea4e0f0645028906da0d6acc2a031 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gv100.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gv100.c
@@ -51,3 +51,5 @@ MODULE_FIRMWARE("nvidia/gv100/nvdec/scrubber.bin");
  MODULE_FIRMWARE("nvidia/tu102/nvdec/scrubber.bin");
  MODULE_FIRMWARE("nvidia/tu104/nvdec/scrubber.bin");
  MODULE_FIRMWARE("nvidia/tu106/nvdec/scrubber.bin");
+MODULE_FIRMWARE("nvidia/tu116/nvdec/scrubber.bin");
+MODULE_FIRMWARE("nvidia/tu117/nvdec/scrubber.bin");
diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index 6da59f476aba6b793c238dc8ac65bd74e1647be7..b7a618db3ee223ec491ba66279b6c46e634146a1 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -166,6 +166,7 @@ panfrost_lookup_bos(struct drm_device *dev,
                         break;
                 }
  
+               atomic_inc(&bo->gpu_usecount);
                 job->mappings[i] = mapping;
         }
  
diff --git a/drivers/gpu/drm/panfrost/panfrost_gem.h b/drivers/gpu/drm/panfrost/panfrost_gem.h
index ca1bc9019600c839d384a6564a2dcf4d8e7be87a..b3517ff9630cb23a8753d066303fb5880adcf984 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gem.h
+++ b/drivers/gpu/drm/panfrost/panfrost_gem.h
@@ -30,6 +30,12 @@ struct panfrost_gem_object {
                 struct mutex lock;
         } mappings;
  
+       /*
+        * Count the number of jobs referencing this BO so we don't let the
+        * shrinker reclaim this object prematurely.
+        */
+       atomic_t gpu_usecount;
+
         bool noexec             :1;
         bool is_heap            :1;
  };
diff --git a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
index f5dd7b29bc954909ad5d1508863c4b8a9e5db007..288e46c40673a9d102580030b8a107c2d2af3572 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gem_shrinker.c
@@ -41,6 +41,9 @@ static bool panfrost_gem_purge(struct drm_gem_object *obj)
         struct drm_gem_shmem_object *shmem = to_drm_gem_shmem_obj(obj);
         struct panfrost_gem_object *bo = to_panfrost_bo(obj);
  
+       if (atomic_read(&bo->gpu_usecount))
+               return false;
+
         if (!mutex_trylock(&shmem->pages_lock))
                 return false;
  
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index 7c36ec675b73dd464b554f177c22b996e131c930..9a1a72a748e724ca5bac36d4e05c82eb4ccd1f93 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -269,18 +269,19 @@ static void panfrost_job_cleanup(struct kref *ref)
         dma_fence_put(job->render_done_fence);
  
         if (job->mappings) {
-               for (i = 0; i < job->bo_count; i++)
+               for (i = 0; i < job->bo_count; i++) {
+                       if (!job->mappings[i])
+                               break;
+
+                       atomic_dec(&job->mappings[i]->obj->gpu_usecount);
                         panfrost_gem_mapping_put(job->mappings[i]);
+               }
                 kvfree(job->mappings);
         }
  
         if (job->bos) {
-               struct panfrost_gem_object *bo;
-
-               for (i = 0; i < job->bo_count; i++) {
-                       bo = to_panfrost_bo(job->bos[i]);
+               for (i = 0; i < job->bo_count; i++)
                         drm_gem_object_put_unlocked(job->bos[i]);
-               }
  
                 kvfree(job->bos);
         }
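
Note on the panfrost hunks above: taken together, the changes to panfrost_drv.c, panfrost_gem.h, panfrost_gem_shrinker.c and panfrost_job.c add a per-BO job counter that is incremented when a job looks up the BO, decremented in job cleanup, and checked by the shrinker before it purges. The sketch below models that counter with C11 atomics in userspace. The fake_* names are hypothetical, the example is single-threaded, and it does not model the pages_lock trylock or any other locking the driver relies on.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a buffer object with a job use counter. */
struct fake_bo {
	atomic_int gpu_usecount;      /* number of jobs referencing this BO */
	bool purged;
};

/* A job starts referencing the BO (lookup path in the patch). */
static void fake_job_attach(struct fake_bo *bo)
{
	atomic_fetch_add(&bo->gpu_usecount, 1);
}

/* A job drops its reference (cleanup path in the patch). */
static void fake_job_detach(struct fake_bo *bo)
{
	int prev = atomic_fetch_sub(&bo->gpu_usecount, 1);

	assert(prev > 0);             /* detach without attach is a bug */
	(void)prev;
}

/* Shrinker side: refuse to reclaim a BO that a job still uses. */
static bool fake_purge(struct fake_bo *bo)
{
	if (atomic_load(&bo->gpu_usecount))
		return false;

	bo->purged = true;
	return true;
}
```

With one job attached, fake_purge() returns false and leaves the object alone; once the job detaches, the same call succeeds. That is the behaviour the early "return false" in panfrost_gem_purge() is meant to provide.
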
diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index 763cfca886a73a7c90d8d6df9e0e4e36c7d2bf7d..3107b0738e401720a5ef3ceaf47ba3ddd6cc6c26 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -151,7 +151,12 @@ u32 panfrost_mmu_as_get(struct panfrost_device *pfdev, struct panfrost_mmu *mmu)
         as = mmu->as;
         if (as >= 0) {
                 int en = atomic_inc_return(&mmu->as_count);
-               WARN_ON(en >= NUM_JOB_SLOTS);
+
+               /*
+                * AS can be retained by active jobs or a perfcnt context,
+                * hence the '+ 1' here.
+                */
+               WARN_ON(en >= (NUM_JOB_SLOTS + 1));
  
                 list_move(&mmu->list, &pfdev->as_lru_list);
                 goto out;
diff --git a/drivers/gpu/drm/panfrost/panfrost_perfcnt.c b/drivers/gpu/drm/panfrost/panfrost_perfcnt.c
index 684820448be31c7d459a1f139b5003b8cacd1968..6913578d5aa7211de705877f42d8eb210be5289c 100644
--- a/drivers/gpu/drm/panfrost/panfrost_perfcnt.c
+++ b/drivers/gpu/drm/panfrost/panfrost_perfcnt.c
@@ -73,7 +73,7 @@ static int panfrost_perfcnt_enable_locked(struct panfrost_device *pfdev,
         struct panfrost_file_priv *user = file_priv->driver_priv;
         struct panfrost_perfcnt *perfcnt = pfdev->perfcnt;
         struct drm_gem_shmem_object *bo;
-       u32 cfg;
+       u32 cfg, as;
         int ret;
  
         if (user == perfcnt->user)
@@ -126,12 +126,8 @@ static int panfrost_perfcnt_enable_locked(struct panfrost_device *pfdev,
  
         perfcnt->user = user;
  
-       /*
-        * Always use address space 0 for now.
-        * FIXME: this needs to be updated when we start using different
-        * address space.
-        */
-       cfg = GPU_PERFCNT_CFG_AS(0) |
+       as = panfrost_mmu_as_get(pfdev, perfcnt->mapping->mmu);
+       cfg = GPU_PERFCNT_CFG_AS(as) |
               GPU_PERFCNT_CFG_MODE(GPU_PERFCNT_CFG_MODE_MANUAL);
  
         /*
@@ -195,6 +191,7 @@ static int panfrost_perfcnt_disable_locked(struct panfrost_device *pfdev,
         drm_gem_shmem_vunmap(&perfcnt->mapping->obj->base.base, perfcnt->buf);
         perfcnt->buf = NULL;
         panfrost_gem_close(&perfcnt->mapping->obj->base.base, file_priv);
+       panfrost_mmu_as_put(pfdev, perfcnt->mapping->mmu);
         panfrost_gem_mapping_put(perfcnt->mapping);
         perfcnt->mapping = NULL;
         pm_runtime_mark_last_busy(pfdev->dev);
diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c
index fd74e261118595e8fb4080764ee93eb961935319..8696af1ee14dc7705b9a98f6869d69526d6a1ddd 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -37,6 +37,7 @@
  #include <linux/vga_switcheroo.h>
  #include <linux/mmu_notifier.h>
  
+#include <drm/drm_agpsupport.h>
  #include <drm/drm_crtc_helper.h>
  #include <drm/drm_drv.h>
  #include <drm/drm_fb_helper.h>
@@ -325,6 +326,7 @@ static int radeon_pci_probe(struct pci_dev *pdev,
                             const struct pci_device_id *ent)
  {
         unsigned long flags = 0;
+       struct drm_device *dev;
         int ret;
  
         if (!ent)
@@ -365,7 +367,44 @@ static int radeon_pci_probe(struct pci_dev *pdev,
         if (ret)
                 return ret;
  
-       return drm_get_pci_dev(pdev, ent, &kms_driver);
+       dev = drm_dev_alloc(&kms_driver, &pdev->dev);
+       if (IS_ERR(dev))
+               return PTR_ERR(dev);
+
+       ret = pci_enable_device(pdev);
+       if (ret)
+               goto err_free;
+
+       dev->pdev = pdev;
+#ifdef __alpha__
+       dev->hose = pdev->sysdata;
+#endif
+
+       pci_set_drvdata(pdev, dev);
+
+       if (pci_find_capability(dev->pdev, PCI_CAP_ID_AGP))
+               dev->agp = drm_agp_init(dev);
+       if (dev->agp) {
+               dev->agp->agp_mtrr = arch_phys_wc_add(
+                       dev->agp->agp_info.aper_base,
+                       dev->agp->agp_info.aper_size *
+                       1024 * 1024);
+       }
+
+       ret = drm_dev_register(dev, ent->driver_data);
+       if (ret)
+               goto err_agp;
+
+       return 0;
+
+err_agp:
+       if (dev->agp)
+               arch_phys_wc_del(dev->agp->agp_mtrr);
+       kfree(dev->agp);
+       pci_disable_device(pdev);
+err_free:
+       drm_dev_put(dev);
+       return ret;
  }
  
  static void
@@ -575,7 +614,7 @@ radeon_get_crtc_scanout_position(struct drm_device *dev, unsigned int pipe,
  
  static struct drm_driver kms_driver = {
         .driver_features =
-           DRIVER_USE_AGP | DRIVER_GEM | DRIVER_RENDER,
+           DRIVER_GEM | DRIVER_RENDER,
         .load = radeon_driver_load_kms,
         .open = radeon_driver_open_kms,
         .postclose = radeon_driver_postclose_kms,
diff --git a/drivers/gpu/drm/radeon/radeon_kms.c b/drivers/gpu/drm/radeon/radeon_kms.c
index d24f23a8165602a547608777a3cc004e7cc9cc22..dd2f19b8022bd2c215fed2a257e90307b666ea86 100644
--- a/drivers/gpu/drm/radeon/radeon_kms.c
+++ b/drivers/gpu/drm/radeon/radeon_kms.c
@@ -32,6 +32,7 @@
  #include <linux/uaccess.h>
  #include <linux/vga_switcheroo.h>
  
+#include <drm/drm_agpsupport.h>
  #include <drm/drm_fb_helper.h>
  #include <drm/drm_file.h>
  #include <drm/drm_ioctl.h>
@@ -77,6 +78,11 @@ void radeon_driver_unload_kms(struct drm_device *dev)
         radeon_modeset_fini(rdev);
         radeon_device_fini(rdev);
  
+       if (dev->agp)
+               arch_phys_wc_del(dev->agp->agp_mtrr);
+       kfree(dev->agp);
+       dev->agp = NULL;
+
  done_free:
         kfree(rdev);
         dev->dev_private = NULL;
diff --git a/drivers/gpu/drm/selftests/drm_cmdline_selftests.h b/drivers/gpu/drm/selftests/drm_cmdline_selftests.h
index ceac7af9a172ddf4842e4f2f0500d2829427d488..29e367db6118ba125ec9ad23fc0cba393b2a7ca2 100644
--- a/drivers/gpu/drm/selftests/drm_cmdline_selftests.h
+++ b/drivers/gpu/drm/selftests/drm_cmdline_selftests.h
@@ -53,6 +53,7 @@ cmdline_test(drm_cmdline_test_rotate_0)
  cmdline_test(drm_cmdline_test_rotate_90)
  cmdline_test(drm_cmdline_test_rotate_180)
  cmdline_test(drm_cmdline_test_rotate_270)
+cmdline_test(drm_cmdline_test_rotate_multiple)
  cmdline_test(drm_cmdline_test_rotate_invalid_val)
  cmdline_test(drm_cmdline_test_rotate_truncated)
  cmdline_test(drm_cmdline_test_hmirror)
diff --git a/drivers/gpu/drm/selftests/test-drm_cmdline_parser.c b/drivers/gpu/drm/selftests/test-drm_cmdline_parser.c
index 520f3e66a384a27f694a92f823a489ca439a88bf..d96cd890def6eafcd8773e9180928255072bdd62 100644
--- a/drivers/gpu/drm/selftests/test-drm_cmdline_parser.c
+++ b/drivers/gpu/drm/selftests/test-drm_cmdline_parser.c
@@ -856,6 +856,17 @@ static int drm_cmdline_test_rotate_270(void *ignored)
         return 0;
  }
  
+static int drm_cmdline_test_rotate_multiple(void *ignored)
+{
+       struct drm_cmdline_mode mode = { };
+
+       FAIL_ON(drm_mode_parse_command_line_for_connector("720x480,rotate=0,rotate=90",
+                                                         &no_connector,
+                                                         &mode));
+
+       return 0;
+}
+
  static int drm_cmdline_test_rotate_invalid_val(void *ignored)
  {
         struct drm_cmdline_mode mode = { };
@@ -888,7 +899,7 @@ static int drm_cmdline_test_hmirror(void *ignored)
         FAIL_ON(!mode.specified);
         FAIL_ON(mode.xres != 720);
         FAIL_ON(mode.yres != 480);
-       FAIL_ON(mode.rotation_reflection != DRM_MODE_REFLECT_X);
+       FAIL_ON(mode.rotation_reflection != (DRM_MODE_ROTATE_0 | DRM_MODE_REFLECT_X));
  
         FAIL_ON(mode.refresh_specified);
  
@@ -913,7 +924,7 @@ static int drm_cmdline_test_vmirror(void *ignored)
         FAIL_ON(!mode.specified);
         FAIL_ON(mode.xres != 720);
         FAIL_ON(mode.yres != 480);
-       FAIL_ON(mode.rotation_reflection != DRM_MODE_REFLECT_Y);
+       FAIL_ON(mode.rotation_reflection != (DRM_MODE_ROTATE_0 | DRM_MODE_REFLECT_Y));
  
         FAIL_ON(mode.refresh_specified);
  
diff --git a/drivers/gpu/drm/sun4i/sun4i_drv.c b/drivers/gpu/drm/sun4i/sun4i_drv.c
index 5ae67d526b1de8b753eb724496c64d84706ed1b0..328272ff77d84d737c83327b8aa0ab30268190db 100644
--- a/drivers/gpu/drm/sun4i/sun4i_drv.c
+++ b/drivers/gpu/drm/sun4i/sun4i_drv.c
@@ -85,7 +85,6 @@ static int sun4i_drv_bind(struct device *dev)
         }
  
         drm_mode_config_init(drm);
-       drm->mode_config.allow_fb_modifiers = true;
  
         ret = component_bind_all(drm->dev, drm);
         if (ret) {
diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c
index 5bd60ded3d8151d2e63b85d7c7afbb626a689665..909eba43664a28f857070ec7090482f53727416d 100644
--- a/drivers/gpu/drm/vgem/vgem_drv.c
+++ b/drivers/gpu/drm/vgem/vgem_drv.c
@@ -196,9 +196,10 @@ static struct drm_gem_object *vgem_gem_create(struct drm_device *dev,
                 return ERR_CAST(obj);
  
         ret = drm_gem_handle_create(file, &obj->base, handle);
-       drm_gem_object_put_unlocked(&obj->base);
-       if (ret)
+       if (ret) {
+               drm_gem_object_put_unlocked(&obj->base);
                 return ERR_PTR(ret);
+       }
  
         return &obj->base;
  }
@@ -221,7 +222,9 @@ static int vgem_gem_dumb_create(struct drm_file *file, struct drm_device *dev,
         args->size = gem_object->size;
         args->pitch = pitch;
  
-       DRM_DEBUG("Created object of size %lld\n", size);
+       drm_gem_object_put_unlocked(gem_object);
+
+       DRM_DEBUG("Created object of size %llu\n", args->size);
  
         return 0;
  }
diff --git a/drivers/hid/hid-alps.c b/drivers/hid/hid-alps.c
index ae79a7c667372e6e54b8302565870c3f24b2ce85..fa704153cb00d5f7fa7d0185acd1f953fdd37754 100644
--- a/drivers/hid/hid-alps.c
+++ b/drivers/hid/hid-alps.c
@@ -730,7 +730,7 @@ static int alps_input_configured(struct hid_device *hdev, struct hid_input *hi)
         if (data->has_sp) {
                 input2 = input_allocate_device();
                 if (!input2) {
-                       input_free_device(input2);
+                       ret = -ENOMEM;
                         goto exit;
                 }
  
diff --git a/drivers/hid/hid-apple.c b/drivers/hid/hid-apple.c
index 6ac8becc2372efdd303ba29d14a246e9f5290239..d732d1d10cafb7bbafbf27ec59ff17bd5aae2c94 100644
--- a/drivers/hid/hid-apple.c
+++ b/drivers/hid/hid-apple.c
@@ -340,7 +340,8 @@ static int apple_input_mapping(struct hid_device *hdev, struct hid_input *hi,
                 unsigned long **bit, int *max)
  {
         if (usage->hid == (HID_UP_CUSTOM | 0x0003) ||
-                       usage->hid == (HID_UP_MSVENDOR | 0x0003)) {
+                       usage->hid == (HID_UP_MSVENDOR | 0x0003) ||
+                       usage->hid == (HID_UP_HPVENDOR2 | 0x0003)) {
                 /* The fn key on Apple USB keyboards */
                 set_bit(EV_REP, hi->input->evbit);
                 hid_map_usage_clear(hi, usage, bit, max, EV_KEY, KEY_FN);
diff --git a/drivers/hid/hid-bigbenff.c b/drivers/hid/hid-bigbenff.c
index 3f6abd190df43ef17cb6080505fe7761c38324fe..db6da21ade06315c457d087447cecf563eac8c55 100644
--- a/drivers/hid/hid-bigbenff.c
+++ b/drivers/hid/hid-bigbenff.c
@@ -174,6 +174,7 @@ static __u8 pid0902_rdesc_fixed[] = {
  struct bigben_device {
         struct hid_device *hid;
         struct hid_report *report;
+       bool removed;
         u8 led_state;         /* LED1 = 1 .. LED4 = 8 */
         u8 right_motor_on;    /* right motor off/on 0/1 */
         u8 left_motor_force;  /* left motor force 0-255 */
@@ -190,6 +191,9 @@ static void bigben_worker(struct work_struct *work)
                 struct bigben_device, worker);
         struct hid_field *report_field = bigben->report->field[0];
  
+       if (bigben->removed)
+               return;
+
         if (bigben->work_led) {
                 bigben->work_led = false;
                 report_field->value[0] = 0x01; /* 1 = led message */
@@ -220,10 +224,16 @@ static void bigben_worker(struct work_struct *work)
  static int hid_bigben_play_effect(struct input_dev *dev, void *data,
                          struct ff_effect *effect)
  {
-       struct bigben_device *bigben = data;
+       struct hid_device *hid = input_get_drvdata(dev);
+       struct bigben_device *bigben = hid_get_drvdata(hid);
         u8 right_motor_on;
         u8 left_motor_force;
  
+       if (!bigben) {
+               hid_err(hid, "no device data\n");
+               return 0;
+       }
+
         if (effect->type != FF_RUMBLE)
                 return 0;
  
@@ -298,8 +308,8 @@ static void bigben_remove(struct hid_device *hid)
  {
         struct bigben_device *bigben = hid_get_drvdata(hid);
  
+       bigben->removed = true;
         cancel_work_sync(&bigben->worker);
-       hid_hw_close(hid);
         hid_hw_stop(hid);
  }
  
@@ -319,6 +329,7 @@ static int bigben_probe(struct hid_device *hid,
                 return -ENOMEM;
         hid_set_drvdata(hid, bigben);
         bigben->hid = hid;
+       bigben->removed = false;
  
         error = hid_parse(hid);
         if (error) {
@@ -341,10 +352,10 @@ static int bigben_probe(struct hid_device *hid,
  
         INIT_WORK(&bigben->worker, bigben_worker);
  
-       error = input_ff_create_memless(hidinput->input, bigben,
+       error = input_ff_create_memless(hidinput->input, NULL,
                 hid_bigben_play_effect);
         if (error)
-               return error;
+               goto error_hw_stop;
  
         name_sz = strlen(dev_name(&hid->dev)) + strlen(":red:bigben#") + 1;
  
@@ -354,8 +365,10 @@ static int bigben_probe(struct hid_device *hid,
                         sizeof(struct led_classdev) + name_sz,
                         GFP_KERNEL
                 );
-               if (!led)
-                       return -ENOMEM;
+               if (!led) {
+                       error = -ENOMEM;
+                       goto error_hw_stop;
+               }
                 name = (void *)(&led[1]);
                 snprintf(name, name_sz,
                         "%s:red:bigben%d",
@@ -369,7 +382,7 @@ static int bigben_probe(struct hid_device *hid,
                 bigben->leds[n] = led;
                 error = devm_led_classdev_register(&hid->dev, led);
                 if (error)
-                       return error;
+                       goto error_hw_stop;
         }
  
         /* initial state: LED1 is on, no rumble effect */
@@ -383,6 +396,10 @@ static int bigben_probe(struct hid_device *hid,
         hid_info(hid, "LED and force feedback support for BigBen gamepad\n");
  
         return 0;
+
+error_hw_stop:
+       hid_hw_stop(hid);
+       return error;
  }
  
  static __u8 *bigben_report_fixup(struct hid_device *hid, __u8 *rdesc,
diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
index 851fe54ea59e7c2636ab204372c49b749a136014..359616e3efbbb244e633b4a37ae6361c30abf19f 100644
--- a/drivers/hid/hid-core.c
+++ b/drivers/hid/hid-core.c
@@ -1741,7 +1741,9 @@ int hid_report_raw_event(struct hid_device *hid, int type, u8 *data, u32 size,
  
         rsize = ((report->size - 1) >> 3) + 1;
  
-       if (rsize > HID_MAX_BUFFER_SIZE)
+       if (report_enum->numbered && rsize >= HID_MAX_BUFFER_SIZE)
+               rsize = HID_MAX_BUFFER_SIZE - 1;
+       else if (rsize > HID_MAX_BUFFER_SIZE)
                 rsize = HID_MAX_BUFFER_SIZE;
  
         if (csize < rsize) {
diff --git a/drivers/hid/hid-ite.c b/drivers/hid/hid-ite.c
index c436e12feb23315f141362dccc38c2c549c8d9f9..6c55682c597409d1fcdbfbaad7a4974010b50857 100644
--- a/drivers/hid/hid-ite.c
+++ b/drivers/hid/hid-ite.c
@@ -41,8 +41,9 @@ static const struct hid_device_id ite_devices[] = {
         { HID_USB_DEVICE(USB_VENDOR_ID_ITE, USB_DEVICE_ID_ITE8595) },
         { HID_USB_DEVICE(USB_VENDOR_ID_258A, USB_DEVICE_ID_258A_6A88) },
         /* ITE8595 USB kbd ctlr, with Synaptics touchpad connected to it. */
-       { HID_USB_DEVICE(USB_VENDOR_ID_SYNAPTICS,
-                        USB_DEVICE_ID_SYNAPTICS_ACER_SWITCH5_012) },
+       { HID_DEVICE(BUS_USB, HID_GROUP_GENERIC,
+                    USB_VENDOR_ID_SYNAPTICS,
+                    USB_DEVICE_ID_SYNAPTICS_ACER_SWITCH5_012) },
         { }
  };
  MODULE_DEVICE_TABLE(hid, ite_devices);
diff --git a/drivers/hid/hid-logitech-hidpp.c b/drivers/hid/hid-logitech-hidpp.c
index 70e1cb928bf038876cd603e99302ea7e5687c426..094f4f1b6555b49315d44632a70cb8c6205f24db 100644
--- a/drivers/hid/hid-logitech-hidpp.c
+++ b/drivers/hid/hid-logitech-hidpp.c
@@ -1256,36 +1256,35 @@ static int hidpp20_battery_map_status_voltage(u8 data[3], int *voltage,
  {
         int status;
  
-       long charge_sts = (long)data[2];
+       long flags = (long) data[2];
  
-       *level = POWER_SUPPLY_CAPACITY_LEVEL_UNKNOWN;
-       switch (data[2] & 0xe0) {
-       case 0x00:
-               status = POWER_SUPPLY_STATUS_CHARGING;
-               break;
-       case 0x20:
-               status = POWER_SUPPLY_STATUS_FULL;
-               *level = POWER_SUPPLY_CAPACITY_LEVEL_FULL;
-               break;
-       case 0x40:
+       if (flags & 0x80)
+               switch (flags & 0x07) {
+               case 0:
+                       status = POWER_SUPPLY_STATUS_CHARGING;
+                       break;
+               case 1:
+                       status = POWER_SUPPLY_STATUS_FULL;
+                       *level = POWER_SUPPLY_CAPACITY_LEVEL_FULL;
+                       break;
+               case 2:
+                       status = POWER_SUPPLY_STATUS_NOT_CHARGING;
+                       break;
+               default:
+                       status = POWER_SUPPLY_STATUS_UNKNOWN;
+                       break;
+               }
+       else
                 status = POWER_SUPPLY_STATUS_DISCHARGING;
-               break;
-       case 0xe0:
-               status = POWER_SUPPLY_STATUS_NOT_CHARGING;
-               break;
-       default:
-               status = POWER_SUPPLY_STATUS_UNKNOWN;
-       }
  
         *charge_type = POWER_SUPPLY_CHARGE_TYPE_STANDARD;
-       if (test_bit(3, &charge_sts)) {
+       if (test_bit(3, &flags)) {
                 *charge_type = POWER_SUPPLY_CHARGE_TYPE_FAST;
         }
-       if (test_bit(4, &charge_sts)) {
+       if (test_bit(4, &flags)) {
                 *charge_type = POWER_SUPPLY_CHARGE_TYPE_TRICKLE;
         }
-
-       if (test_bit(5, &charge_sts)) {
+       if (test_bit(5, &flags)) {
                 *level = POWER_SUPPLY_CAPACITY_LEVEL_CRITICAL;
         }
  
diff --git a/drivers/hid/i2c-hid/i2c-hid-dmi-quirks.c b/drivers/hid/i2c-hid/i2c-hid-dmi-quirks.c
index d31ea82b84c173033dd1a6582122ec9197671fbc..a66f08041a1aa105d1605210c3f70b71d4afbcc3 100644
--- a/drivers/hid/i2c-hid/i2c-hid-dmi-quirks.c
+++ b/drivers/hid/i2c-hid/i2c-hid-dmi-quirks.c
@@ -341,6 +341,14 @@ static const struct dmi_system_id i2c_hid_dmi_desc_override_table[] = {
                 },
                 .driver_data = (void *)&sipodev_desc
         },
+       {
+               .ident = "Trekstor SURFBOOK E11B",
+               .matches = {
+                       DMI_EXACT_MATCH(DMI_SYS_VENDOR, "TREKSTOR"),
+                       DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "SURFBOOK E11B"),
+               },
+               .driver_data = (void *)&sipodev_desc
+       },
         {
                 .ident = "Direkt-Tek DTLAPY116-2",
                 .matches = {
diff --git a/drivers/hid/usbhid/hiddev.c b/drivers/hid/usbhid/hiddev.c
index a970b809d778c3c9a74702abe70c9f1f976a06d9..4140dea693e90988d80bac6129458499fb2d03af 100644
--- a/drivers/hid/usbhid/hiddev.c
+++ b/drivers/hid/usbhid/hiddev.c
@@ -932,9 +932,9 @@ void hiddev_disconnect(struct hid_device *hid)
         hiddev->exist = 0;
  
         if (hiddev->open) {
-               mutex_unlock(&hiddev->existancelock);
                 hid_hw_close(hiddev->hid);
                 wake_up_interruptible(&hiddev->wait);
+               mutex_unlock(&hiddev->existancelock);
         } else {
                 mutex_unlock(&hiddev->existancelock);
                 kfree(hiddev);
diff --git a/drivers/hwmon/acpi_power_meter.c b/drivers/hwmon/acpi_power_meter.c
index 4cf25458f0b9515e419a6f0b88047bf35809a2e5..0db8ef4fd6e18b7464f3c8e4b6c0de741f8e0998 100644
--- a/drivers/hwmon/acpi_power_meter.c
+++ b/drivers/hwmon/acpi_power_meter.c
@@ -355,7 +355,9 @@ static ssize_t show_str(struct device *dev,
         struct acpi_device *acpi_dev = to_acpi_device(dev);
         struct acpi_power_meter_resource *resource = acpi_dev->driver_data;
         acpi_string val;
+       int ret;
  
+       mutex_lock(&resource->lock);
         switch (attr->index) {
         case 0:
                 val = resource->model_number;
@@ -372,8 +374,9 @@ static ssize_t show_str(struct device *dev,
                 val = "";
                 break;
         }
-
-       return sprintf(buf, "%s\n", val);
+       ret = sprintf(buf, "%s\n", val);
+       mutex_unlock(&resource->lock);
+       return ret;
  }
  
  static ssize_t show_val(struct device *dev,
@@ -817,11 +820,12 @@ static void acpi_power_meter_notify(struct acpi_device *device, u32 event)
  
         resource = acpi_driver_data(device);
  
-       mutex_lock(&resource->lock);
         switch (event) {
         case METER_NOTIFY_CONFIG:
+               mutex_lock(&resource->lock);
                 free_capabilities(resource);
                 res = read_capabilities(resource);
+               mutex_unlock(&resource->lock);
                 if (res)
                         break;
  
@@ -830,15 +834,12 @@ static void acpi_power_meter_notify(struct acpi_device *device, u32 event)
                 break;
         case METER_NOTIFY_TRIP:
                 sysfs_notify(&device->dev.kobj, NULL, POWER_AVERAGE_NAME);
-               update_meter(resource);
                 break;
         case METER_NOTIFY_CAP:
                 sysfs_notify(&device->dev.kobj, NULL, POWER_CAP_NAME);
-               update_cap(resource);
                 break;
         case METER_NOTIFY_INTERVAL:
                 sysfs_notify(&device->dev.kobj, NULL, POWER_AVG_INTERVAL_NAME);
-               update_avg_interval(resource);
                 break;
         case METER_NOTIFY_CAPPING:
                 sysfs_notify(&device->dev.kobj, NULL, POWER_ALARM_NAME);
@@ -848,7 +849,6 @@ static void acpi_power_meter_notify(struct acpi_device *device, u32 event)
                 WARN(1, "Unexpected event %d\n", event);
                 break;
         }
-       mutex_unlock(&resource->lock);
  
         acpi_bus_generate_netlink_event(ACPI_POWER_METER_CLASS,
                                         dev_name(&device->dev), event, 0);
@@ -912,8 +912,8 @@ static int acpi_power_meter_remove(struct acpi_device *device)
         resource = acpi_driver_data(device);
         hwmon_device_unregister(resource->hwmon_dev);
  
-       free_capabilities(resource);
         remove_attrs(resource);
+       free_capabilities(resource);
  
         kfree(resource);
         return 0;
diff --git a/drivers/hwmon/pmbus/ltc2978.c b/drivers/hwmon/pmbus/ltc2978.c
index f01f4887fb2e6b92d416d17a8ea75233bdd1ca91..a91ed01abb68050e4c6ffb231c21c5ab8b7585f1 100644
--- a/drivers/hwmon/pmbus/ltc2978.c
+++ b/drivers/hwmon/pmbus/ltc2978.c
@@ -82,8 +82,8 @@ enum chips { ltc2974, ltc2975, ltc2977, ltc2978, ltc2980, ltc3880, ltc3882,
  
  #define LTC_POLL_TIMEOUT               100     /* in milli-seconds */
  
-#define LTC_NOT_BUSY                   BIT(5)
-#define LTC_NOT_PENDING                        BIT(4)
+#define LTC_NOT_BUSY                   BIT(6)
+#define LTC_NOT_PENDING                        BIT(5)
  
  /*
   * LTC2978 clears peak data whenever the CLEAR_FAULTS command is executed, which
diff --git a/drivers/hwmon/pmbus/xdpe12284.c b/drivers/hwmon/pmbus/xdpe12284.c
index 3d47806ff4d3f4704aeaf0e59099bec539b611e8..ecd9b65627ecdd57ded7dc383bccf6081da9559f 100644
--- a/drivers/hwmon/pmbus/xdpe12284.c
+++ b/drivers/hwmon/pmbus/xdpe12284.c
@@ -94,8 +94,8 @@ static const struct i2c_device_id xdpe122_id[] = {
  MODULE_DEVICE_TABLE(i2c, xdpe122_id);
  
  static const struct of_device_id __maybe_unused xdpe122_of_match[] = {
-       {.compatible = "infineon, xdpe12254"},
-       {.compatible = "infineon, xdpe12284"},
+       {.compatible = "infineon,xdpe12254"},
+       {.compatible = "infineon,xdpe12284"},
         {}
  };
  MODULE_DEVICE_TABLE(of, xdpe122_of_match);
diff --git a/drivers/hwmon/w83627ehf.c b/drivers/hwmon/w83627ehf.c
index 7ffadc2da57b537d2638f0829fa778318546a1e3..5a5120121e50725c49a39356ab613b72beb58c97 100644
--- a/drivers/hwmon/w83627ehf.c
+++ b/drivers/hwmon/w83627ehf.c
@@ -1346,8 +1346,13 @@ w83627ehf_is_visible(const void *drvdata, enum hwmon_sensor_types type,
                 /* channel 0.., name 1.. */
                 if (!(data->have_temp & (1 << channel)))
                         return 0;
-               if (attr == hwmon_temp_input || attr == hwmon_temp_label)
+               if (attr == hwmon_temp_input)
                         return 0444;
+               if (attr == hwmon_temp_label) {
+                       if (data->temp_label)
+                               return 0444;
+                       return 0;
+               }
                 if (channel == 2 && data->temp3_val_only)
                         return 0;
                 if (attr == hwmon_temp_max) {
diff --git a/drivers/i2c/busses/i2c-altera.c b/drivers/i2c/busses/i2c-altera.c
index 5255d3755411b29285b7170aca298b29e4f4a4e9..1de23b4f3809c507f8f23eac1b7e22641963f537 100644
--- a/drivers/i2c/busses/i2c-altera.c
+++ b/drivers/i2c/busses/i2c-altera.c
@@ -171,7 +171,7 @@ static void altr_i2c_init(struct altr_i2c_dev *idev)
         /* SCL Low Time */
         writel(t_low, idev->base + ALTR_I2C_SCL_LOW);
         /* SDA Hold Time, 300ns */
-       writel(div_u64(300 * clk_mhz, 1000), idev->base + ALTR_I2C_SDA_HOLD);
+       writel(3 * clk_mhz / 10, idev->base + ALTR_I2C_SDA_HOLD);
  
         /* Mask all master interrupt bits */
         altr_i2c_int_enable(idev, ALTR_I2C_ALL_IRQ, false);
diff --git a/drivers/i2c/busses/i2c-jz4780.c b/drivers/i2c/busses/i2c-jz4780.c
index 16a67a64284a04c2d58192e1e763a63f3d13f51f..b426fc9569387d431a955fe3f274f87f735c11f9 100644
--- a/drivers/i2c/busses/i2c-jz4780.c
+++ b/drivers/i2c/busses/i2c-jz4780.c
@@ -78,25 +78,6 @@
  
  #define X1000_I2C_DC_STOP              BIT(9)
  
-static const char * const jz4780_i2c_abrt_src[] = {
-       "ABRT_7B_ADDR_NOACK",
-       "ABRT_10ADDR1_NOACK",
-       "ABRT_10ADDR2_NOACK",
-       "ABRT_XDATA_NOACK",
-       "ABRT_GCALL_NOACK",
-       "ABRT_GCALL_READ",
-       "ABRT_HS_ACKD",
-       "SBYTE_ACKDET",
-       "ABRT_HS_NORSTRT",
-       "SBYTE_NORSTRT",
-       "ABRT_10B_RD_NORSTRT",
-       "ABRT_MASTER_DIS",
-       "ARB_LOST",
-       "SLVFLUSH_TXFIFO",
-       "SLV_ARBLOST",
-       "SLVRD_INTX",
-};
-
  #define JZ4780_I2C_INTST_IGC           BIT(11)
  #define JZ4780_I2C_INTST_ISTT          BIT(10)
  #define JZ4780_I2C_INTST_ISTP          BIT(9)
@@ -576,21 +557,8 @@ done:
  
  static void jz4780_i2c_txabrt(struct jz4780_i2c *i2c, int src)
  {
-       int i;
-
-       dev_err(&i2c->adap.dev, "txabrt: 0x%08x\n", src);
-       dev_err(&i2c->adap.dev, "device addr=%x\n",
-               jz4780_i2c_readw(i2c, JZ4780_I2C_TAR));
-       dev_err(&i2c->adap.dev, "send cmd count:%d  %d\n",
-               i2c->cmd, i2c->cmd_buf[i2c->cmd]);
-       dev_err(&i2c->adap.dev, "receive data count:%d  %d\n",
-               i2c->cmd, i2c->data_buf[i2c->cmd]);
-
-       for (i = 0; i < 16; i++) {
-               if (src & BIT(i))
-                       dev_dbg(&i2c->adap.dev, "I2C TXABRT[%d]=%s\n",
-                               i, jz4780_i2c_abrt_src[i]);
-       }
+       dev_dbg(&i2c->adap.dev, "txabrt: 0x%08x, cmd: %d, send: %d, recv: %d\n",
+               src, i2c->cmd, i2c->cmd_buf[i2c->cmd], i2c->data_buf[i2c->cmd]);
  }
  
  static inline int jz4780_i2c_xfer_read(struct jz4780_i2c *i2c,
diff --git a/drivers/ide/ide-gd.c b/drivers/ide/ide-gd.c
index 1bb99b5563930c810855d2fe2f23e83e8d619147..05c26986637ba3ad8d29d46387d8cb9f45219402 100644
--- a/drivers/ide/ide-gd.c
+++ b/drivers/ide/ide-gd.c
@@ -361,7 +361,7 @@ static const struct block_device_operations ide_gd_ops = {
         .release                = ide_gd_release,
         .ioctl                  = ide_gd_ioctl,
  #ifdef CONFIG_COMPAT
-       .ioctl                  = ide_gd_compat_ioctl,
+       .compat_ioctl           = ide_gd_compat_ioctl,
  #endif
         .getgeo                 = ide_gd_getgeo,
         .check_events           = ide_gd_check_events,
diff --git a/drivers/infiniband/core/security.c b/drivers/infiniband/core/security.c
index 6eb6d2717ca5b2b392acc723d82ca5efcac1d5ce..2b4d80393bd0dcc8f166effc34709d1de6fbef7c 100644
--- a/drivers/infiniband/core/security.c
+++ b/drivers/infiniband/core/security.c
@@ -339,22 +339,16 @@ static struct ib_ports_pkeys *get_new_pps(const struct ib_qp *qp,
         if (!new_pps)
                 return NULL;
  
-       if (qp_attr_mask & (IB_QP_PKEY_INDEX | IB_QP_PORT)) {
-               if (!qp_pps) {
-                       new_pps->main.port_num = qp_attr->port_num;
-                       new_pps->main.pkey_index = qp_attr->pkey_index;
-               } else {
-                       new_pps->main.port_num = (qp_attr_mask & IB_QP_PORT) ?
-                                                 qp_attr->port_num :
-                                                 qp_pps->main.port_num;
-
-                       new_pps->main.pkey_index =
-                                       (qp_attr_mask & IB_QP_PKEY_INDEX) ?
-                                        qp_attr->pkey_index :
-                                        qp_pps->main.pkey_index;
-               }
+       if (qp_attr_mask & IB_QP_PORT)
+               new_pps->main.port_num =
+                       (qp_pps) ? qp_pps->main.port_num : qp_attr->port_num;
+       if (qp_attr_mask & IB_QP_PKEY_INDEX)
+               new_pps->main.pkey_index = (qp_pps) ? qp_pps->main.pkey_index :
+                                                     qp_attr->pkey_index;
+       if ((qp_attr_mask & IB_QP_PKEY_INDEX) && (qp_attr_mask & IB_QP_PORT))
                 new_pps->main.state = IB_PORT_PKEY_VALID;
-       } else if (qp_pps) {
+
+       if (!(qp_attr_mask & (IB_QP_PKEY_INDEX | IB_QP_PORT)) && qp_pps) {
                 new_pps->main.port_num = qp_pps->main.port_num;
                 new_pps->main.pkey_index = qp_pps->main.pkey_index;
                 if (qp_pps->main.state != IB_PORT_PKEY_NOT_VALID)
diff --git a/drivers/infiniband/core/user_mad.c b/drivers/infiniband/core/user_mad.c
index d1407fa378e832bdfe45ba4c4b35645bd59b3a89..1235ffb2389b1c00feed521311fb54d019337c4a 100644
--- a/drivers/infiniband/core/user_mad.c
+++ b/drivers/infiniband/core/user_mad.c
@@ -1312,6 +1312,9 @@ static void ib_umad_kill_port(struct ib_umad_port *port)
         struct ib_umad_file *file;
         int id;
  
+       cdev_device_del(&port->sm_cdev, &port->sm_dev);
+       cdev_device_del(&port->cdev, &port->dev);
+
         mutex_lock(&port->file_mutex);
  
         /* Mark ib_dev NULL and block ioctl or other file ops to progress
@@ -1331,8 +1334,6 @@ static void ib_umad_kill_port(struct ib_umad_port *port)
  
         mutex_unlock(&port->file_mutex);
  
-       cdev_device_del(&port->sm_cdev, &port->sm_dev);
-       cdev_device_del(&port->cdev, &port->dev);
         ida_free(&umad_ida, port->dev_num);
  
         /* balances device_initialize() */
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index c8693f5231dd7dcfe678469662b628863df4a416..025933752e1da5235638c6080f1548c1894376c1 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -2745,12 +2745,6 @@ static int kern_spec_to_ib_spec_action(struct uverbs_attr_bundle *attrs,
         return 0;
  }
  
-static size_t kern_spec_filter_sz(const struct ib_uverbs_flow_spec_hdr *spec)
-{
-       /* Returns user space filter size, includes padding */
-       return (spec->size - sizeof(struct ib_uverbs_flow_spec_hdr)) / 2;
-}
-
  static ssize_t spec_filter_size(const void *kern_spec_filter, u16 kern_filter_size,
                                 u16 ib_real_filter_sz)
  {
@@ -2894,11 +2888,16 @@ int ib_uverbs_kern_spec_to_ib_spec_filter(enum ib_flow_spec_type type,
  static int kern_spec_to_ib_spec_filter(struct ib_uverbs_flow_spec *kern_spec,
                                        union ib_flow_spec *ib_spec)
  {
-       ssize_t kern_filter_sz;
+       size_t kern_filter_sz;
         void *kern_spec_mask;
         void *kern_spec_val;
  
-       kern_filter_sz = kern_spec_filter_sz(&kern_spec->hdr);
+       if (check_sub_overflow((size_t)kern_spec->hdr.size,
+                              sizeof(struct ib_uverbs_flow_spec_hdr),
+                              &kern_filter_sz))
+               return -EINVAL;
+
+       kern_filter_sz /= 2;
  
         kern_spec_val = (void *)kern_spec +
                 sizeof(struct ib_uverbs_flow_spec_hdr);
diff --git a/drivers/infiniband/core/uverbs_std_types.c b/drivers/infiniband/core/uverbs_std_types.c
index 994d8744b246921ea1c3f0b89a159d621acebecf..3abfc63225cbfa24aea4c991d808b355a21eb0f9 100644
--- a/drivers/infiniband/core/uverbs_std_types.c
+++ b/drivers/infiniband/core/uverbs_std_types.c
@@ -220,6 +220,7 @@ void ib_uverbs_free_event_queue(struct ib_uverbs_event_queue *event_queue)
         list_for_each_entry_safe(entry, tmp, &event_queue->event_list, list) {
                 if (entry->counter)
                         list_del(&entry->obj_list);
+               list_del(&entry->list);
                 kfree(entry);
         }
         spin_unlock_irq(&event_queue->lock);
diff --git a/drivers/infiniband/hw/cxgb4/cm.c b/drivers/infiniband/hw/cxgb4/cm.c
index ee1182f9b627e5db1b6192c836e6cbe1a5bf9157..d69dece3b1d541ad3b834cc9ea128de7c9f20168 100644
--- a/drivers/infiniband/hw/cxgb4/cm.c
+++ b/drivers/infiniband/hw/cxgb4/cm.c
@@ -3036,6 +3036,10 @@ static int terminate(struct c4iw_dev *dev, struct sk_buff *skb)
                                        C4IW_QP_ATTR_NEXT_STATE, &attrs, 1);
                 }
  
+               /* As per draft-hilland-iwarp-verbs-v1.0, sec 6.2.3,
+                * when entering the TERM state the RNIC MUST initiate a CLOSE.
+                */
+               c4iw_ep_disconnect(ep, 1, GFP_KERNEL);
                 c4iw_put_ep(&ep->com);
         } else
                 pr_warn("TERM received tid %u no ep/qp\n", tid);
diff --git a/drivers/infiniband/hw/cxgb4/qp.c b/drivers/infiniband/hw/cxgb4/qp.c
index bbcac539777a2f62281963bf988465c6add06d58..89ac2f9ae6dd8219cf24d70b5f046c9ff9d255a5 100644
--- a/drivers/infiniband/hw/cxgb4/qp.c
+++ b/drivers/infiniband/hw/cxgb4/qp.c
@@ -1948,10 +1948,10 @@ int c4iw_modify_qp(struct c4iw_dev *rhp, struct c4iw_qp *qhp,
                         qhp->attr.layer_etype = attrs->layer_etype;
                         qhp->attr.ecode = attrs->ecode;
                         ep = qhp->ep;
-                       c4iw_get_ep(&ep->com);
-                       disconnect = 1;
                         if (!internal) {
+                               c4iw_get_ep(&ep->com);
                                 terminate = 1;
+                               disconnect = 1;
                         } else {
                                 terminate = qhp->attr.send_term;
                                 ret = rdma_fini(rhp, qhp, ep);
diff --git a/drivers/infiniband/hw/hfi1/affinity.c b/drivers/infiniband/hw/hfi1/affinity.c
index c142b23bb40183d61bc009a1904b65768bac711c..1aeea5d65c0159756605dbfbc660dce7e1c4bd13 100644
--- a/drivers/infiniband/hw/hfi1/affinity.c
+++ b/drivers/infiniband/hw/hfi1/affinity.c
@@ -479,6 +479,8 @@ static int _dev_comp_vect_mappings_create(struct hfi1_devdata *dd,
                           rvt_get_ibdev_name(&(dd)->verbs_dev.rdi), i, cpu);
         }
  
+       free_cpumask_var(available_cpus);
+       free_cpumask_var(non_intr_cpus);
         return 0;
  
  fail:
diff --git a/drivers/infiniband/hw/hfi1/file_ops.c b/drivers/infiniband/hw/hfi1/file_ops.c
index bef6946861b24432c99fc4cd06621f9c4ce72d53..259115886d35145a8ffbde8a804b93b7c96dec4a 100644
--- a/drivers/infiniband/hw/hfi1/file_ops.c
+++ b/drivers/infiniband/hw/hfi1/file_ops.c
@@ -200,23 +200,24 @@ static int hfi1_file_open(struct inode *inode, struct file *fp)
  
         fd = kzalloc(sizeof(*fd), GFP_KERNEL);
  
-       if (fd) {
-               fd->rec_cpu_num = -1; /* no cpu affinity by default */
-               fd->mm = current->mm;
-               mmgrab(fd->mm);
-               fd->dd = dd;
-               kobject_get(&fd->dd->kobj);
-               fp->private_data = fd;
-       } else {
-               fp->private_data = NULL;
-
-               if (atomic_dec_and_test(&dd->user_refcount))
-                       complete(&dd->user_comp);
-
-               return -ENOMEM;
-       }
-
+       if (!fd || init_srcu_struct(&fd->pq_srcu))
+               goto nomem;
+       spin_lock_init(&fd->pq_rcu_lock);
+       spin_lock_init(&fd->tid_lock);
+       spin_lock_init(&fd->invalid_lock);
+       fd->rec_cpu_num = -1; /* no cpu affinity by default */
+       fd->mm = current->mm;
+       mmgrab(fd->mm);
+       fd->dd = dd;
+       kobject_get(&fd->dd->kobj);
+       fp->private_data = fd;
         return 0;
+nomem:
+       kfree(fd);
+       fp->private_data = NULL;
+       if (atomic_dec_and_test(&dd->user_refcount))
+               complete(&dd->user_comp);
+       return -ENOMEM;
  }
  
  static long hfi1_file_ioctl(struct file *fp, unsigned int cmd,
@@ -301,21 +302,30 @@ static long hfi1_file_ioctl(struct file *fp, unsigned int cmd,
  static ssize_t hfi1_write_iter(struct kiocb *kiocb, struct iov_iter *from)
  {
         struct hfi1_filedata *fd = kiocb->ki_filp->private_data;
-       struct hfi1_user_sdma_pkt_q *pq = fd->pq;
+       struct hfi1_user_sdma_pkt_q *pq;
         struct hfi1_user_sdma_comp_q *cq = fd->cq;
         int done = 0, reqs = 0;
         unsigned long dim = from->nr_segs;
+       int idx;
  
-       if (!cq || !pq)
+       idx = srcu_read_lock(&fd->pq_srcu);
+       pq = srcu_dereference(fd->pq, &fd->pq_srcu);
+       if (!cq || !pq) {
+               srcu_read_unlock(&fd->pq_srcu, idx);
                 return -EIO;
+       }
  
-       if (!iter_is_iovec(from) || !dim)
+       if (!iter_is_iovec(from) || !dim) {
+               srcu_read_unlock(&fd->pq_srcu, idx);
                 return -EINVAL;
+       }
  
         trace_hfi1_sdma_request(fd->dd, fd->uctxt->ctxt, fd->subctxt, dim);
  
-       if (atomic_read(&pq->n_reqs) == pq->n_max_reqs)
+       if (atomic_read(&pq->n_reqs) == pq->n_max_reqs) {
+               srcu_read_unlock(&fd->pq_srcu, idx);
                 return -ENOSPC;
+       }
  
         while (dim) {
                 int ret;
@@ -333,6 +343,7 @@ static ssize_t hfi1_write_iter(struct kiocb *kiocb, struct iov_iter *from)
                 reqs++;
         }
  
+       srcu_read_unlock(&fd->pq_srcu, idx);
         return reqs;
  }
  
@@ -707,6 +718,7 @@ done:
         if (atomic_dec_and_test(&dd->user_refcount))
                 complete(&dd->user_comp);
  
+       cleanup_srcu_struct(&fdata->pq_srcu);
         kfree(fdata);
         return 0;
  }
diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
index 6365e8ffed9d4c0f47e3e813977fb379d1e3e982..cae12f416ca0e4e8ac9e29fb0da486cf0c225db6 100644
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -1444,10 +1444,13 @@ struct mmu_rb_handler;
  
  /* Private data for file operations */
  struct hfi1_filedata {
+       struct srcu_struct pq_srcu;
         struct hfi1_devdata *dd;
         struct hfi1_ctxtdata *uctxt;
         struct hfi1_user_sdma_comp_q *cq;
-       struct hfi1_user_sdma_pkt_q *pq;
+       /* update side lock for SRCU */
+       spinlock_t pq_rcu_lock;
+       struct hfi1_user_sdma_pkt_q __rcu *pq;
         u16 subctxt;
         /* for cpu affinity; -1 if none */
         int rec_cpu_num;
diff --git a/drivers/infiniband/hw/hfi1/user_exp_rcv.c b/drivers/infiniband/hw/hfi1/user_exp_rcv.c
index f05742ac0949712e2480948449bc2a3b74b3445c..4da03f82347492001c5e0fb49af5393087be9b4b 100644
--- a/drivers/infiniband/hw/hfi1/user_exp_rcv.c
+++ b/drivers/infiniband/hw/hfi1/user_exp_rcv.c
@@ -87,9 +87,6 @@ int hfi1_user_exp_rcv_init(struct hfi1_filedata *fd,
  {
         int ret = 0;
  
-       spin_lock_init(&fd->tid_lock);
-       spin_lock_init(&fd->invalid_lock);
-
         fd->entry_to_rb = kcalloc(uctxt->expected_count,
                                   sizeof(struct rb_node *),
                                   GFP_KERNEL);
@@ -142,10 +139,12 @@ void hfi1_user_exp_rcv_free(struct hfi1_filedata *fd)
  {
         struct hfi1_ctxtdata *uctxt = fd->uctxt;
  
+       mutex_lock(&uctxt->exp_mutex);
         if (!EXP_TID_SET_EMPTY(uctxt->tid_full_list))
                 unlock_exp_tids(uctxt, &uctxt->tid_full_list, fd);
         if (!EXP_TID_SET_EMPTY(uctxt->tid_used_list))
                 unlock_exp_tids(uctxt, &uctxt->tid_used_list, fd);
+       mutex_unlock(&uctxt->exp_mutex);
  
         kfree(fd->invalid_tids);
         fd->invalid_tids = NULL;
diff --git a/drivers/infiniband/hw/hfi1/user_sdma.c b/drivers/infiniband/hw/hfi1/user_sdma.c
index fd754a16475a540de927aa938445856ffc5314b9..c2f0d9ba93de17dc3e5bd7d78f45d5e8a1d36b68 100644
--- a/drivers/infiniband/hw/hfi1/user_sdma.c
+++ b/drivers/infiniband/hw/hfi1/user_sdma.c
@@ -179,7 +179,6 @@ int hfi1_user_sdma_alloc_queues(struct hfi1_ctxtdata *uctxt,
         pq = kzalloc(sizeof(*pq), GFP_KERNEL);
         if (!pq)
                 return -ENOMEM;
-
         pq->dd = dd;
         pq->ctxt = uctxt->ctxt;
         pq->subctxt = fd->subctxt;
@@ -236,7 +235,7 @@ int hfi1_user_sdma_alloc_queues(struct hfi1_ctxtdata *uctxt,
                 goto pq_mmu_fail;
         }
  
-       fd->pq = pq;
+       rcu_assign_pointer(fd->pq, pq);
         fd->cq = cq;
  
         return 0;
@@ -264,8 +263,14 @@ int hfi1_user_sdma_free_queues(struct hfi1_filedata *fd,
  
         trace_hfi1_sdma_user_free_queues(uctxt->dd, uctxt->ctxt, fd->subctxt);
  
-       pq = fd->pq;
+       spin_lock(&fd->pq_rcu_lock);
+       pq = srcu_dereference_check(fd->pq, &fd->pq_srcu,
+                                   lockdep_is_held(&fd->pq_rcu_lock));
         if (pq) {
+               rcu_assign_pointer(fd->pq, NULL);
+               spin_unlock(&fd->pq_rcu_lock);
+               synchronize_srcu(&fd->pq_srcu);
+               /* at this point there can be no more new requests */
                 if (pq->handler)
                         hfi1_mmu_rb_unregister(pq->handler);
                 iowait_sdma_drain(&pq->busy);
@@ -277,7 +282,8 @@ int hfi1_user_sdma_free_queues(struct hfi1_filedata *fd,
                 kfree(pq->req_in_use);
                 kmem_cache_destroy(pq->txreq_cache);
                 kfree(pq);
-               fd->pq = NULL;
+       } else {
+               spin_unlock(&fd->pq_rcu_lock);
         }
         if (fd->cq) {
                 vfree(fd->cq->comps);
@@ -321,7 +327,8 @@ int hfi1_user_sdma_process_request(struct hfi1_filedata *fd,
  {
         int ret = 0, i;
         struct hfi1_ctxtdata *uctxt = fd->uctxt;
-       struct hfi1_user_sdma_pkt_q *pq = fd->pq;
+       struct hfi1_user_sdma_pkt_q *pq =
+               srcu_dereference(fd->pq, &fd->pq_srcu);
         struct hfi1_user_sdma_comp_q *cq = fd->cq;
         struct hfi1_devdata *dd = pq->dd;
         unsigned long idx = 0;
diff --git a/drivers/infiniband/hw/mlx5/devx.c b/drivers/infiniband/hw/mlx5/devx.c
index d7efc9f6daf09d13499735a527822c1faedd92ca..46e1ab771f106065d7998eaf14cdba29faa851f6 100644
--- a/drivers/infiniband/hw/mlx5/devx.c
+++ b/drivers/infiniband/hw/mlx5/devx.c
@@ -2319,14 +2319,12 @@ static int deliver_event(struct devx_event_subscription *event_sub,
  
         if (ev_file->omit_data) {
                 spin_lock_irqsave(&ev_file->lock, flags);
-               if (!list_empty(&event_sub->event_list)) {
+               if (!list_empty(&event_sub->event_list) ||
+                   ev_file->is_destroyed) {
                         spin_unlock_irqrestore(&ev_file->lock, flags);
                         return 0;
                 }
  
-               /* is_destroyed is ignored here because we don't have any memory
-                * allocation to clean up for the omit_data case
-                */
                 list_add_tail(&event_sub->event_list, &ev_file->event_list);
                 spin_unlock_irqrestore(&ev_file->lock, flags);
                 wake_up_interruptible(&ev_file->poll_wait);
@@ -2473,11 +2471,11 @@ static ssize_t devx_async_cmd_event_read(struct file *filp, char __user *buf,
                         return -ERESTARTSYS;
                 }
  
-               if (list_empty(&ev_queue->event_list) &&
-                   ev_queue->is_destroyed)
-                       return -EIO;
-
                 spin_lock_irq(&ev_queue->lock);
+               if (ev_queue->is_destroyed) {
+                       spin_unlock_irq(&ev_queue->lock);
+                       return -EIO;
+               }
         }
  
         event = list_entry(ev_queue->event_list.next,
@@ -2551,10 +2549,6 @@ static ssize_t devx_async_event_read(struct file *filp, char __user *buf,
                 return -EOVERFLOW;
         }
  
-       if (ev_file->is_destroyed) {
-               spin_unlock_irq(&ev_file->lock);
-               return -EIO;
-       }
  
         while (list_empty(&ev_file->event_list)) {
                 spin_unlock_irq(&ev_file->lock);
@@ -2667,8 +2661,10 @@ static int devx_async_cmd_event_destroy_uobj(struct ib_uobject *uobj,
  
         spin_lock_irq(&comp_ev_file->ev_queue.lock);
         list_for_each_entry_safe(entry, tmp,
-                                &comp_ev_file->ev_queue.event_list, list)
+                                &comp_ev_file->ev_queue.event_list, list) {
+               list_del(&entry->list);
                 kvfree(entry);
+       }
         spin_unlock_irq(&comp_ev_file->ev_queue.lock);
         return 0;
  };
@@ -2680,11 +2676,29 @@ static int devx_async_event_destroy_uobj(struct ib_uobject *uobj,
                 container_of(uobj, struct devx_async_event_file,
                              uobj);
         struct devx_event_subscription *event_sub, *event_sub_tmp;
-       struct devx_async_event_data *entry, *tmp;
         struct mlx5_ib_dev *dev = ev_file->dev;
  
         spin_lock_irq(&ev_file->lock);
         ev_file->is_destroyed = 1;
+
+       /* free the pending events allocation */
+       if (ev_file->omit_data) {
+               struct devx_event_subscription *event_sub, *tmp;
+
+               list_for_each_entry_safe(event_sub, tmp, &ev_file->event_list,
+                                        event_list)
+                       list_del_init(&event_sub->event_list);
+
+       } else {
+               struct devx_async_event_data *entry, *tmp;
+
+               list_for_each_entry_safe(entry, tmp, &ev_file->event_list,
+                                        list) {
+                       list_del(&entry->list);
+                       kfree(entry);
+               }
+       }
+
         spin_unlock_irq(&ev_file->lock);
         wake_up_interruptible(&ev_file->poll_wait);
  
@@ -2699,15 +2713,6 @@ static int devx_async_event_destroy_uobj(struct ib_uobject *uobj,
         }
         mutex_unlock(&dev->devx_event_table.event_xa_lock);
  
-       /* free the pending events allocation */
-       if (!ev_file->omit_data) {
-               spin_lock_irq(&ev_file->lock);
-               list_for_each_entry_safe(entry, tmp,
-                                        &ev_file->event_list, list)
-                       kfree(entry); /* read can't come any more */
-               spin_unlock_irq(&ev_file->lock);
-       }
-
         put_device(&dev->ib_dev.dev);
         return 0;
  };
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index e874d688d040d69eac94c1d680d3f8eb81d422f1..e4bcfa81b70a3eedb003dceeb9634b0fbbf2bd6a 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -2283,8 +2283,8 @@ static int mlx5_ib_mmap_offset(struct mlx5_ib_dev *dev,
  
  static u64 mlx5_entry_to_mmap_offset(struct mlx5_user_mmap_entry *entry)
  {
-       u16 cmd = entry->rdma_entry.start_pgoff >> 16;
-       u16 index = entry->rdma_entry.start_pgoff & 0xFFFF;
+       u64 cmd = (entry->rdma_entry.start_pgoff >> 16) & 0xFFFF;
+       u64 index = entry->rdma_entry.start_pgoff & 0xFFFF;
  
         return (((index >> 8) << 16) | (cmd << MLX5_IB_MMAP_CMD_SHIFT) |
                 (index & 0xFF)) << PAGE_SHIFT;
@@ -6545,7 +6545,7 @@ static int mlx5_ib_init_var_table(struct mlx5_ib_dev *dev)
                                         doorbell_bar_offset);
         bar_size = (1ULL << log_doorbell_bar_size) * 4096;
         var_table->stride_size = 1ULL << log_doorbell_stride;
-       var_table->num_var_hw_entries = bar_size / var_table->stride_size;
+       var_table->num_var_hw_entries = div64_u64(bar_size, var_table->stride_size);
         mutex_init(&var_table->bitmap_lock);
         var_table->bitmap = bitmap_zalloc(var_table->num_var_hw_entries,
                                           GFP_KERNEL);
diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index a4f8e7030787184761ad57fa26452dbb69ee4b3f..957f3a52589bef9026fadcee8c5601fc6aa0c2d8 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -3441,9 +3441,6 @@ static int __mlx5_ib_qp_set_counter(struct ib_qp *qp,
         struct mlx5_ib_qp_base *base;
         u32 set_id;
  
-       if (!MLX5_CAP_GEN(dev->mdev, rts2rts_qp_counters_set_id))
-               return 0;
-
         if (counter)
                 set_id = counter->id;
         else
@@ -6576,6 +6573,7 @@ void mlx5_ib_drain_rq(struct ib_qp *qp)
   */
  int mlx5_ib_qp_set_counter(struct ib_qp *qp, struct rdma_counter *counter)
  {
+       struct mlx5_ib_dev *dev = to_mdev(qp->device);
         struct mlx5_ib_qp *mqp = to_mqp(qp);
         int err = 0;
  
@@ -6585,6 +6583,11 @@ int mlx5_ib_qp_set_counter(struct ib_qp *qp, struct rdma_counter *counter)
                 goto out;
         }
  
+       if (!MLX5_CAP_GEN(dev->mdev, rts2rts_qp_counters_set_id)) {
+               err = -EOPNOTSUPP;
+               goto out;
+       }
+
         if (mqp->state == IB_QPS_RTS) {
                 err = __mlx5_ib_qp_set_counter(qp, counter);
                 if (!err)
diff --git a/drivers/infiniband/sw/rdmavt/qp.c b/drivers/infiniband/sw/rdmavt/qp.c
index 3cdf75d0c7a4cf9a0a965087794dff4485373046..7858d499db03390cb73069b0e795b58478c12004 100644
--- a/drivers/infiniband/sw/rdmavt/qp.c
+++ b/drivers/infiniband/sw/rdmavt/qp.c
@@ -61,6 +61,8 @@
  #define RVT_RWQ_COUNT_THRESHOLD 16
  
  static void rvt_rc_timeout(struct timer_list *t);
+static void rvt_reset_qp(struct rvt_dev_info *rdi, struct rvt_qp *qp,
+                        enum ib_qp_type type);
  
  /*
   * Convert the AETH RNR timeout code into the number of microseconds.
@@ -452,40 +454,41 @@ no_qp_table:
  }
  
  /**
- * free_all_qps - check for QPs still in use
+ * rvt_free_qp_cb - callback function to reset a qp
+ * @qp: the qp to reset
+ * @v: a 64-bit value
+ *
+ * This function resets the qp and removes it from the
+ * qp hash table.
+ */
+static void rvt_free_qp_cb(struct rvt_qp *qp, u64 v)
+{
+       unsigned int *qp_inuse = (unsigned int *)v;
+       struct rvt_dev_info *rdi = ib_to_rvt(qp->ibqp.device);
+
+       /* Reset the qp and remove it from the qp hash list */
+       rvt_reset_qp(rdi, qp, qp->ibqp.qp_type);
+
+       /* Increment the qp_inuse count */
+       (*qp_inuse)++;
+}
+
+/**
+ * rvt_free_all_qps - check for QPs still in use
   * @rdi: rvt device info structure
   *
   * There should not be any QPs still in use.
   * Free memory for table.
+ * Return the number of QPs still in use.
   */
  static unsigned rvt_free_all_qps(struct rvt_dev_info *rdi)
  {
-       unsigned long flags;
-       struct rvt_qp *qp;
-       unsigned n, qp_inuse = 0;
-       spinlock_t *ql; /* work around too long line below */
-
-       if (rdi->driver_f.free_all_qps)
-               qp_inuse = rdi->driver_f.free_all_qps(rdi);
+       unsigned int qp_inuse = 0;
  
         qp_inuse += rvt_mcast_tree_empty(rdi);
  
-       if (!rdi->qp_dev)
-               return qp_inuse;
-
-       ql = &rdi->qp_dev->qpt_lock;
-       spin_lock_irqsave(ql, flags);
-       for (n = 0; n < rdi->qp_dev->qp_table_size; n++) {
-               qp = rcu_dereference_protected(rdi->qp_dev->qp_table[n],
-                                              lockdep_is_held(ql));
-               RCU_INIT_POINTER(rdi->qp_dev->qp_table[n], NULL);
+       rvt_qp_iter(rdi, (u64)&qp_inuse, rvt_free_qp_cb);
  
-               for (; qp; qp = rcu_dereference_protected(qp->next,
-                                                         lockdep_is_held(ql)))
-                       qp_inuse++;
-       }
-       spin_unlock_irqrestore(ql, flags);
-       synchronize_rcu();
         return qp_inuse;
  }
  
@@ -902,14 +905,14 @@ static void rvt_init_qp(struct rvt_dev_info *rdi, struct rvt_qp *qp,
  }
  
  /**
- * rvt_reset_qp - initialize the QP state to the reset state
+ * _rvt_reset_qp - initialize the QP state to the reset state
   * @qp: the QP to reset
   * @type: the QP type
   *
   * r_lock, s_hlock, and s_lock are required to be held by the caller
   */
-static void rvt_reset_qp(struct rvt_dev_info *rdi, struct rvt_qp *qp,
-                        enum ib_qp_type type)
+static void _rvt_reset_qp(struct rvt_dev_info *rdi, struct rvt_qp *qp,
+                         enum ib_qp_type type)
         __must_hold(&qp->s_lock)
         __must_hold(&qp->s_hlock)
         __must_hold(&qp->r_lock)
@@ -955,6 +958,27 @@ static void rvt_reset_qp(struct rvt_dev_info *rdi, struct rvt_qp *qp,
         lockdep_assert_held(&qp->s_lock);
  }
  
+/**
+ * rvt_reset_qp - initialize the QP state to the reset state
+ * @rdi: the device info
+ * @qp: the QP to reset
+ * @type: the QP type
+ *
+ * This is the wrapper function to acquire the r_lock, s_hlock, and s_lock
+ * before calling _rvt_reset_qp().
+ */
+static void rvt_reset_qp(struct rvt_dev_info *rdi, struct rvt_qp *qp,
+                        enum ib_qp_type type)
+{
+       spin_lock_irq(&qp->r_lock);
+       spin_lock(&qp->s_hlock);
+       spin_lock(&qp->s_lock);
+       _rvt_reset_qp(rdi, qp, type);
+       spin_unlock(&qp->s_lock);
+       spin_unlock(&qp->s_hlock);
+       spin_unlock_irq(&qp->r_lock);
+}
+
  /** rvt_free_qpn - Free a qpn from the bit map
   * @qpt: QP table
   * @qpn: queue pair number to free
@@ -1546,7 +1570,7 @@ int rvt_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
         switch (new_state) {
         case IB_QPS_RESET:
                 if (qp->state != IB_QPS_RESET)
-                       rvt_reset_qp(rdi, qp, ibqp->qp_type);
+                       _rvt_reset_qp(rdi, qp, ibqp->qp_type);
                 break;
  
         case IB_QPS_RTR:
@@ -1695,13 +1719,7 @@ int rvt_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata)
         struct rvt_qp *qp = ibqp_to_rvtqp(ibqp);
         struct rvt_dev_info *rdi = ib_to_rvt(ibqp->device);
  
-       spin_lock_irq(&qp->r_lock);
-       spin_lock(&qp->s_hlock);
-       spin_lock(&qp->s_lock);
         rvt_reset_qp(rdi, qp, ibqp->qp_type);
-       spin_unlock(&qp->s_lock);
-       spin_unlock(&qp->s_hlock);
-       spin_unlock_irq(&qp->r_lock);
  
         wait_event(qp->wait, !atomic_read(&qp->refcount));
         /* qpn is now available for use again */
diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c b/drivers/infiniband/sw/rxe/rxe_comp.c
index 116cafc9afcf601a2a040831668947be785c49ed..4bc88708b355885b3f429e75af0dc3cb1975cd90 100644
--- a/drivers/infiniband/sw/rxe/rxe_comp.c
+++ b/drivers/infiniband/sw/rxe/rxe_comp.c
@@ -329,7 +329,7 @@ static inline enum comp_state check_ack(struct rxe_qp *qp,
                                         qp->comp.psn = pkt->psn;
                                         if (qp->req.wait_psn) {
                                                 qp->req.wait_psn = 0;
-                                               rxe_run_task(&qp->req.task, 1);
+                                               rxe_run_task(&qp->req.task, 0);
                                         }
                                 }
                                 return COMPST_ERROR_RETRY;
@@ -463,7 +463,7 @@ static void do_complete(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
          */
         if (qp->req.wait_fence) {
                 qp->req.wait_fence = 0;
-               rxe_run_task(&qp->req.task, 1);
+               rxe_run_task(&qp->req.task, 0);
         }
  }
  
@@ -479,7 +479,7 @@ static inline enum comp_state complete_ack(struct rxe_qp *qp,
                 if (qp->req.need_rd_atomic) {
                         qp->comp.timeout_retry = 0;
                         qp->req.need_rd_atomic = 0;
-                       rxe_run_task(&qp->req.task, 1);
+                       rxe_run_task(&qp->req.task, 0);
                 }
         }
  
@@ -725,7 +725,7 @@ int rxe_completer(void *arg)
                                                         RXE_CNT_COMP_RETRY);
                                         qp->req.need_retry = 1;
                                         qp->comp.started_retry = 1;
-                                       rxe_run_task(&qp->req.task, 1);
+                                       rxe_run_task(&qp->req.task, 0);
                                 }
  
                                 if (pkt) {
diff --git a/drivers/infiniband/sw/siw/siw_cm.c b/drivers/infiniband/sw/siw/siw_cm.c
index 0c3f0588346efa317e152c9496796fe999b552db..c5651a96b196480a4537cb81f46bf88a0664ddd0 100644
--- a/drivers/infiniband/sw/siw/siw_cm.c
+++ b/drivers/infiniband/sw/siw/siw_cm.c
@@ -1225,10 +1225,9 @@ static void siw_cm_llp_data_ready(struct sock *sk)
         read_lock(&sk->sk_callback_lock);
  
         cep = sk_to_cep(sk);
-       if (!cep) {
-               WARN_ON(1);
+       if (!cep)
                 goto out;
-       }
+
         siw_dbg_cep(cep, "state: %d\n", cep->state);
  
         switch (cep->state) {
diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index b273e421e9103f9a4aa7652bda41f6c08a9dc7b6..a1a035270cabf0b0dac14542b5f879514c7ce2e1 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -2575,6 +2575,17 @@ isert_wait4logout(struct isert_conn *isert_conn)
         }
  }
  
+static void
+isert_wait4cmds(struct iscsi_conn *conn)
+{
+       isert_info("iscsi_conn %p\n", conn);
+
+       if (conn->sess) {
+               target_sess_cmd_list_set_waiting(conn->sess->se_sess);
+               target_wait_for_sess_cmds(conn->sess->se_sess);
+       }
+}
+
  /**
   * isert_put_unsol_pending_cmds() - Drop commands waiting for
   *     unsolicitate dataout
@@ -2622,6 +2633,7 @@ static void isert_wait_conn(struct iscsi_conn *conn)
  
         ib_drain_qp(isert_conn->qp);
         isert_put_unsol_pending_cmds(conn);
+       isert_wait4cmds(conn);
         isert_wait4logout(isert_conn);
  
         queue_work(isert_release_wq, &isert_conn->release_work);
diff --git a/drivers/input/keyboard/goldfish_events.c b/drivers/input/keyboard/goldfish_events.c
index bc8c85a52a10ce82f8be1695886af013013203ab..57d435fc5c73da8bfbf0db2773b8b0fe6a54fc3c 100644
--- a/drivers/input/keyboard/goldfish_events.c
+++ b/drivers/input/keyboard/goldfish_events.c
@@ -30,7 +30,7 @@ struct event_dev {
         struct input_dev *input;
         int irq;
         void __iomem *addr;
-       char name[0];
+       char name[];
  };
  
  static irqreturn_t events_interrupt(int irq, void *dev_id)
diff --git a/drivers/input/keyboard/gpio_keys.c b/drivers/input/keyboard/gpio_keys.c
index 1f56d53454b22c0a6590b28cb89d47ab48d38a60..53c9ff338dea405076283bc6238109a76c0e618e 100644
--- a/drivers/input/keyboard/gpio_keys.c
+++ b/drivers/input/keyboard/gpio_keys.c
@@ -55,7 +55,7 @@ struct gpio_keys_drvdata {
         struct input_dev *input;
         struct mutex disable_lock;
         unsigned short *keymap;
-       struct gpio_button_data data[0];
+       struct gpio_button_data data[];
  };
  
  /*
diff --git a/drivers/input/keyboard/gpio_keys_polled.c b/drivers/input/keyboard/gpio_keys_polled.c
index 6eb0a2f3f9de7954c33a89477537fd5b7281ce85..c3937d2fc7446e8b68e517cb4e5491809407fb06 100644
--- a/drivers/input/keyboard/gpio_keys_polled.c
+++ b/drivers/input/keyboard/gpio_keys_polled.c
@@ -38,7 +38,7 @@ struct gpio_keys_polled_dev {
         const struct gpio_keys_platform_data *pdata;
         unsigned long rel_axis_seen[BITS_TO_LONGS(REL_CNT)];
         unsigned long abs_axis_seen[BITS_TO_LONGS(ABS_CNT)];
-       struct gpio_keys_button_data data[0];
+       struct gpio_keys_button_data data[];
  };
  
  static void gpio_keys_button_event(struct input_dev *input,
diff --git a/drivers/input/keyboard/tca6416-keypad.c b/drivers/input/keyboard/tca6416-keypad.c
index 2a14769de63705e4a01f36716061fe8eb320e144..21758767ccf063613a755f8f430f0f62c01acc88 100644
--- a/drivers/input/keyboard/tca6416-keypad.c
+++ b/drivers/input/keyboard/tca6416-keypad.c
@@ -33,7 +33,7 @@ MODULE_DEVICE_TABLE(i2c, tca6416_id);
  
  struct tca6416_drv_data {
         struct input_dev *input;
-       struct tca6416_button data[0];
+       struct tca6416_button data[];
  };
  
  struct tca6416_keypad_chip {
@@ -48,7 +48,7 @@ struct tca6416_keypad_chip {
         int irqnum;
         u16 pinmask;
         bool use_polling;
-       struct tca6416_button buttons[0];
+       struct tca6416_button buttons[];
  };
  
  static int tca6416_write_reg(struct tca6416_keypad_chip *chip, int reg, u16 val)
diff --git a/drivers/input/mouse/cyapa_gen5.c b/drivers/input/mouse/cyapa_gen5.c
index 14239fbd72cf2032e193558dca87833ce4df4afd..7f012bfa26583d5133c59b76b788b62b30f69172 100644
--- a/drivers/input/mouse/cyapa_gen5.c
+++ b/drivers/input/mouse/cyapa_gen5.c
@@ -250,7 +250,7 @@ struct cyapa_tsg_bin_image_data_record {
  
  struct cyapa_tsg_bin_image {
         struct cyapa_tsg_bin_image_head image_head;
-       struct cyapa_tsg_bin_image_data_record records[0];
+       struct cyapa_tsg_bin_image_data_record records[];
  } __packed;
  
  struct pip_bl_packet_start {
@@ -271,7 +271,7 @@ struct pip_bl_cmd_head {
         u8 report_id;  /* Bootloader output report id, must be 40h */
         u8 rsvd;  /* Reserved, must be 0 */
         struct pip_bl_packet_start packet_start;
-       u8 data[0];  /* Command data variable based on commands */
+       u8 data[];  /* Command data variable based on commands */
  } __packed;
  
  /* Initiate bootload command data structure. */
@@ -300,7 +300,7 @@ struct tsg_bl_metadata_row_params {
  struct tsg_bl_flash_row_head {
         u8 flash_array_id;
         __le16 flash_row_id;
-       u8 flash_data[0];
+       u8 flash_data[];
  } __packed;
  
  struct pip_app_cmd_head {
@@ -314,7 +314,7 @@ struct pip_app_cmd_head {
          * Bit 6-0: command code.
          */
         u8 cmd_code;
-       u8 parameter_data[0];  /* Parameter data variable based on cmd_code */
+       u8 parameter_data[];  /* Parameter data variable based on cmd_code */
  } __packed;
  
  /* Application get/set parameter command data structure */
diff --git a/drivers/input/mouse/psmouse-smbus.c b/drivers/input/mouse/psmouse-smbus.c
index 027efdd2b2adfbcffce91f071744ddd990853eba..a472489ccbad6895524ecf5b701f47c819f9e20e 100644
--- a/drivers/input/mouse/psmouse-smbus.c
+++ b/drivers/input/mouse/psmouse-smbus.c
@@ -190,6 +190,7 @@ static int psmouse_smbus_create_companion(struct device *dev, void *data)
         struct psmouse_smbus_dev *smbdev = data;
         unsigned short addr_list[] = { smbdev->board.addr, I2C_CLIENT_END };
         struct i2c_adapter *adapter;
+       struct i2c_client *client;
  
         adapter = i2c_verify_adapter(dev);
         if (!adapter)
@@ -198,12 +199,13 @@ static int psmouse_smbus_create_companion(struct device *dev, void *data)
         if (!i2c_check_functionality(adapter, I2C_FUNC_SMBUS_HOST_NOTIFY))
                 return 0;
  
-       smbdev->client = i2c_new_probed_device(adapter, &smbdev->board,
-                                              addr_list, NULL);
-       if (!smbdev->client)
+       client = i2c_new_scanned_device(adapter, &smbdev->board,
+                                       addr_list, NULL);
+       if (IS_ERR(client))
                 return 0;
  
         /* We have our(?) device, stop iterating i2c bus. */
+       smbdev->client = client;
         return 1;
  }
  
diff --git a/drivers/input/mouse/synaptics.c b/drivers/input/mouse/synaptics.c
index 1ae6f8bba9ae15acc09f175ffb682a38a1ba19ba..2c666fb34625ad0df43a4b166e3e1b44e0fbe5d7 100644
--- a/drivers/input/mouse/synaptics.c
+++ b/drivers/input/mouse/synaptics.c
@@ -146,7 +146,6 @@ static const char * const topbuttonpad_pnp_ids[] = {
         "LEN0042", /* Yoga */
         "LEN0045",
         "LEN0047",
-       "LEN0049",
         "LEN2000", /* S540 */
         "LEN2001", /* Edge E431 */
         "LEN2002", /* Edge E531 */
@@ -166,9 +165,11 @@ static const char * const smbus_pnp_ids[] = {
         /* all of the topbuttonpad_pnp_ids are valid, we just add some extras */
         "LEN0048", /* X1 Carbon 3 */
         "LEN0046", /* X250 */
+       "LEN0049", /* Yoga 11e */
         "LEN004a", /* W541 */
         "LEN005b", /* P50 */
         "LEN005e", /* T560 */
+       "LEN006c", /* T470s */
         "LEN0071", /* T480 */
         "LEN0072", /* X1 Carbon Gen 5 (2017) - Elan/ALPS trackpoint */
         "LEN0073", /* X1 Carbon G5 (Elantech) */
@@ -179,6 +180,7 @@ static const char * const smbus_pnp_ids[] = {
         "LEN0097", /* X280 -> ALPS trackpoint */
         "LEN009b", /* T580 */
         "LEN200f", /* T450s */
+       "LEN2044", /* L470  */
         "LEN2054", /* E480 */
         "LEN2055", /* E580 */
         "SYN3052", /* HP EliteBook 840 G4 */
diff --git a/drivers/input/touchscreen/ili210x.c b/drivers/input/touchscreen/ili210x.c
index 4a17096e83e11679a26579148daba4cc17330d6b..199cf3daec10661d13e248cec654b3fddd3b4ae6 100644
--- a/drivers/input/touchscreen/ili210x.c
+++ b/drivers/input/touchscreen/ili210x.c
@@ -167,6 +167,36 @@ static const struct ili2xxx_chip ili211x_chip = {
         .resolution             = 2048,
  };
  
+static bool ili212x_touchdata_to_coords(const u8 *touchdata,
+                                       unsigned int finger,
+                                       unsigned int *x, unsigned int *y)
+{
+       u16 val;
+
+       val = get_unaligned_be16(touchdata + 3 + (finger * 5) + 0);
+       if (!(val & BIT(15)))   /* Touch indication */
+               return false;
+
+       *x = val & 0x3fff;
+       *y = get_unaligned_be16(touchdata + 3 + (finger * 5) + 2);
+
+       return true;
+}
+
+static bool ili212x_check_continue_polling(const u8 *data, bool touch)
+{
+       return touch;
+}
+
+static const struct ili2xxx_chip ili212x_chip = {
+       .read_reg               = ili210x_read_reg,
+       .get_touch_data         = ili210x_read_touch_data,
+       .parse_touch_data       = ili212x_touchdata_to_coords,
+       .continue_polling       = ili212x_check_continue_polling,
+       .max_touches            = 10,
+       .has_calibrate_reg      = true,
+};
+
  static int ili251x_read_reg(struct i2c_client *client,
                             u8 reg, void *buf, size_t len)
  {
@@ -321,7 +351,7 @@ static umode_t ili210x_calibrate_visible(struct kobject *kobj,
         struct i2c_client *client = to_i2c_client(dev);
         struct ili210x *priv = i2c_get_clientdata(client);
  
-       return priv->chip->has_calibrate_reg;
+       return priv->chip->has_calibrate_reg ? attr->mode : 0;
  }
  
  static const struct attribute_group ili210x_attr_group = {
@@ -447,6 +477,7 @@ static int ili210x_i2c_probe(struct i2c_client *client,
  static const struct i2c_device_id ili210x_i2c_id[] = {
         { "ili210x", (long)&ili210x_chip },
         { "ili2117", (long)&ili211x_chip },
+       { "ili2120", (long)&ili212x_chip },
         { "ili251x", (long)&ili251x_chip },
         { }
  };
@@ -455,6 +486,7 @@ MODULE_DEVICE_TABLE(i2c, ili210x_i2c_id);
  static const struct of_device_id ili210x_dt_ids[] = {
         { .compatible = "ilitek,ili210x", .data = &ili210x_chip },
         { .compatible = "ilitek,ili2117", .data = &ili211x_chip },
+       { .compatible = "ilitek,ili2120", .data = &ili212x_chip },
         { .compatible = "ilitek,ili251x", .data = &ili251x_chip },
         { }
  };
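Reviewer note: the ili2120 parsing added above reads a 5-byte record per finger, starting 3 bytes into the report. The following is a minimal user-space sketch of that decoding, assuming the layout implied by the hunk (big-endian 16-bit words, bit 15 as the touch flag, 14-bit X). The names `read_be16` and `demo_touchdata_to_coords` are hypothetical and stand in for the kernel's `get_unaligned_be16()` and `ili212x_touchdata_to_coords()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Read a big-endian 16-bit value from a possibly unaligned buffer. */
static uint16_t read_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Decode one finger record, mirroring the logic in the hunk above:
 * records are 5 bytes each and start at offset 3 in the report.
 * Returns false when the touch flag (bit 15 of the first word) is clear.
 */
static bool demo_touchdata_to_coords(const uint8_t *touchdata,
				     unsigned int finger,
				     unsigned int *x, unsigned int *y)
{
	const uint8_t *rec = touchdata + 3 + finger * 5;
	uint16_t val = read_be16(rec);

	if (!(val & 0x8000u))		/* touch indication */
		return false;

	*x = val & 0x3fffu;		/* low 14 bits carry X */
	*y = read_be16(rec + 2);	/* next word carries Y */
	return true;
}

/* Self-check with a hand-built report: only finger 1 is touching. */
static void demo_touch_self_check(void)
{
	uint8_t report[3 + 10 * 5] = { 0 };
	unsigned int x = 0, y = 0;

	report[8] = 0x81;  report[9] = 0x23;	/* touch flag + X = 0x123 */
	report[10] = 0x04; report[11] = 0x56;	/* Y = 0x456 */

	assert(!demo_touchdata_to_coords(report, 0, &x, &y));
	assert(demo_touchdata_to_coords(report, 1, &x, &y));
	assert(x == 0x123 && y == 0x456);
}
```

The buffer size of 3 + 10 * 5 bytes follows from `.max_touches = 10` in the new chip description; the real report length used by the driver is not shown in this hunk.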
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 2104fb8afc0660c2516df9dd5a5d4b6c31d21b8b..9f33fdb3bb0516feb086741a2b4ac89a9e6331dc 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -14,8 +14,8 @@ obj-$(CONFIG_MSM_IOMMU) += msm_iommu.o
  obj-$(CONFIG_AMD_IOMMU) += amd_iommu.o amd_iommu_init.o amd_iommu_quirks.o
  obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += amd_iommu_debugfs.o
  obj-$(CONFIG_AMD_IOMMU_V2) += amd_iommu_v2.o
-obj-$(CONFIG_ARM_SMMU) += arm-smmu-mod.o
-arm-smmu-mod-objs += arm-smmu.o arm-smmu-impl.o arm-smmu-qcom.o
+obj-$(CONFIG_ARM_SMMU) += arm_smmu.o
+arm_smmu-objs += arm-smmu.o arm-smmu-impl.o arm-smmu-qcom.o
  obj-$(CONFIG_ARM_SMMU_V3) += arm-smmu-v3.o
  obj-$(CONFIG_DMAR_TABLE) += dmar.o
  obj-$(CONFIG_INTEL_IOMMU) += intel-iommu.o intel-pasid.o
diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 2759a8d57b7f91be441012c23c01bb6d6b408907..6be3853a5d978e09b8aeebb02daaec01e3a80cc9 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -2523,6 +2523,7 @@ static int __init early_amd_iommu_init(void)
         struct acpi_table_header *ivrs_base;
         acpi_status status;
         int i, remap_cache_sz, ret = 0;
+       u32 pci_id;
  
         if (!amd_iommu_detected)
                 return -ENODEV;
@@ -2610,6 +2611,16 @@ static int __init early_amd_iommu_init(void)
         if (ret)
                 goto out;
  
+       /* Disable IOMMU if there's Stoney Ridge graphics */
+       for (i = 0; i < 32; i++) {
+               pci_id = read_pci_config(0, i, 0, 0);
+               if ((pci_id & 0xffff) == 0x1002 && (pci_id >> 16) == 0x98e4) {
+                       pr_info("Disable IOMMU on Stoney Ridge\n");
+                       amd_iommu_disabled = true;
+                       break;
+               }
+       }
+
         /* Disable any previously enabled IOMMUs */
         if (!is_kdump_kernel() || amd_iommu_disabled)
                 disable_iommus();
@@ -2718,7 +2729,7 @@ static int __init state_next(void)
                 ret = early_amd_iommu_init();
                 init_state = ret ? IOMMU_INIT_ERROR : IOMMU_ACPI_FINISHED;
                 if (init_state == IOMMU_ACPI_FINISHED && amd_iommu_disabled) {
-                       pr_info("AMD IOMMU disabled on kernel command-line\n");
+                       pr_info("AMD IOMMU disabled\n");
                         init_state = IOMMU_CMDLINE_DISABLED;
                         ret = -EINVAL;
                 }
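Reviewer note: the Stoney Ridge check above relies on the first PCI configuration dword holding the vendor ID in the low 16 bits and the device ID in the high 16 bits. A minimal sketch of that decoding, assuming only the two constants that appear in the hunk; the helper name `demo_is_stoney_ridge_gfx` is hypothetical and the real code reads the dword with `read_pci_config()`:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_VENDOR_ATI		0x1002u	/* vendor ID tested in the hunk */
#define DEMO_DEVICE_STONEY_GFX	0x98e4u	/* device ID tested in the hunk */

/*
 * Split the first config-space dword into vendor (low half) and
 * device (high half) and compare both, as the new loop does.
 */
static bool demo_is_stoney_ridge_gfx(uint32_t pci_id)
{
	uint16_t vendor = (uint16_t)(pci_id & 0xffffu);
	uint16_t device = (uint16_t)(pci_id >> 16);

	return vendor == DEMO_VENDOR_ATI && device == DEMO_DEVICE_STONEY_GFX;
}

/* Self-check: only the exact vendor/device pair matches. */
static void demo_pci_self_check(void)
{
	assert(demo_is_stoney_ridge_gfx(0x98e41002u));
	assert(!demo_is_stoney_ridge_gfx(0x98e01002u));	/* other device */
	assert(!demo_is_stoney_ridge_gfx(0x98e48086u));	/* other vendor */
}
```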
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 9dc37672bf893bb70808a9fdef0a850eba685f01..6fa6de2b6ad586d933945f74de0bc79ef8691664 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -762,6 +762,11 @@ static int iommu_dummy(struct device *dev)
         return dev->archdata.iommu == DUMMY_DEVICE_DOMAIN_INFO;
  }
  
+static bool attach_deferred(struct device *dev)
+{
+       return dev->archdata.iommu == DEFER_DEVICE_DOMAIN_INFO;
+}
+
  /**
   * is_downstream_to_pci_bridge - test if a device belongs to the PCI
   *                              sub-hierarchy of a candidate PCI-PCI bridge
@@ -2510,8 +2515,7 @@ struct dmar_domain *find_domain(struct device *dev)
  {
         struct device_domain_info *info;
  
-       if (unlikely(dev->archdata.iommu == DEFER_DEVICE_DOMAIN_INFO ||
-                    dev->archdata.iommu == DUMMY_DEVICE_DOMAIN_INFO))
+       if (unlikely(attach_deferred(dev) || iommu_dummy(dev)))
                 return NULL;
  
         if (dev_is_pci(dev))
@@ -2525,18 +2529,14 @@ struct dmar_domain *find_domain(struct device *dev)
         return NULL;
  }
  
-static struct dmar_domain *deferred_attach_domain(struct device *dev)
+static void do_deferred_attach(struct device *dev)
  {
-       if (unlikely(dev->archdata.iommu == DEFER_DEVICE_DOMAIN_INFO)) {
-               struct iommu_domain *domain;
-
-               dev->archdata.iommu = NULL;
-               domain = iommu_get_domain_for_dev(dev);
-               if (domain)
-                       intel_iommu_attach_device(domain, dev);
-       }
+       struct iommu_domain *domain;
  
-       return find_domain(dev);
+       dev->archdata.iommu = NULL;
+       domain = iommu_get_domain_for_dev(dev);
+       if (domain)
+               intel_iommu_attach_device(domain, dev);
  }
  
  static inline struct device_domain_info *
@@ -2916,7 +2916,7 @@ static int identity_mapping(struct device *dev)
         struct device_domain_info *info;
  
         info = dev->archdata.iommu;
-       if (info && info != DUMMY_DEVICE_DOMAIN_INFO && info != DEFER_DEVICE_DOMAIN_INFO)
+       if (info)
                 return (info->domain == si_domain);
  
         return 0;
@@ -3587,6 +3587,9 @@ static bool iommu_need_mapping(struct device *dev)
         if (iommu_dummy(dev))
                 return false;
  
+       if (unlikely(attach_deferred(dev)))
+               do_deferred_attach(dev);
+
         ret = identity_mapping(dev);
         if (ret) {
                 u64 dma_mask = *dev->dma_mask;
@@ -3635,7 +3638,7 @@ static dma_addr_t __intel_map_single(struct device *dev, phys_addr_t paddr,
  
         BUG_ON(dir == DMA_NONE);
  
-       domain = deferred_attach_domain(dev);
+       domain = find_domain(dev);
         if (!domain)
                 return DMA_MAPPING_ERROR;
  
@@ -3855,7 +3858,7 @@ static int intel_map_sg(struct device *dev, struct scatterlist *sglist, int nele
         if (!iommu_need_mapping(dev))
                 return dma_direct_map_sg(dev, sglist, nelems, dir, attrs);
  
-       domain = deferred_attach_domain(dev);
+       domain = find_domain(dev);
         if (!domain)
                 return 0;
  
@@ -3950,7 +3953,11 @@ bounce_map_single(struct device *dev, phys_addr_t paddr, size_t size,
         int prot = 0;
         int ret;
  
-       domain = deferred_attach_domain(dev);
+       if (unlikely(attach_deferred(dev)))
+               do_deferred_attach(dev);
+
+       domain = find_domain(dev);
+
         if (WARN_ON(dir == DMA_NONE || !domain))
                 return DMA_MAPPING_ERROR;
  
@@ -6133,7 +6140,7 @@ intel_iommu_aux_get_pasid(struct iommu_domain *domain, struct device *dev)
  static bool intel_iommu_is_attach_deferred(struct iommu_domain *domain,
                                            struct device *dev)
  {
-       return dev->archdata.iommu == DEFER_DEVICE_DOMAIN_INFO;
+       return attach_deferred(dev);
  }
  
  static int
diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
index 39759db4f0038c0ec7dd49928b75e6d89e5cae86..4328da0b0a9fdfa35b1d7f951b72e3599d195a31 100644
--- a/drivers/iommu/qcom_iommu.c
+++ b/drivers/iommu/qcom_iommu.c
@@ -344,21 +344,19 @@ static void qcom_iommu_domain_free(struct iommu_domain *domain)
  {
         struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
  
-       if (WARN_ON(qcom_domain->iommu))    /* forgot to detach? */
-               return;
-
         iommu_put_dma_cookie(domain);
  
-       /* NOTE: unmap can be called after client device is powered off,
-        * for example, with GPUs or anything involving dma-buf.  So we
-        * cannot rely on the device_link.  Make sure the IOMMU is on to
-        * avoid unclocked accesses in the TLB inv path:
-        */
-       pm_runtime_get_sync(qcom_domain->iommu->dev);
-
-       free_io_pgtable_ops(qcom_domain->pgtbl_ops);
-
-       pm_runtime_put_sync(qcom_domain->iommu->dev);
+       if (qcom_domain->iommu) {
+               /*
+                * NOTE: unmap can be called after client device is powered
+                * off, for example, with GPUs or anything involving dma-buf.
+                * So we cannot rely on the device_link.  Make sure the IOMMU
+                * is on to avoid unclocked accesses in the TLB inv path:
+                */
+               pm_runtime_get_sync(qcom_domain->iommu->dev);
+               free_io_pgtable_ops(qcom_domain->pgtbl_ops);
+               pm_runtime_put_sync(qcom_domain->iommu->dev);
+       }
  
         kfree(qcom_domain);
  }
@@ -404,7 +402,7 @@ static void qcom_iommu_detach_dev(struct iommu_domain *domain, struct device *de
         struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
         unsigned i;
  
-       if (!qcom_domain->iommu)
+       if (WARN_ON(!qcom_domain->iommu))
                 return;
  
         pm_runtime_get_sync(qcom_iommu->dev);
@@ -417,8 +415,6 @@ static void qcom_iommu_detach_dev(struct iommu_domain *domain, struct device *de
                 ctx->domain = NULL;
         }
         pm_runtime_put_sync(qcom_iommu->dev);
-
-       qcom_domain->iommu = NULL;
  }
  
  static int qcom_iommu_map(struct iommu_domain *domain, unsigned long iova,
diff --git a/drivers/macintosh/therm_windtunnel.c b/drivers/macintosh/therm_windtunnel.c
index 8c744578122a3e229079571b7bce093380c90662..a0d87ed9da69612224903eb192da987e3d5f5504 100644
--- a/drivers/macintosh/therm_windtunnel.c
+++ b/drivers/macintosh/therm_windtunnel.c
@@ -300,9 +300,11 @@ static int control_loop(void *dummy)
  /*     i2c probing and setup                                           */
  /************************************************************************/
  
-static int
-do_attach( struct i2c_adapter *adapter )
+static void do_attach(struct i2c_adapter *adapter)
  {
+       struct i2c_board_info info = { };
+       struct device_node *np;
+
         /* scan 0x48-0x4f (DS1775) and 0x2c-0x2f (ADM1030) */
         static const unsigned short scan_ds1775[] = {
                 0x48, 0x49, 0x4a, 0x4b, 0x4c, 0x4d, 0x4e, 0x4f,
@@ -313,25 +315,24 @@ do_attach( struct i2c_adapter *adapter )
                 I2C_CLIENT_END
         };
  
-       if( strncmp(adapter->name, "uni-n", 5) )
-               return 0;
-
-       if( !x.running ) {
-               struct i2c_board_info info;
+       if (x.running || strncmp(adapter->name, "uni-n", 5))
+               return;
  
-               memset(&info, 0, sizeof(struct i2c_board_info));
-               strlcpy(info.type, "therm_ds1775", I2C_NAME_SIZE);
+       np = of_find_compatible_node(adapter->dev.of_node, NULL, "MAC,ds1775");
+       if (np) {
+               of_node_put(np);
+       } else {
+               strlcpy(info.type, "MAC,ds1775", I2C_NAME_SIZE);
                 i2c_new_probed_device(adapter, &info, scan_ds1775, NULL);
+       }
  
-               strlcpy(info.type, "therm_adm1030", I2C_NAME_SIZE);
+       np = of_find_compatible_node(adapter->dev.of_node, NULL, "MAC,adm1030");
+       if (np) {
+               of_node_put(np);
+       } else {
+               strlcpy(info.type, "MAC,adm1030", I2C_NAME_SIZE);
                 i2c_new_probed_device(adapter, &info, scan_adm1030, NULL);
-
-               if( x.thermostat && x.fan ) {
-                       x.running = 1;
-                       x.poll_task = kthread_run(control_loop, NULL, "g4fand");
-               }
         }
-       return 0;
  }
  
  static int
@@ -404,8 +405,8 @@ out:
  enum chip { ds1775, adm1030 };
  
  static const struct i2c_device_id therm_windtunnel_id[] = {
-       { "therm_ds1775", ds1775 },
-       { "therm_adm1030", adm1030 },
+       { "MAC,ds1775", ds1775 },
+       { "MAC,adm1030", adm1030 },
         { }
  };
  MODULE_DEVICE_TABLE(i2c, therm_windtunnel_id);
@@ -414,6 +415,7 @@ static int
  do_probe(struct i2c_client *cl, const struct i2c_device_id *id)
  {
         struct i2c_adapter *adapter = cl->adapter;
+       int ret = 0;
  
         if( !i2c_check_functionality(adapter, I2C_FUNC_SMBUS_WORD_DATA
                                      | I2C_FUNC_SMBUS_WRITE_BYTE) )
@@ -421,11 +423,19 @@ do_probe(struct i2c_client *cl, const struct i2c_device_id *id)
  
         switch (id->driver_data) {
         case adm1030:
-               return attach_fan( cl );
+               ret = attach_fan(cl);
+               break;
         case ds1775:
-               return attach_thermostat(cl);
+               ret = attach_thermostat(cl);
+               break;
         }
-       return 0;
+
+       if (!x.running && x.thermostat && x.fan) {
+               x.running = 1;
+               x.poll_task = kthread_run(control_loop, NULL, "g4fand");
+       }
+
+       return ret;
  }
  
  static struct i2c_driver g4fan_driver = {
diff --git a/drivers/md/bcache/alloc.c b/drivers/md/bcache/alloc.c
index a1df0d95151c67b15d78802193be03530441d4d0..8bc1faf71ff2fc747129095c5d27cbd9365e9e1c 100644
--- a/drivers/md/bcache/alloc.c
+++ b/drivers/md/bcache/alloc.c
@@ -67,6 +67,7 @@
  #include <linux/blkdev.h>
  #include <linux/kthread.h>
  #include <linux/random.h>
+#include <linux/sched/signal.h>
  #include <trace/events/bcache.h>
  
  #define MAX_OPEN_BUCKETS 128
@@ -733,8 +734,21 @@ int bch_open_buckets_alloc(struct cache_set *c)
  
  int bch_cache_allocator_start(struct cache *ca)
  {
-       struct task_struct *k = kthread_run(bch_allocator_thread,
-                                           ca, "bcache_allocator");
+       struct task_struct *k;
+
+       /*
+        * The preceding btree check may use so much system memory
+        * for the bcache btree node cache that the OOM killer
+        * selects the registering process. Ignore any pending
+        * SIGKILL from the OOM killer here so that kthread_run()
+        * does not fail because of pending signals. The bcache
+        * registering process will exit once registration is
+        * done.
+        */
+       if (signal_pending(current))
+               flush_signals(current);
+
+       k = kthread_run(bch_allocator_thread, ca, "bcache_allocator");
         if (IS_ERR(k))
                 return PTR_ERR(k);
  
diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index fa872df4e7703fb65b229c2ddcac9dbadad6593c..b12186c87f52df8e427f762c810d4b5ea441dede 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -34,6 +34,7 @@
  #include <linux/random.h>
  #include <linux/rcupdate.h>
  #include <linux/sched/clock.h>
+#include <linux/sched/signal.h>
  #include <linux/rculist.h>
  #include <linux/delay.h>
  #include <trace/events/bcache.h>
@@ -1913,6 +1914,18 @@ static int bch_gc_thread(void *arg)
  
  int bch_gc_thread_start(struct cache_set *c)
  {
+       /*
+        * The preceding btree check may use so much system memory
+        * for the bcache btree node cache that the OOM killer
+        * selects the registering process. Ignore any pending
+        * SIGKILL from the OOM killer here so that kthread_run()
+        * does not fail because of pending signals. The bcache
+        * registering process will exit once registration is
+        * done.
+        */
+       if (signal_pending(current))
+               flush_signals(current);
+
         c->gc_thread = kthread_run(bch_gc_thread, c, "bcache_gc");
         return PTR_ERR_OR_ZERO(c->gc_thread);
  }
diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c
index 6730820780b067c918e15e63ee9086c3c8cbd008..0e3ff9745ac742d5fb72bdcda46ae227dd37ee62 100644
--- a/drivers/md/bcache/journal.c
+++ b/drivers/md/bcache/journal.c
@@ -417,8 +417,6 @@ err:
  
  /* Journalling */
  
-#define nr_to_fifo_front(p, front_p, mask)     (((p) - (front_p)) & (mask))
-
  static void btree_flush_write(struct cache_set *c)
  {
         struct btree *b, *t, *btree_nodes[BTREE_FLUSH_NR];
@@ -510,9 +508,8 @@ static void btree_flush_write(struct cache_set *c)
                  *   journal entry can be reclaimed). These selected nodes
                  *   will be ignored and skipped in the following for-loop.
                  */
-               if (nr_to_fifo_front(btree_current_write(b)->journal,
-                                    fifo_front_p,
-                                    mask) != 0) {
+               if (((btree_current_write(b)->journal - fifo_front_p) &
+                    mask) != 0) {
                         mutex_unlock(&b->write_lock);
                         continue;
                 }
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 2749daf0972425b0186d05ebae5239641feed3d8..0c3c5419c52b67a0fafd743ed4e509a330a2dabb 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1917,23 +1917,6 @@ static int run_cache_set(struct cache_set *c)
                 if (bch_btree_check(c))
                         goto err;
  
-               /*
-                * bch_btree_check() may occupy too much system memory which
-                * has negative effects to user space application (e.g. data
-                * base) performance. Shrink the mca cache memory proactively
-                * here to avoid competing memory with user space workloads..
-                */
-               if (!c->shrinker_disabled) {
-                       struct shrink_control sc;
-
-                       sc.gfp_mask = GFP_KERNEL;
-                       sc.nr_to_scan = c->btree_cache_used * c->btree_pages;
-                       /* first run to clear b->accessed tag */
-                       c->shrink.scan_objects(&c->shrink, &sc);
-                       /* second run to reap non-accessed nodes */
-                       c->shrink.scan_objects(&c->shrink, &sc);
-               }
-
                 bch_journal_mark(c, &journal);
                 bch_initial_gc_finish(c);
                 pr_debug("btree_check() done");
diff --git a/drivers/misc/habanalabs/device.c b/drivers/misc/habanalabs/device.c
index b155e95490761271262d46c4c46adface12c561a..b680b0caa69be91ab423391ea58e47724351e561 100644
--- a/drivers/misc/habanalabs/device.c
+++ b/drivers/misc/habanalabs/device.c
@@ -598,7 +598,9 @@ int hl_device_set_debug_mode(struct hl_device *hdev, bool enable)
                         goto out;
                 }
  
-               hdev->asic_funcs->halt_coresight(hdev);
+               if (!hdev->hard_reset_pending)
+                       hdev->asic_funcs->halt_coresight(hdev);
+
                 hdev->in_debug = 0;
  
                 goto out;
@@ -1189,6 +1191,7 @@ int hl_device_init(struct hl_device *hdev, struct class *hclass)
         if (hdev->asic_funcs->get_hw_state(hdev) == HL_DEVICE_HW_STATE_DIRTY) {
                 dev_info(hdev->dev,
                         "H/W state is dirty, must reset before initializing\n");
+               hdev->asic_funcs->halt_engines(hdev, true);
                 hdev->asic_funcs->hw_fini(hdev, true);
         }
  
diff --git a/drivers/misc/habanalabs/goya/goya.c b/drivers/misc/habanalabs/goya/goya.c
index 7344e8a222ae567fd3f525c242ce49a1750fe3ff..b8a8de24aaf722a3c2f4abee4f4720b616c9761b 100644
--- a/drivers/misc/habanalabs/goya/goya.c
+++ b/drivers/misc/habanalabs/goya/goya.c
@@ -895,6 +895,11 @@ void goya_init_dma_qmans(struct hl_device *hdev)
   */
  static void goya_disable_external_queues(struct hl_device *hdev)
  {
+       struct goya_device *goya = hdev->asic_specific;
+
+       if (!(goya->hw_cap_initialized & HW_CAP_DMA))
+               return;
+
         WREG32(mmDMA_QM_0_GLBL_CFG0, 0);
         WREG32(mmDMA_QM_1_GLBL_CFG0, 0);
         WREG32(mmDMA_QM_2_GLBL_CFG0, 0);
@@ -956,6 +961,11 @@ static int goya_stop_external_queues(struct hl_device *hdev)
  {
         int rc, retval = 0;
  
+       struct goya_device *goya = hdev->asic_specific;
+
+       if (!(goya->hw_cap_initialized & HW_CAP_DMA))
+               return retval;
+
         rc = goya_stop_queue(hdev,
                         mmDMA_QM_0_GLBL_CFG1,
                         mmDMA_QM_0_CP_STS,
@@ -1744,9 +1754,18 @@ void goya_init_tpc_qmans(struct hl_device *hdev)
   */
  static void goya_disable_internal_queues(struct hl_device *hdev)
  {
+       struct goya_device *goya = hdev->asic_specific;
+
+       if (!(goya->hw_cap_initialized & HW_CAP_MME))
+               goto disable_tpc;
+
         WREG32(mmMME_QM_GLBL_CFG0, 0);
         WREG32(mmMME_CMDQ_GLBL_CFG0, 0);
  
+disable_tpc:
+       if (!(goya->hw_cap_initialized & HW_CAP_TPC))
+               return;
+
         WREG32(mmTPC0_QM_GLBL_CFG0, 0);
         WREG32(mmTPC0_CMDQ_GLBL_CFG0, 0);
  
@@ -1782,8 +1801,12 @@ static void goya_disable_internal_queues(struct hl_device *hdev)
   */
  static int goya_stop_internal_queues(struct hl_device *hdev)
  {
+       struct goya_device *goya = hdev->asic_specific;
         int rc, retval = 0;
  
+       if (!(goya->hw_cap_initialized & HW_CAP_MME))
+               goto stop_tpc;
+
         /*
          * Each queue (QMAN) is a separate H/W logic. That means that each
          * QMAN can be stopped independently and failure to stop one does NOT
@@ -1810,6 +1833,10 @@ static int goya_stop_internal_queues(struct hl_device *hdev)
                 retval = -EIO;
         }
  
+stop_tpc:
+       if (!(goya->hw_cap_initialized & HW_CAP_TPC))
+               return retval;
+
         rc = goya_stop_queue(hdev,
                         mmTPC0_QM_GLBL_CFG1,
                         mmTPC0_QM_CP_STS,
@@ -1975,6 +2002,11 @@ static int goya_stop_internal_queues(struct hl_device *hdev)
  
  static void goya_dma_stall(struct hl_device *hdev)
  {
+       struct goya_device *goya = hdev->asic_specific;
+
+       if (!(goya->hw_cap_initialized & HW_CAP_DMA))
+               return;
+
         WREG32(mmDMA_QM_0_GLBL_CFG1, 1 << DMA_QM_0_GLBL_CFG1_DMA_STOP_SHIFT);
         WREG32(mmDMA_QM_1_GLBL_CFG1, 1 << DMA_QM_1_GLBL_CFG1_DMA_STOP_SHIFT);
         WREG32(mmDMA_QM_2_GLBL_CFG1, 1 << DMA_QM_2_GLBL_CFG1_DMA_STOP_SHIFT);
@@ -1984,6 +2016,11 @@ static void goya_dma_stall(struct hl_device *hdev)
  
  static void goya_tpc_stall(struct hl_device *hdev)
  {
+       struct goya_device *goya = hdev->asic_specific;
+
+       if (!(goya->hw_cap_initialized & HW_CAP_TPC))
+               return;
+
         WREG32(mmTPC0_CFG_TPC_STALL, 1 << TPC0_CFG_TPC_STALL_V_SHIFT);
         WREG32(mmTPC1_CFG_TPC_STALL, 1 << TPC1_CFG_TPC_STALL_V_SHIFT);
         WREG32(mmTPC2_CFG_TPC_STALL, 1 << TPC2_CFG_TPC_STALL_V_SHIFT);
@@ -1996,6 +2033,11 @@ static void goya_tpc_stall(struct hl_device *hdev)
  
  static void goya_mme_stall(struct hl_device *hdev)
  {
+       struct goya_device *goya = hdev->asic_specific;
+
+       if (!(goya->hw_cap_initialized & HW_CAP_MME))
+               return;
+
         WREG32(mmMME_STALL, 0xFFFFFFFF);
  }
  
@@ -4648,8 +4690,6 @@ static int goya_memset_device_memory(struct hl_device *hdev, u64 addr, u64 size,
  
         rc = goya_send_job_on_qman0(hdev, job);
  
-       hl_cb_put(job->patched_cb);
-
         hl_debugfs_remove_job(hdev, job);
         kfree(job);
         cb->cs_cnt--;
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 48d5ec770b94242fddb71a02effab6fc1ee988e8..d10805e5e6232b91bf8e125e2734da4629712b1b 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3526,6 +3526,47 @@ static void bond_fold_stats(struct rtnl_link_stats64 *_res,
         }
  }
  
+#ifdef CONFIG_LOCKDEP
+static int bond_get_lowest_level_rcu(struct net_device *dev)
+{
+       struct net_device *ldev, *next, *now, *dev_stack[MAX_NEST_DEV + 1];
+       struct list_head *niter, *iter, *iter_stack[MAX_NEST_DEV + 1];
+       int cur = 0, max = 0;
+
+       now = dev;
+       iter = &dev->adj_list.lower;
+
+       while (1) {
+               next = NULL;
+               while (1) {
+                       ldev = netdev_next_lower_dev_rcu(now, &iter);
+                       if (!ldev)
+                               break;
+
+                       next = ldev;
+                       niter = &ldev->adj_list.lower;
+                       dev_stack[cur] = now;
+                       iter_stack[cur++] = iter;
+                       if (max <= cur)
+                               max = cur;
+                       break;
+               }
+
+               if (!next) {
+                       if (!cur)
+                               return max;
+                       next = dev_stack[--cur];
+                       niter = iter_stack[cur];
+               }
+
+               now = next;
+               iter = niter;
+       }
+
+       return max;
+}
+#endif
+
  static void bond_get_stats(struct net_device *bond_dev,
                            struct rtnl_link_stats64 *stats)
  {
@@ -3533,11 +3574,17 @@ static void bond_get_stats(struct net_device *bond_dev,
         struct rtnl_link_stats64 temp;
         struct list_head *iter;
         struct slave *slave;
+       int nest_level = 0;
  
-       spin_lock(&bond->stats_lock);
-       memcpy(stats, &bond->bond_stats, sizeof(*stats));
  
         rcu_read_lock();
+#ifdef CONFIG_LOCKDEP
+       nest_level = bond_get_lowest_level_rcu(bond_dev);
+#endif
+
+       spin_lock_nested(&bond->stats_lock, nest_level);
+       memcpy(stats, &bond->bond_stats, sizeof(*stats));
+
         bond_for_each_slave_rcu(bond, slave, iter) {
                 const struct rtnl_link_stats64 *new =
                         dev_get_stats(slave->dev, &temp);
@@ -3547,10 +3594,10 @@ static void bond_get_stats(struct net_device *bond_dev,
                 /* save off the slave stats for the next run */
                 memcpy(&slave->slave_stats, new, sizeof(*new));
         }
-       rcu_read_unlock();
  
         memcpy(&bond->bond_stats, stats, sizeof(*stats));
         spin_unlock(&bond->stats_lock);
+       rcu_read_unlock();
  }
  
  static int bond_do_ioctl(struct net_device *bond_dev, struct ifreq *ifr, int cmd)
@@ -3640,6 +3687,8 @@ static int bond_do_ioctl(struct net_device *bond_dev, struct ifreq *ifr, int cmd
         case BOND_RELEASE_OLD:
         case SIOCBONDRELEASE:
                 res = bond_release(bond_dev, slave_dev);
+               if (!res)
+                       netdev_update_lockdep_key(slave_dev);
                 break;
         case BOND_SETHWADDR_OLD:
         case SIOCBONDSETHWADDR:
diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
index ddb3916d3506bec2fe317973054a32ea5c4255ad..215c1092328937a01531840ea9222fc5fe0e13a2 100644
--- a/drivers/net/bonding/bond_options.c
+++ b/drivers/net/bonding/bond_options.c
@@ -1398,6 +1398,8 @@ static int bond_option_slaves_set(struct bonding *bond,
         case '-':
                 slave_dbg(bond->dev, dev, "Releasing interface\n");
                 ret = bond_release(bond->dev, dev);
+               if (!ret)
+                       netdev_update_lockdep_key(dev);
                 break;
  
         default:
diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index 449a22172e079649940a1c3ba76cd30fb6e46ccb..1a69286daa8d8adcc2b2e535c1bd4a02b380a1c0 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1366,6 +1366,9 @@ void b53_vlan_add(struct dsa_switch *ds, int port,
  
                 b53_get_vlan_entry(dev, vid, vl);
  
+               if (vid == 0 && vid == b53_default_pvid(dev))
+                       untagged = true;
+
                 vl->members |= BIT(port);
                 if (untagged && !dsa_is_cpu_port(ds, port))
                         vl->untag |= BIT(port);
diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index d1955543acd1d1d151d8292c9638c25de2267ae0..b0f5280a83cb612a1a5cfd62060b189b98a967e3 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -69,8 +69,7 @@ static void bcm_sf2_imp_setup(struct dsa_switch *ds, int port)
                 /* Force link status for IMP port */
                 reg = core_readl(priv, offset);
                 reg |= (MII_SW_OR | LINK_STS);
-               if (priv->type == BCM7278_DEVICE_ID)
-                       reg |= GMII_SPEED_UP_2G;
+               reg &= ~GMII_SPEED_UP_2G;
                 core_writel(priv, reg, offset);
  
                 /* Enable Broadcast, Multicast, Unicast forwarding to IMP port */
diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h
index f332cb4b2fbf370a8cb0cead4df2892aca1a343d..79cad5e751c6cd4784d0bf67aa3045336da400ad 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.h
+++ b/drivers/net/dsa/mv88e6xxx/chip.h
@@ -236,7 +236,7 @@ struct mv88e6xxx_port {
         bool mirror_ingress;
         bool mirror_egress;
         unsigned int serdes_irq;
-       char serdes_irq_name[32];
+       char serdes_irq_name[64];
  };
  
  struct mv88e6xxx_chip {
@@ -293,16 +293,16 @@ struct mv88e6xxx_chip {
         struct mv88e6xxx_irq g1_irq;
         struct mv88e6xxx_irq g2_irq;
         int irq;
-       char irq_name[32];
+       char irq_name[64];
         int device_irq;
-       char device_irq_name[32];
+       char device_irq_name[64];
         int watchdog_irq;
-       char watchdog_irq_name[32];
+       char watchdog_irq_name[64];
  
         int atu_prob_irq;
-       char atu_prob_irq_name[32];
+       char atu_prob_irq_name[64];
         int vtu_prob_irq;
-       char vtu_prob_irq_name[32];
+       char vtu_prob_irq_name[64];
         struct kthread_worker *kworker;
         struct kthread_delayed_work irq_poll_work;
  
diff --git a/drivers/net/dsa/mv88e6xxx/global1.c b/drivers/net/dsa/mv88e6xxx/global1.c
index b016cc205f81f02c1c27a43a5f7618770c7a15ae..ca3a7a7a73c32700bde1206c3122f967edc58842 100644
--- a/drivers/net/dsa/mv88e6xxx/global1.c
+++ b/drivers/net/dsa/mv88e6xxx/global1.c
@@ -278,13 +278,13 @@ int mv88e6095_g1_set_egress_port(struct mv88e6xxx_chip *chip,
         switch (direction) {
         case MV88E6XXX_EGRESS_DIR_INGRESS:
                 dest_port_chip = &chip->ingress_dest_port;
-               reg &= MV88E6185_G1_MONITOR_CTL_INGRESS_DEST_MASK;
+               reg &= ~MV88E6185_G1_MONITOR_CTL_INGRESS_DEST_MASK;
                 reg |= port <<
                        __bf_shf(MV88E6185_G1_MONITOR_CTL_INGRESS_DEST_MASK);
                 break;
         case MV88E6XXX_EGRESS_DIR_EGRESS:
                 dest_port_chip = &chip->egress_dest_port;
-               reg &= MV88E6185_G1_MONITOR_CTL_EGRESS_DEST_MASK;
+               reg &= ~MV88E6185_G1_MONITOR_CTL_EGRESS_DEST_MASK;
                 reg |= port <<
                        __bf_shf(MV88E6185_G1_MONITOR_CTL_EGRESS_DEST_MASK);
                 break;
diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c b/drivers/net/ethernet/amazon/ena/ena_com.c
index ea62604fdf8ca7b0dc647b986ac5ea998f7bc414..1fb58f9ad80b83243455e5a8d80501a692c80ff6 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.c
+++ b/drivers/net/ethernet/amazon/ena/ena_com.c
@@ -200,6 +200,11 @@ static void comp_ctxt_release(struct ena_com_admin_queue *queue,
  static struct ena_comp_ctx *get_comp_ctxt(struct ena_com_admin_queue *queue,
                                           u16 command_id, bool capture)
  {
+       if (unlikely(!queue->comp_ctx)) {
+               pr_err("Completion context is NULL\n");
+               return NULL;
+       }
+
         if (unlikely(command_id >= queue->q_depth)) {
                 pr_err("command id is larger than the queue size. cmd_id: %u queue size %d\n",
                        command_id, queue->q_depth);
@@ -1041,9 +1046,41 @@ static int ena_com_get_feature(struct ena_com_dev *ena_dev,
                                       feature_ver);
  }
  
+int ena_com_get_current_hash_function(struct ena_com_dev *ena_dev)
+{
+       return ena_dev->rss.hash_func;
+}
+
+static void ena_com_hash_key_fill_default_key(struct ena_com_dev *ena_dev)
+{
+       struct ena_admin_feature_rss_flow_hash_control *hash_key =
+               (ena_dev->rss).hash_key;
+
+       netdev_rss_key_fill(&hash_key->key, sizeof(hash_key->key));
+       /* The key is stored in the device in u32 array
+        * as well as the API requires the key to be passed in this
+        * format. Thus the size of our array should be divided by 4
+        */
+       hash_key->keys_num = sizeof(hash_key->key) / sizeof(u32);
+}
+
  static int ena_com_hash_key_allocate(struct ena_com_dev *ena_dev)
  {
         struct ena_rss *rss = &ena_dev->rss;
+       struct ena_admin_feature_rss_flow_hash_control *hash_key;
+       struct ena_admin_get_feat_resp get_resp;
+       int rc;
+
+       hash_key = (ena_dev->rss).hash_key;
+
+       rc = ena_com_get_feature_ex(ena_dev, &get_resp,
+                                   ENA_ADMIN_RSS_HASH_FUNCTION,
+                                   ena_dev->rss.hash_key_dma_addr,
+                                   sizeof(ena_dev->rss.hash_key), 0);
+       if (unlikely(rc)) {
+               hash_key = NULL;
+               return -EOPNOTSUPP;
+       }
  
         rss->hash_key =
                 dma_alloc_coherent(ena_dev->dmadev, sizeof(*rss->hash_key),
@@ -1254,30 +1291,6 @@ static int ena_com_ind_tbl_convert_to_device(struct ena_com_dev *ena_dev)
         return 0;
  }
  
-static int ena_com_ind_tbl_convert_from_device(struct ena_com_dev *ena_dev)
-{
-       u16 dev_idx_to_host_tbl[ENA_TOTAL_NUM_QUEUES] = { (u16)-1 };
-       struct ena_rss *rss = &ena_dev->rss;
-       u8 idx;
-       u16 i;
-
-       for (i = 0; i < ENA_TOTAL_NUM_QUEUES; i++)
-               dev_idx_to_host_tbl[ena_dev->io_sq_queues[i].idx] = i;
-
-       for (i = 0; i < 1 << rss->tbl_log_size; i++) {
-               if (rss->rss_ind_tbl[i].cq_idx > ENA_TOTAL_NUM_QUEUES)
-                       return -EINVAL;
-               idx = (u8)rss->rss_ind_tbl[i].cq_idx;
-
-               if (dev_idx_to_host_tbl[idx] > ENA_TOTAL_NUM_QUEUES)
-                       return -EINVAL;
-
-               rss->host_rss_ind_tbl[i] = dev_idx_to_host_tbl[idx];
-       }
-
-       return 0;
-}
-
  static void ena_com_update_intr_delay_resolution(struct ena_com_dev *ena_dev,
                                                  u16 intr_delay_resolution)
  {
@@ -2297,15 +2310,16 @@ int ena_com_fill_hash_function(struct ena_com_dev *ena_dev,
  
         switch (func) {
         case ENA_ADMIN_TOEPLITZ:
-               if (key_len > sizeof(hash_key->key)) {
-                       pr_err("key len (%hu) is bigger than the max supported (%zu)\n",
-                              key_len, sizeof(hash_key->key));
-                       return -EINVAL;
+               if (key) {
+                       if (key_len != sizeof(hash_key->key)) {
+                               pr_err("key len (%hu) doesn't equal the supported size (%zu)\n",
+                                      key_len, sizeof(hash_key->key));
+                               return -EINVAL;
+                       }
+                       memcpy(hash_key->key, key, key_len);
+                       rss->hash_init_val = init_val;
+                       hash_key->keys_num = key_len >> 2;
                 }
-
-               memcpy(hash_key->key, key, key_len);
-               rss->hash_init_val = init_val;
-               hash_key->keys_num = key_len >> 2;
                 break;
         case ENA_ADMIN_CRC32:
                 rss->hash_init_val = init_val;
@@ -2342,7 +2356,11 @@ int ena_com_get_hash_function(struct ena_com_dev *ena_dev,
         if (unlikely(rc))
                 return rc;
  
-       rss->hash_func = get_resp.u.flow_hash_func.selected_func;
+       /* ffs() returns 1 in case the lsb is set */
+       rss->hash_func = ffs(get_resp.u.flow_hash_func.selected_func);
+       if (rss->hash_func)
+               rss->hash_func--;
+
         if (func)
                 *func = rss->hash_func;
  
@@ -2606,10 +2624,6 @@ int ena_com_indirect_table_get(struct ena_com_dev *ena_dev, u32 *ind_tbl)
         if (!ind_tbl)
                 return 0;
  
-       rc = ena_com_ind_tbl_convert_from_device(ena_dev);
-       if (unlikely(rc))
-               return rc;
-
         for (i = 0; i < (1 << rss->tbl_log_size); i++)
                 ind_tbl[i] = rss->host_rss_ind_tbl[i];
  
@@ -2626,9 +2640,15 @@ int ena_com_rss_init(struct ena_com_dev *ena_dev, u16 indr_tbl_log_size)
         if (unlikely(rc))
                 goto err_indr_tbl;
  
+       /* The following function might return unsupported in case the
+        * device doesn't support setting the key / hash function. We can safely
+        * ignore this error and have indirection table support only.
+        */
         rc = ena_com_hash_key_allocate(ena_dev);
-       if (unlikely(rc))
+       if (unlikely(rc) && rc != -EOPNOTSUPP)
                 goto err_hash_key;
+       else if (rc != -EOPNOTSUPP)
+               ena_com_hash_key_fill_default_key(ena_dev);
  
         rc = ena_com_hash_ctrl_init(ena_dev);
         if (unlikely(rc))
diff --git a/drivers/net/ethernet/amazon/ena/ena_com.h b/drivers/net/ethernet/amazon/ena/ena_com.h
index 0ce37d54ed108f0a34167306247ba94eb9833050..469f298199a7b9a7c67ed4934ad6a503c40a1803 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.h
+++ b/drivers/net/ethernet/amazon/ena/ena_com.h
@@ -44,6 +44,7 @@
  #include <linux/spinlock.h>
  #include <linux/types.h>
  #include <linux/wait.h>
+#include <linux/netdevice.h>
  
  #include "ena_common_defs.h"
  #include "ena_admin_defs.h"
@@ -655,6 +656,14 @@ int ena_com_rss_init(struct ena_com_dev *ena_dev, u16 log_size);
   */
  void ena_com_rss_destroy(struct ena_com_dev *ena_dev);
  
+/* ena_com_get_current_hash_function - Get RSS hash function
+ * @ena_dev: ENA communication layer struct
+ *
+ * Return the current hash function.
+ * @return: 0 or one of the ena_admin_hash_functions values.
+ */
+int ena_com_get_current_hash_function(struct ena_com_dev *ena_dev);
+
  /* ena_com_fill_hash_function - Fill RSS hash function
   * @ena_dev: ENA communication layer struct
   * @func: The hash function (Toeplitz or crc)
diff --git a/drivers/net/ethernet/amazon/ena/ena_ethtool.c b/drivers/net/ethernet/amazon/ena/ena_ethtool.c
index b4e891d49a941cc3a36d0a42ac60f694ba9b2794..ced1d577b62a97c0dd3e23bbedea0994e8d4b93c 100644
--- a/drivers/net/ethernet/amazon/ena/ena_ethtool.c
+++ b/drivers/net/ethernet/amazon/ena/ena_ethtool.c
@@ -636,6 +636,28 @@ static u32 ena_get_rxfh_key_size(struct net_device *netdev)
         return ENA_HASH_KEY_SIZE;
  }
  
+static int ena_indirection_table_get(struct ena_adapter *adapter, u32 *indir)
+{
+       struct ena_com_dev *ena_dev = adapter->ena_dev;
+       int i, rc;
+
+       if (!indir)
+               return 0;
+
+       rc = ena_com_indirect_table_get(ena_dev, indir);
+       if (rc)
+               return rc;
+
+       /* Our internal representation of the indices is: even indices
+        * for Tx and uneven indices for Rx. We need to convert the Rx
+        * indices to be consecutive
+        */
+       for (i = 0; i < ENA_RX_RSS_TABLE_SIZE; i++)
+               indir[i] = ENA_IO_RXQ_IDX_TO_COMBINED_IDX(indir[i]);
+
+       return rc;
+}
+
  static int ena_get_rxfh(struct net_device *netdev, u32 *indir, u8 *key,
                         u8 *hfunc)
  {
@@ -644,11 +666,25 @@ static int ena_get_rxfh(struct net_device *netdev, u32 *indir, u8 *key,
         u8 func;
         int rc;
  
-       rc = ena_com_indirect_table_get(adapter->ena_dev, indir);
+       rc = ena_indirection_table_get(adapter, indir);
         if (rc)
                 return rc;
  
+       /* We call this function in order to check if the device
+        * supports getting/setting the hash function.
+        */
         rc = ena_com_get_hash_function(adapter->ena_dev, &ena_func, key);
+
+       if (rc) {
+               if (rc == -EOPNOTSUPP) {
+                       key = NULL;
+                       hfunc = NULL;
+                       rc = 0;
+               }
+
+               return rc;
+       }
+
         if (rc)
                 return rc;
  
@@ -657,7 +693,7 @@ static int ena_get_rxfh(struct net_device *netdev, u32 *indir, u8 *key,
                 func = ETH_RSS_HASH_TOP;
                 break;
         case ENA_ADMIN_CRC32:
-               func = ETH_RSS_HASH_XOR;
+               func = ETH_RSS_HASH_CRC32;
                 break;
         default:
                 netif_err(adapter, drv, netdev,
@@ -700,10 +736,13 @@ static int ena_set_rxfh(struct net_device *netdev, const u32 *indir,
         }
  
         switch (hfunc) {
+       case ETH_RSS_HASH_NO_CHANGE:
+               func = ena_com_get_current_hash_function(ena_dev);
+               break;
         case ETH_RSS_HASH_TOP:
                 func = ENA_ADMIN_TOEPLITZ;
                 break;
-       case ETH_RSS_HASH_XOR:
+       case ETH_RSS_HASH_CRC32:
                 func = ENA_ADMIN_CRC32;
                 break;
         default:
@@ -814,6 +853,7 @@ static const struct ethtool_ops ena_ethtool_ops = {
         .set_channels           = ena_set_channels,
         .get_tunable            = ena_get_tunable,
         .set_tunable            = ena_set_tunable,
+       .get_ts_info            = ethtool_op_get_ts_info,
  };
  
  void ena_set_ethtool_ops(struct net_device *netdev)
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 894e8c1a8cf15f7ab33eb3d2cb326507255ce5a6..0b2fd96b93d7f23e71d8fe599d2bb4503f01cd73 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -3706,8 +3706,8 @@ static void check_for_missing_keep_alive(struct ena_adapter *adapter)
         if (adapter->keep_alive_timeout == ENA_HW_HINTS_NO_TIMEOUT)
                 return;
  
-       keep_alive_expired = round_jiffies(adapter->last_keep_alive_jiffies +
-                                          adapter->keep_alive_timeout);
+       keep_alive_expired = adapter->last_keep_alive_jiffies +
+                            adapter->keep_alive_timeout;
         if (unlikely(time_is_before_jiffies(keep_alive_expired))) {
                 netif_err(adapter, drv, adapter->netdev,
                           "Keep alive watchdog timeout.\n");
@@ -3809,7 +3809,7 @@ static void ena_timer_service(struct timer_list *t)
         }
  
         /* Reset the timer */
-       mod_timer(&adapter->timer_service, jiffies + HZ);
+       mod_timer(&adapter->timer_service, round_jiffies(jiffies + HZ));
  }
  
  static int ena_calc_max_io_queue_num(struct pci_dev *pdev,
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.h b/drivers/net/ethernet/amazon/ena/ena_netdev.h
index 094324fd0edc194909c7afe65f128ffabf3fe7a5..8795e0b1dc3c0522c67c3978125be788d60b852e 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.h
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.h
@@ -130,6 +130,8 @@
  
  #define ENA_IO_TXQ_IDX(q)      (2 * (q))
  #define ENA_IO_RXQ_IDX(q)      (2 * (q) + 1)
+#define ENA_IO_TXQ_IDX_TO_COMBINED_IDX(q)      ((q) / 2)
+#define ENA_IO_RXQ_IDX_TO_COMBINED_IDX(q)      (((q) - 1) / 2)
  
  #define ENA_MGMNT_IRQ_IDX              0
  #define ENA_IO_IRQ_FIRST_IDX           1
diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_ethtool.c b/drivers/net/ethernet/aquantia/atlantic/aq_ethtool.c
index a1f99bef4a683286bda9c141f9fed6e7953b68b0..7b55633d2cb939d23f08cb442e380cc215464115 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_ethtool.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_ethtool.c
@@ -722,6 +722,11 @@ static int aq_ethtool_set_priv_flags(struct net_device *ndev, u32 flags)
         if (flags & ~AQ_PRIV_FLAGS_MASK)
                 return -EOPNOTSUPP;
  
+       if (hweight32((flags | priv_flags) & AQ_HW_LOOPBACK_MASK) > 1) {
+               netdev_info(ndev, "Can't enable more than one loopback simultaneously\n");
+               return -EINVAL;
+       }
+
         cfg->priv_flags = flags;
  
         if ((priv_flags ^ flags) & BIT(AQ_HW_LOOPBACK_DMA_NET)) {
diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_filters.c b/drivers/net/ethernet/aquantia/atlantic/aq_filters.c
index 6102251bb909b02c3736b9e0f6f204b40eeba286..03ff92bc4a7fb11c97e4e128609ada081b0dfb44 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_filters.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_filters.c
@@ -163,7 +163,7 @@ aq_check_approve_fvlan(struct aq_nic_s *aq_nic,
         }
  
         if ((aq_nic->ndev->features & NETIF_F_HW_VLAN_CTAG_FILTER) &&
-           (!test_bit(be16_to_cpu(fsp->h_ext.vlan_tci),
+           (!test_bit(be16_to_cpu(fsp->h_ext.vlan_tci) & VLAN_VID_MASK,
                        aq_nic->active_vlans))) {
                 netdev_err(aq_nic->ndev,
                            "ethtool: unknown vlan-id specified");
diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_hw.h b/drivers/net/ethernet/aquantia/atlantic/aq_hw.h
index cc70c606b6ef292fa61cd915942ee88f9ac40091..251767c31f7e59e47250ea96a5a1d1d4cfa1f056 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_hw.h
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_hw.h
@@ -337,6 +337,8 @@ struct aq_fw_ops {
  
         void (*enable_ptp)(struct aq_hw_s *self, int enable);
  
+       void (*adjust_ptp)(struct aq_hw_s *self, uint64_t adj);
+
         int (*set_eee_rate)(struct aq_hw_s *self, u32 speed);
  
         int (*get_eee_rate)(struct aq_hw_s *self, u32 *rate,
diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
index c85e3e29012c0be813dde299aa8731653362a791..e95f6a6bef733d1f944af19b852e18971795c21c 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
@@ -533,8 +533,10 @@ unsigned int aq_nic_map_skb(struct aq_nic_s *self, struct sk_buff *skb,
                                      dx_buff->len,
                                      DMA_TO_DEVICE);
  
-       if (unlikely(dma_mapping_error(aq_nic_get_dev(self), dx_buff->pa)))
+       if (unlikely(dma_mapping_error(aq_nic_get_dev(self), dx_buff->pa))) {
+               ret = 0;
                 goto exit;
+       }
  
         first = dx_buff;
         dx_buff->len_pkt = skb->len;
@@ -655,10 +657,6 @@ int aq_nic_xmit(struct aq_nic_s *self, struct sk_buff *skb)
         if (likely(frags)) {
                 err = self->aq_hw_ops->hw_ring_tx_xmit(self->aq_hw,
                                                        ring, frags);
-               if (err >= 0) {
-                       ++ring->stats.tx.packets;
-                       ring->stats.tx.bytes += skb->len;
-               }
         } else {
                 err = NETDEV_TX_BUSY;
         }
diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c b/drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c
index 6b27af0db4992888ec1d2648ef6b51f5c89aabc3..78b6f32487565f30069c12565d92dd73c182c9de 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c
@@ -359,7 +359,8 @@ static int aq_suspend_common(struct device *dev, bool deep)
         netif_device_detach(nic->ndev);
         netif_tx_stop_all_queues(nic->ndev);
  
-       aq_nic_stop(nic);
+       if (netif_running(nic->ndev))
+               aq_nic_stop(nic);
  
         if (deep) {
                 aq_nic_deinit(nic, !nic->aq_hw->aq_nic_cfg->wol);
@@ -375,7 +376,7 @@ static int atl_resume_common(struct device *dev, bool deep)
  {
         struct pci_dev *pdev = to_pci_dev(dev);
         struct aq_nic_s *nic;
-       int ret;
+       int ret = 0;
  
         nic = pci_get_drvdata(pdev);
  
@@ -390,9 +391,11 @@ static int atl_resume_common(struct device *dev, bool deep)
                         goto err_exit;
         }
  
-       ret = aq_nic_start(nic);
-       if (ret)
-               goto err_exit;
+       if (netif_running(nic->ndev)) {
+               ret = aq_nic_start(nic);
+               if (ret)
+                       goto err_exit;
+       }
  
         netif_device_attach(nic->ndev);
         netif_tx_start_all_queues(nic->ndev);
diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_ring.c b/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
index 951d86f8b66e8c9114510d41436c030cf7bffeb8..bae95a61856081afc83e36c43fd1b1bc59634d7a 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
@@ -272,9 +272,12 @@ bool aq_ring_tx_clean(struct aq_ring_s *self)
                         }
                 }
  
-               if (unlikely(buff->is_eop))
-                       dev_kfree_skb_any(buff->skb);
+               if (unlikely(buff->is_eop)) {
+                       ++self->stats.tx.packets;
+                       self->stats.tx.bytes += buff->skb->len;
  
+                       dev_kfree_skb_any(buff->skb);
+               }
                 buff->pa = 0U;
                 buff->eop_index = 0xffffU;
                 self->sw_head = aq_ring_next_dx(self, self->sw_head);
@@ -351,7 +354,8 @@ int aq_ring_rx_clean(struct aq_ring_s *self,
                                 err = 0;
                                 goto err_exit;
                         }
-                       if (buff->is_error || buff->is_cso_err) {
+                       if (buff->is_error ||
+                           (buff->is_lro && buff->is_cso_err)) {
                                 buff_ = buff;
                                 do {
                                         next_ = buff_->next,
diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_ring.h b/drivers/net/ethernet/aquantia/atlantic/aq_ring.h
index 991e4d31b0948e86e21299c069d3bc70a738b3b0..2c96f20f62891dbec2184039cce3a7de952a6b14 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_ring.h
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_ring.h
@@ -78,7 +78,8 @@ struct __packed aq_ring_buff_s {
                         u32 is_cleaned:1;
                         u32 is_error:1;
                         u32 is_vlan:1;
-                       u32 rsvd3:4;
+                       u32 is_lro:1;
+                       u32 rsvd3:3;
                         u16 eop_index;
                         u16 rsvd4;
                 };
diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c
index ec041f78d0634426d76e23a23678c6be8321a69b..d20d91cdece861adeb0b45c44a69a8140cb048e4 100644
--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c
+++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c
@@ -823,6 +823,8 @@ static int hw_atl_b0_hw_ring_rx_receive(struct aq_hw_s *self,
                         }
                 }
  
+               buff->is_lro = !!(HW_ATL_B0_RXD_WB_STAT2_RSCCNT &
+                                 rxd_wb->status);
                 if (HW_ATL_B0_RXD_WB_STAT2_EOP & rxd_wb->status) {
                         buff->len = rxd_wb->pkt_len %
                                 AQ_CFG_RX_FRAME_MAX;
@@ -835,8 +837,7 @@ static int hw_atl_b0_hw_ring_rx_receive(struct aq_hw_s *self,
                                 rxd_wb->pkt_len > AQ_CFG_RX_FRAME_MAX ?
                                 AQ_CFG_RX_FRAME_MAX : rxd_wb->pkt_len;
  
-                       if (HW_ATL_B0_RXD_WB_STAT2_RSCCNT &
-                               rxd_wb->status) {
+                       if (buff->is_lro) {
                                 /* LRO */
                                 buff->next = rxd_wb->next_desc_ptr;
                                 ++ring->stats.rx.lro_packets;
@@ -884,13 +885,16 @@ static int hw_atl_b0_hw_packet_filter_set(struct aq_hw_s *self,
  {
         struct aq_nic_cfg_s *cfg = self->aq_nic_cfg;
         unsigned int i = 0U;
+       u32 vlan_promisc;
+       u32 l2_promisc;
  
-       hw_atl_rpfl2promiscuous_mode_en_set(self,
-                                           IS_FILTER_ENABLED(IFF_PROMISC));
+       l2_promisc = IS_FILTER_ENABLED(IFF_PROMISC) ||
+                    !!(cfg->priv_flags & BIT(AQ_HW_LOOPBACK_DMA_NET));
+       vlan_promisc = l2_promisc || cfg->is_vlan_force_promisc;
  
-       hw_atl_rpf_vlan_prom_mode_en_set(self,
-                                    IS_FILTER_ENABLED(IFF_PROMISC) ||
-                                    cfg->is_vlan_force_promisc);
+       hw_atl_rpfl2promiscuous_mode_en_set(self, l2_promisc);
+
+       hw_atl_rpf_vlan_prom_mode_en_set(self, vlan_promisc);
  
         hw_atl_rpfl2multicast_flr_en_set(self,
                                          IS_FILTER_ENABLED(IFF_ALLMULTI) &&
@@ -1161,6 +1165,8 @@ static int hw_atl_b0_adj_sys_clock(struct aq_hw_s *self, s64 delta)
  {
         self->ptp_clk_offset += delta;
  
+       self->aq_fw_ops->adjust_ptp(self, self->ptp_clk_offset);
+
         return 0;
  }
  
@@ -1211,7 +1217,7 @@ static int hw_atl_b0_gpio_pulse(struct aq_hw_s *self, u32 index,
         fwreq.ptp_gpio_ctrl.index = index;
         fwreq.ptp_gpio_ctrl.period = period;
         /* Apply time offset */
-       fwreq.ptp_gpio_ctrl.start = start - self->ptp_clk_offset;
+       fwreq.ptp_gpio_ctrl.start = start;
  
         size = sizeof(fwreq.msg_id) + sizeof(fwreq.ptp_gpio_ctrl);
         return self->aq_fw_ops->send_fw_request(self, &fwreq, size);
diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c
index f547baa6c95499e5f16313ea47b2c04f18f105ca..354705f9bc493afcbe9c63f28cc36f779eea4be5 100644
--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c
+++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c
@@ -22,6 +22,7 @@
  #define HW_ATL_MIF_ADDR         0x0208U
  #define HW_ATL_MIF_VAL          0x020CU
  
+#define HW_ATL_MPI_RPC_ADDR     0x0334U
  #define HW_ATL_RPC_CONTROL_ADR  0x0338U
  #define HW_ATL_RPC_STATE_ADR    0x033CU
  
@@ -53,15 +54,14 @@ enum mcp_area {
  };
  
  static int hw_atl_utils_ver_match(u32 ver_expected, u32 ver_actual);
-
  static int hw_atl_utils_mpi_set_state(struct aq_hw_s *self,
                                       enum hal_atl_utils_fw_state_e state);
-
  static u32 hw_atl_utils_get_mpi_mbox_tid(struct aq_hw_s *self);
  static u32 hw_atl_utils_mpi_get_state(struct aq_hw_s *self);
  static u32 hw_atl_utils_mif_cmd_get(struct aq_hw_s *self);
  static u32 hw_atl_utils_mif_addr_get(struct aq_hw_s *self);
  static u32 hw_atl_utils_rpc_state_get(struct aq_hw_s *self);
+static u32 aq_fw1x_rpc_get(struct aq_hw_s *self);
  
  int hw_atl_utils_initfw(struct aq_hw_s *self, const struct aq_fw_ops **fw_ops)
  {
@@ -476,6 +476,10 @@ static int hw_atl_utils_init_ucp(struct aq_hw_s *self,
                                         self, self->mbox_addr,
                                         self->mbox_addr != 0U,
                                         1000U, 10000U);
+       err = readx_poll_timeout_atomic(aq_fw1x_rpc_get, self,
+                                       self->rpc_addr,
+                                       self->rpc_addr != 0U,
+                                       1000U, 100000U);
  
         return err;
  }
@@ -531,6 +535,12 @@ int hw_atl_utils_fw_rpc_wait(struct aq_hw_s *self,
                                                 self, fw.val,
                                                 sw.tid == fw.tid,
                                                 1000U, 100000U);
+               if (err < 0)
+                       goto err_exit;
+
+               err = aq_hw_err_from_flags(self);
+               if (err < 0)
+                       goto err_exit;
  
                 if (fw.len == 0xFFFFU) {
                         err = hw_atl_utils_fw_rpc_call(self, sw.len);
@@ -1025,6 +1035,11 @@ static u32 hw_atl_utils_rpc_state_get(struct aq_hw_s *self)
         return aq_hw_read_reg(self, HW_ATL_RPC_STATE_ADR);
  }
  
+static u32 aq_fw1x_rpc_get(struct aq_hw_s *self)
+{
+       return aq_hw_read_reg(self, HW_ATL_MPI_RPC_ADDR);
+}
+
  const struct aq_fw_ops aq_fw_1x_ops = {
         .init = hw_atl_utils_mpi_create,
         .deinit = hw_atl_fw1x_deinit,
diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils_fw2x.c b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils_fw2x.c
index 97ebf849695fdb0b9cc386c535b3ff34e7fba3c4..77a4ed64830fd1e1c04814f1b18aecf99f0f0dd3 100644
--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils_fw2x.c
+++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils_fw2x.c
@@ -30,6 +30,9 @@
  #define HW_ATL_FW3X_EXT_CONTROL_ADDR     0x378
  #define HW_ATL_FW3X_EXT_STATE_ADDR       0x37c
  
+#define HW_ATL_FW3X_PTP_ADJ_LSW_ADDR    0x50a0
+#define HW_ATL_FW3X_PTP_ADJ_MSW_ADDR    0x50a4
+
  #define HW_ATL_FW2X_CAP_PAUSE            BIT(CAPS_HI_PAUSE)
  #define HW_ATL_FW2X_CAP_ASYM_PAUSE       BIT(CAPS_HI_ASYMMETRIC_PAUSE)
  #define HW_ATL_FW2X_CAP_SLEEP_PROXY      BIT(CAPS_HI_SLEEP_PROXY)
@@ -475,6 +478,14 @@ static void aq_fw3x_enable_ptp(struct aq_hw_s *self, int enable)
         aq_hw_write_reg(self, HW_ATL_FW3X_EXT_CONTROL_ADDR, ptp_opts);
  }
  
+static void aq_fw3x_adjust_ptp(struct aq_hw_s *self, uint64_t adj)
+{
+       aq_hw_write_reg(self, HW_ATL_FW3X_PTP_ADJ_LSW_ADDR,
+                       (adj >>  0) & 0xffffffff);
+       aq_hw_write_reg(self, HW_ATL_FW3X_PTP_ADJ_MSW_ADDR,
+                       (adj >> 32) & 0xffffffff);
+}
+
  static int aq_fw2x_led_control(struct aq_hw_s *self, u32 mode)
  {
         if (self->fw_ver_actual < HW_ATL_FW_VER_LED)
@@ -633,4 +644,5 @@ const struct aq_fw_ops aq_fw_2x_ops = {
         .enable_ptp         = aq_fw3x_enable_ptp,
         .led_control        = aq_fw2x_led_control,
         .set_phyloopback    = aq_fw2x_set_phyloopback,
+       .adjust_ptp         = aq_fw3x_adjust_ptp,
  };
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 597e6fd5bfea8344d21efc5da7918b3291ba3601..f9a8151f092c726dc553ec60459c6ba941e4a6e6 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -11252,7 +11252,7 @@ static void bnxt_cfg_ntp_filters(struct bnxt *bp)
                 }
         }
         if (test_and_clear_bit(BNXT_HWRM_PF_UNLOAD_SP_EVENT, &bp->sp_event))
-               netdev_info(bp->dev, "Receive PF driver unload event!");
+               netdev_info(bp->dev, "Receive PF driver unload event!\n");
  }
  
  #else
@@ -11759,7 +11759,7 @@ static int bnxt_pcie_dsn_get(struct bnxt *bp, u8 dsn[])
         u32 dw;
  
         if (!pos) {
-               netdev_info(bp->dev, "Unable do read adapter's DSN");
+               netdev_info(bp->dev, "Unable do read adapter's DSN\n");
                 return -EOPNOTSUPP;
         }
  
@@ -11786,6 +11786,14 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
         if (version_printed++ == 0)
                 pr_info("%s", version);
  
+       /* Clear any pending DMA transactions from crash kernel
+        * while loading driver in capture kernel.
+        */
+       if (is_kdump_kernel()) {
+               pci_clear_master(pdev);
+               pcie_flr(pdev);
+       }
+
         max_irqs = bnxt_get_max_irq(pdev);
         dev = alloc_etherdev_mq(sizeof(*bp), max_irqs);
         if (!dev)
@@ -11983,10 +11991,10 @@ static void bnxt_shutdown(struct pci_dev *pdev)
                 dev_close(dev);
  
         bnxt_ulp_shutdown(bp);
+       bnxt_clear_int_mode(bp);
+       pci_disable_device(pdev);
  
         if (system_state == SYSTEM_POWER_OFF) {
-               bnxt_clear_int_mode(bp);
-               pci_disable_device(pdev);
                 pci_wake_from_d3(pdev, bp->wol);
                 pci_set_power_state(pdev, PCI_D3hot);
         }
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
index eec0168330b757af7e598f1608342c2161bfbc2e..d3c93ccee86ad9570cbfbdc3762b47a7ff903a26 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_devlink.c
@@ -641,14 +641,14 @@ static int bnxt_dl_params_register(struct bnxt *bp)
         rc = devlink_params_register(bp->dl, bnxt_dl_params,
                                      ARRAY_SIZE(bnxt_dl_params));
         if (rc) {
-               netdev_warn(bp->dev, "devlink_params_register failed. rc=%d",
+               netdev_warn(bp->dev, "devlink_params_register failed. rc=%d\n",
                             rc);
                 return rc;
         }
         rc = devlink_port_params_register(&bp->dl_port, bnxt_dl_port_params,
                                           ARRAY_SIZE(bnxt_dl_port_params));
         if (rc) {
-               netdev_err(bp->dev, "devlink_port_params_register failed");
+               netdev_err(bp->dev, "devlink_port_params_register failed\n");
                 devlink_params_unregister(bp->dl, bnxt_dl_params,
                                           ARRAY_SIZE(bnxt_dl_params));
                 return rc;
@@ -679,7 +679,7 @@ int bnxt_dl_register(struct bnxt *bp)
         else
                 dl = devlink_alloc(&bnxt_vf_dl_ops, sizeof(struct bnxt_dl));
         if (!dl) {
-               netdev_warn(bp->dev, "devlink_alloc failed");
+               netdev_warn(bp->dev, "devlink_alloc failed\n");
                 return -ENOMEM;
         }
  
@@ -692,7 +692,7 @@ int bnxt_dl_register(struct bnxt *bp)
  
         rc = devlink_register(dl, &bp->pdev->dev);
         if (rc) {
-               netdev_warn(bp->dev, "devlink_register failed. rc=%d", rc);
+               netdev_warn(bp->dev, "devlink_register failed. rc=%d\n", rc);
                 goto err_dl_free;
         }
  
@@ -704,7 +704,7 @@ int bnxt_dl_register(struct bnxt *bp)
                                sizeof(bp->dsn));
         rc = devlink_port_register(dl, &bp->dl_port, bp->pf.port_id);
         if (rc) {
-               netdev_err(bp->dev, "devlink_port_register failed");
+               netdev_err(bp->dev, "devlink_port_register failed\n");
                 goto err_dl_unreg;
         }
  
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index 6171fa8b3677b4097d3a10278dced2693b5a340b..e8fc1671c5815e761f80f70a80cd911c115e80ae 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -2028,7 +2028,7 @@ int bnxt_flash_package_from_file(struct net_device *dev, const char *filename,
         }
  
         if (fw->size > item_len) {
-               netdev_err(dev, "PKG insufficient update area in nvram: %lu",
+               netdev_err(dev, "PKG insufficient update area in nvram: %lu\n",
                            (unsigned long)fw->size);
                 rc = -EFBIG;
         } else {
@@ -3338,7 +3338,7 @@ err:
         kfree(coredump.data);
         *dump_len += sizeof(struct bnxt_coredump_record);
         if (rc == -ENOBUFS)
-               netdev_err(bp->dev, "Firmware returned large coredump buffer");
+               netdev_err(bp->dev, "Firmware returned large coredump buffer\n");
         return rc;
  }
  
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
index 0cc6ec51f45fe2ab028bb3c0299dbef0fd3bf088..9bec256b0934afd7285a45432d3fc461246216cb 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
@@ -50,7 +50,7 @@ static u16 bnxt_flow_get_dst_fid(struct bnxt *pf_bp, struct net_device *dev)
  
         /* check if dev belongs to the same switch */
         if (!netdev_port_same_parent_id(pf_bp->dev, dev)) {
-               netdev_info(pf_bp->dev, "dev(ifindex=%d) not on same switch",
+               netdev_info(pf_bp->dev, "dev(ifindex=%d) not on same switch\n",
                             dev->ifindex);
                 return BNXT_FID_INVALID;
         }
@@ -70,7 +70,7 @@ static int bnxt_tc_parse_redir(struct bnxt *bp,
         struct net_device *dev = act->dev;
  
         if (!dev) {
-               netdev_info(bp->dev, "no dev in mirred action");
+               netdev_info(bp->dev, "no dev in mirred action\n");
                 return -EINVAL;
         }
  
@@ -106,7 +106,7 @@ static int bnxt_tc_parse_tunnel_set(struct bnxt *bp,
         const struct ip_tunnel_key *tun_key = &tun_info->key;
  
         if (ip_tunnel_info_af(tun_info) != AF_INET) {
-               netdev_info(bp->dev, "only IPv4 tunnel-encap is supported");
+               netdev_info(bp->dev, "only IPv4 tunnel-encap is supported\n");
                 return -EOPNOTSUPP;
         }
  
@@ -295,7 +295,7 @@ static int bnxt_tc_parse_actions(struct bnxt *bp,
         int i, rc;
  
         if (!flow_action_has_entries(flow_action)) {
-               netdev_info(bp->dev, "no actions");
+               netdev_info(bp->dev, "no actions\n");
                 return -EINVAL;
         }
  
@@ -370,7 +370,7 @@ static int bnxt_tc_parse_flow(struct bnxt *bp,
         /* KEY_CONTROL and KEY_BASIC are needed for forming a meaningful key */
         if ((dissector->used_keys & BIT(FLOW_DISSECTOR_KEY_CONTROL)) == 0 ||
             (dissector->used_keys & BIT(FLOW_DISSECTOR_KEY_BASIC)) == 0) {
-               netdev_info(bp->dev, "cannot form TC key: used_keys = 0x%x",
+               netdev_info(bp->dev, "cannot form TC key: used_keys = 0x%x\n",
                             dissector->used_keys);
                 return -EOPNOTSUPP;
         }
@@ -508,7 +508,7 @@ static int bnxt_hwrm_cfa_flow_free(struct bnxt *bp,
  
         rc = hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
         if (rc)
-               netdev_info(bp->dev, "%s: Error rc=%d", __func__, rc);
+               netdev_info(bp->dev, "%s: Error rc=%d\n", __func__, rc);
  
         return rc;
  }
@@ -841,7 +841,7 @@ static int hwrm_cfa_decap_filter_alloc(struct bnxt *bp,
                 resp = bnxt_get_hwrm_resp_addr(bp, &req);
                 *decap_filter_handle = resp->decap_filter_id;
         } else {
-               netdev_info(bp->dev, "%s: Error rc=%d", __func__, rc);
+               netdev_info(bp->dev, "%s: Error rc=%d\n", __func__, rc);
         }
         mutex_unlock(&bp->hwrm_cmd_lock);
  
@@ -859,7 +859,7 @@ static int hwrm_cfa_decap_filter_free(struct bnxt *bp,
  
         rc = hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
         if (rc)
-               netdev_info(bp->dev, "%s: Error rc=%d", __func__, rc);
+               netdev_info(bp->dev, "%s: Error rc=%d\n", __func__, rc);
  
         return rc;
  }
@@ -906,7 +906,7 @@ static int hwrm_cfa_encap_record_alloc(struct bnxt *bp,
                 resp = bnxt_get_hwrm_resp_addr(bp, &req);
                 *encap_record_handle = resp->encap_record_id;
         } else {
-               netdev_info(bp->dev, "%s: Error rc=%d", __func__, rc);
+               netdev_info(bp->dev, "%s: Error rc=%d\n", __func__, rc);
         }
         mutex_unlock(&bp->hwrm_cmd_lock);
  
@@ -924,7 +924,7 @@ static int hwrm_cfa_encap_record_free(struct bnxt *bp,
  
         rc = hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
         if (rc)
-               netdev_info(bp->dev, "%s: Error rc=%d", __func__, rc);
+               netdev_info(bp->dev, "%s: Error rc=%d\n", __func__, rc);
  
         return rc;
  }
@@ -943,7 +943,7 @@ static int bnxt_tc_put_l2_node(struct bnxt *bp,
                                              tc_info->l2_ht_params);
                 if (rc)
                         netdev_err(bp->dev,
-                                  "Error: %s: rhashtable_remove_fast: %d",
+                                  "Error: %s: rhashtable_remove_fast: %d\n",
                                    __func__, rc);
                 kfree_rcu(l2_node, rcu);
         }
@@ -972,7 +972,7 @@ bnxt_tc_get_l2_node(struct bnxt *bp, struct rhashtable *l2_table,
                 if (rc) {
                         kfree_rcu(l2_node, rcu);
                         netdev_err(bp->dev,
-                                  "Error: %s: rhashtable_insert_fast: %d",
+                                  "Error: %s: rhashtable_insert_fast: %d\n",
                                    __func__, rc);
                         return NULL;
                 }
@@ -1031,7 +1031,7 @@ static bool bnxt_tc_can_offload(struct bnxt *bp, struct bnxt_tc_flow *flow)
         if ((flow->flags & BNXT_TC_FLOW_FLAGS_PORTS) &&
             (flow->l4_key.ip_proto != IPPROTO_TCP &&
              flow->l4_key.ip_proto != IPPROTO_UDP)) {
-               netdev_info(bp->dev, "Cannot offload non-TCP/UDP (%d) ports",
+               netdev_info(bp->dev, "Cannot offload non-TCP/UDP (%d) ports\n",
                             flow->l4_key.ip_proto);
                 return false;
         }
@@ -1088,7 +1088,7 @@ static int bnxt_tc_put_tunnel_node(struct bnxt *bp,
                 rc =  rhashtable_remove_fast(tunnel_table, &tunnel_node->node,
                                              *ht_params);
                 if (rc) {
-                       netdev_err(bp->dev, "rhashtable_remove_fast rc=%d", rc);
+                       netdev_err(bp->dev, "rhashtable_remove_fast rc=%d\n", rc);
                         rc = -1;
                 }
                 kfree_rcu(tunnel_node, rcu);
@@ -1129,7 +1129,7 @@ bnxt_tc_get_tunnel_node(struct bnxt *bp, struct rhashtable *tunnel_table,
         tunnel_node->refcount++;
         return tunnel_node;
  err:
-       netdev_info(bp->dev, "error rc=%d", rc);
+       netdev_info(bp->dev, "error rc=%d\n", rc);
         return NULL;
  }
  
@@ -1187,7 +1187,7 @@ static void bnxt_tc_put_decap_l2_node(struct bnxt *bp,
                                              &decap_l2_node->node,
                                              tc_info->decap_l2_ht_params);
                 if (rc)
-                       netdev_err(bp->dev, "rhashtable_remove_fast rc=%d", rc);
+                       netdev_err(bp->dev, "rhashtable_remove_fast rc=%d\n", rc);
                 kfree_rcu(decap_l2_node, rcu);
         }
  }
@@ -1227,7 +1227,7 @@ static int bnxt_tc_resolve_tunnel_hdrs(struct bnxt *bp,
  
         rt = ip_route_output_key(dev_net(real_dst_dev), &flow);
         if (IS_ERR(rt)) {
-               netdev_info(bp->dev, "no route to %pI4b", &flow.daddr);
+               netdev_info(bp->dev, "no route to %pI4b\n", &flow.daddr);
                 return -EOPNOTSUPP;
         }
  
@@ -1241,7 +1241,7 @@ static int bnxt_tc_resolve_tunnel_hdrs(struct bnxt *bp,
  
                 if (vlan->real_dev != real_dst_dev) {
                         netdev_info(bp->dev,
-                                   "dst_dev(%s) doesn't use PF-if(%s)",
+                                   "dst_dev(%s) doesn't use PF-if(%s)\n",
                                     netdev_name(dst_dev),
                                     netdev_name(real_dst_dev));
                         rc = -EOPNOTSUPP;
@@ -1253,7 +1253,7 @@ static int bnxt_tc_resolve_tunnel_hdrs(struct bnxt *bp,
  #endif
         } else if (dst_dev != real_dst_dev) {
                 netdev_info(bp->dev,
-                           "dst_dev(%s) for %pI4b is not PF-if(%s)",
+                           "dst_dev(%s) for %pI4b is not PF-if(%s)\n",
                             netdev_name(dst_dev), &flow.daddr,
                             netdev_name(real_dst_dev));
                 rc = -EOPNOTSUPP;
@@ -1262,7 +1262,7 @@ static int bnxt_tc_resolve_tunnel_hdrs(struct bnxt *bp,
  
         nbr = dst_neigh_lookup(&rt->dst, &flow.daddr);
         if (!nbr) {
-               netdev_info(bp->dev, "can't lookup neighbor for %pI4b",
+               netdev_info(bp->dev, "can't lookup neighbor for %pI4b\n",
                             &flow.daddr);
                 rc = -EOPNOTSUPP;
                 goto put_rt;
@@ -1472,7 +1472,7 @@ static int __bnxt_tc_del_flow(struct bnxt *bp,
         rc = rhashtable_remove_fast(&tc_info->flow_table, &flow_node->node,
                                     tc_info->flow_ht_params);
         if (rc)
-               netdev_err(bp->dev, "Error: %s: rhashtable_remove_fast rc=%d",
+               netdev_err(bp->dev, "Error: %s: rhashtable_remove_fast rc=%d\n",
                            __func__, rc);
  
         kfree_rcu(flow_node, rcu);
@@ -1587,7 +1587,7 @@ unlock:
  free_node:
         kfree_rcu(new_node, rcu);
  done:
-       netdev_err(bp->dev, "Error: %s: cookie=0x%lx error=%d",
+       netdev_err(bp->dev, "Error: %s: cookie=0x%lx error=%d\n",
                    __func__, tc_flow_cmd->cookie, rc);
         return rc;
  }
@@ -1700,7 +1700,7 @@ bnxt_hwrm_cfa_flow_stats_get(struct bnxt *bp, int num_flows,
                                                 le64_to_cpu(resp_bytes[i]);
                 }
         } else {
-               netdev_info(bp->dev, "error rc=%d", rc);
+               netdev_info(bp->dev, "error rc=%d\n", rc);
         }
         mutex_unlock(&bp->hwrm_cmd_lock);
  
@@ -1970,7 +1970,7 @@ static int bnxt_tc_indr_block_event(struct notifier_block *nb,
                                                    bp);
                 if (rc)
                         netdev_info(bp->dev,
-                                   "Failed to register indirect blk: dev: %s",
+                                   "Failed to register indirect blk: dev: %s\n",
                                     netdev->name);
                 break;
         case NETDEV_UNREGISTER:
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c
index b010b34cdaf835fdf23eea5ffceaddce4386ec98..6f2faf81c1aead78822e81207c1213639971c248 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c
@@ -43,7 +43,7 @@ static int hwrm_cfa_vfr_alloc(struct bnxt *bp, u16 vf_idx,
                 netdev_dbg(bp->dev, "tx_cfa_action=0x%x, rx_cfa_code=0x%x",
                            *tx_cfa_action, *rx_cfa_code);
         } else {
-               netdev_info(bp->dev, "%s error rc=%d", __func__, rc);
+               netdev_info(bp->dev, "%s error rc=%d\n", __func__, rc);
         }
  
         mutex_unlock(&bp->hwrm_cmd_lock);
@@ -60,7 +60,7 @@ static int hwrm_cfa_vfr_free(struct bnxt *bp, u16 vf_idx)
  
         rc = hwrm_send_message(bp, &req, sizeof(req), HWRM_CMD_TIMEOUT);
         if (rc)
-               netdev_info(bp->dev, "%s error rc=%d", __func__, rc);
+               netdev_info(bp->dev, "%s error rc=%d\n", __func__, rc);
         return rc;
  }
  
@@ -465,7 +465,7 @@ static int bnxt_vf_reps_create(struct bnxt *bp)
         return 0;
  
  err:
-       netdev_info(bp->dev, "%s error=%d", __func__, rc);
+       netdev_info(bp->dev, "%s error=%d\n", __func__, rc);
         kfree(cfa_code_map);
         __bnxt_vf_reps_destroy(bp);
         return rc;
@@ -488,7 +488,7 @@ int bnxt_dl_eswitch_mode_set(struct devlink *devlink, u16 mode,
  
         mutex_lock(&bp->sriov_lock);
         if (bp->eswitch_mode == mode) {
-               netdev_info(bp->dev, "already in %s eswitch mode",
+               netdev_info(bp->dev, "already in %s eswitch mode\n",
                             mode == DEVLINK_ESWITCH_MODE_LEGACY ?
                             "legacy" : "switchdev");
                 rc = -EINVAL;
@@ -508,7 +508,7 @@ int bnxt_dl_eswitch_mode_set(struct devlink *devlink, u16 mode,
                 }
  
                 if (pci_num_vf(bp->pdev) == 0) {
-                       netdev_info(bp->dev, "Enable VFs before setting switchdev mode");
+                       netdev_info(bp->dev, "Enable VFs before setting switchdev mode\n");
                         rc = -EPERM;
                         goto done;
                 }
diff --git a/drivers/net/ethernet/broadcom/cnic_defs.h b/drivers/net/ethernet/broadcom/cnic_defs.h
index b384997740717996485ed2ab6f8c882bba438d8d..99e2c6d4d8c3ad1cdd3f956851a99944789cb84c 100644
--- a/drivers/net/ethernet/broadcom/cnic_defs.h
+++ b/drivers/net/ethernet/broadcom/cnic_defs.h
@@ -543,13 +543,13 @@ struct l4_kwq_update_pg {
  #define L4_KWQ_UPDATE_PG_RESERVERD2_SHIFT 2
  #endif
  #if defined(__BIG_ENDIAN)
-       u16 reserverd3;
+       u16 reserved3;
         u8 da0;
         u8 da1;
  #elif defined(__LITTLE_ENDIAN)
         u8 da1;
         u8 da0;
-       u16 reserverd3;
+       u16 reserved3;
  #endif
  #if defined(__BIG_ENDIAN)
         u8 da2;
diff --git a/drivers/net/ethernet/broadcom/genet/bcmmii.c b/drivers/net/ethernet/broadcom/genet/bcmmii.c
index 6392a25301838f849241dc53448db5b476b224f8..10244941a7a604fc51f2d3182034b4a93e681ee2 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmmii.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmmii.c
@@ -294,6 +294,7 @@ int bcmgenet_mii_config(struct net_device *dev, bool init)
          */
         if (priv->ext_phy) {
                 reg = bcmgenet_ext_readl(priv, EXT_RGMII_OOB_CTRL);
+               reg &= ~ID_MODE_DIS;
                 reg |= id_mode_dis;
                 if (GENET_IS_V1(priv) || GENET_IS_V2(priv) || GENET_IS_V3(priv))
                         reg |= RGMII_MODE_EN_V123;
diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
index dbf7070fcdba26b66c54608d0e7f1bd673a3deae..a3f0f27fc79a1c3ebd6664afced137221eb257c0 100644
--- a/drivers/net/ethernet/cadence/macb.h
+++ b/drivers/net/ethernet/cadence/macb.h
@@ -652,6 +652,7 @@
  #define MACB_CAPS_GEM_HAS_PTP                  0x00000040
  #define MACB_CAPS_BD_RD_PREFETCH               0x00000080
  #define MACB_CAPS_NEEDS_RSTONUBR               0x00000100
+#define MACB_CAPS_MACB_IS_EMAC                 0x08000000
  #define MACB_CAPS_FIFO_MODE                    0x10000000
  #define MACB_CAPS_GIGABIT_MODE_AVAILABLE       0x20000000
  #define MACB_CAPS_SG_DISABLED                  0x40000000
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index 4508f0d150da95d8e838d0ead806e8ef74794cc5..2c28da1737fe4ec2c0d1579d71bb454a3cabe289 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -572,8 +572,21 @@ static void macb_mac_config(struct phylink_config *config, unsigned int mode,
         old_ctrl = ctrl = macb_or_gem_readl(bp, NCFGR);
  
         /* Clear all the bits we might set later */
-       ctrl &= ~(GEM_BIT(GBE) | MACB_BIT(SPD) | MACB_BIT(FD) | MACB_BIT(PAE) |
-                 GEM_BIT(SGMIIEN) | GEM_BIT(PCSSEL));
+       ctrl &= ~(MACB_BIT(SPD) | MACB_BIT(FD) | MACB_BIT(PAE));
+
+       if (bp->caps & MACB_CAPS_MACB_IS_EMAC) {
+               if (state->interface == PHY_INTERFACE_MODE_RMII)
+                       ctrl |= MACB_BIT(RM9200_RMII);
+       } else {
+               ctrl &= ~(GEM_BIT(GBE) | GEM_BIT(SGMIIEN) | GEM_BIT(PCSSEL));
+
+               /* We do not support MLO_PAUSE_RX yet */
+               if (state->pause & MLO_PAUSE_TX)
+                       ctrl |= MACB_BIT(PAE);
+
+               if (state->interface == PHY_INTERFACE_MODE_SGMII)
+                       ctrl |= GEM_BIT(SGMIIEN) | GEM_BIT(PCSSEL);
+       }
  
         if (state->speed == SPEED_1000)
                 ctrl |= GEM_BIT(GBE);
@@ -583,13 +596,6 @@ static void macb_mac_config(struct phylink_config *config, unsigned int mode,
         if (state->duplex)
                 ctrl |= MACB_BIT(FD);
  
-       /* We do not support MLO_PAUSE_RX yet */
-       if (state->pause & MLO_PAUSE_TX)
-               ctrl |= MACB_BIT(PAE);
-
-       if (state->interface == PHY_INTERFACE_MODE_SGMII)
-               ctrl |= GEM_BIT(SGMIIEN) | GEM_BIT(PCSSEL);
-
         /* Apply the new configuration, if any */
         if (old_ctrl ^ ctrl)
                 macb_or_gem_writel(bp, NCFGR, ctrl);
@@ -608,9 +614,10 @@ static void macb_mac_link_down(struct phylink_config *config, unsigned int mode,
         unsigned int q;
         u32 ctrl;
  
-       for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue)
-               queue_writel(queue, IDR,
-                            bp->rx_intr_mask | MACB_TX_INT_FLAGS | MACB_BIT(HRESP));
+       if (!(bp->caps & MACB_CAPS_MACB_IS_EMAC))
+               for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue)
+                       queue_writel(queue, IDR,
+                                    bp->rx_intr_mask | MACB_TX_INT_FLAGS | MACB_BIT(HRESP));
  
         /* Disable Rx and Tx */
         ctrl = macb_readl(bp, NCR) & ~(MACB_BIT(RE) | MACB_BIT(TE));
@@ -627,17 +634,19 @@ static void macb_mac_link_up(struct phylink_config *config, unsigned int mode,
         struct macb_queue *queue;
         unsigned int q;
  
-       macb_set_tx_clk(bp->tx_clk, bp->speed, ndev);
+       if (!(bp->caps & MACB_CAPS_MACB_IS_EMAC)) {
+               macb_set_tx_clk(bp->tx_clk, bp->speed, ndev);
  
-       /* Initialize rings & buffers as clearing MACB_BIT(TE) in link down
-        * cleared the pipeline and control registers.
-        */
-       bp->macbgem_ops.mog_init_rings(bp);
-       macb_init_buffers(bp);
+               /* Initialize rings & buffers as clearing MACB_BIT(TE) in link down
+                * cleared the pipeline and control registers.
+                */
+               bp->macbgem_ops.mog_init_rings(bp);
+               macb_init_buffers(bp);
  
-       for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue)
-               queue_writel(queue, IER,
-                            bp->rx_intr_mask | MACB_TX_INT_FLAGS | MACB_BIT(HRESP));
+               for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue)
+                       queue_writel(queue, IER,
+                                    bp->rx_intr_mask | MACB_TX_INT_FLAGS | MACB_BIT(HRESP));
+       }
  
         /* Enable Rx and Tx */
         macb_writel(bp, NCR, macb_readl(bp, NCR) | MACB_BIT(RE) | MACB_BIT(TE));
@@ -3790,6 +3799,10 @@ static int at91ether_open(struct net_device *dev)
         u32 ctl;
         int ret;
  
+       ret = pm_runtime_get_sync(&lp->pdev->dev);
+       if (ret < 0)
+               return ret;
+
         /* Clear internal statistics */
         ctl = macb_readl(lp, NCR);
         macb_writel(lp, NCR, ctl | MACB_BIT(CLRSTAT));
@@ -3854,7 +3867,7 @@ static int at91ether_close(struct net_device *dev)
                           q->rx_buffers, q->rx_buffers_dma);
         q->rx_buffers = NULL;
  
-       return 0;
+       return pm_runtime_put(&lp->pdev->dev);
  }
  
  /* Transmit packet */
@@ -4037,7 +4050,6 @@ static int at91ether_init(struct platform_device *pdev)
         struct net_device *dev = platform_get_drvdata(pdev);
         struct macb *bp = netdev_priv(dev);
         int err;
-       u32 reg;
  
         bp->queues[0].bp = bp;
  
@@ -4051,11 +4063,7 @@ static int at91ether_init(struct platform_device *pdev)
  
         macb_writel(bp, NCR, 0);
  
-       reg = MACB_BF(CLK, MACB_CLK_DIV32) | MACB_BIT(BIG);
-       if (bp->phy_interface == PHY_INTERFACE_MODE_RMII)
-               reg |= MACB_BIT(RM9200_RMII);
-
-       macb_writel(bp, NCFGR, reg);
+       macb_writel(bp, NCFGR, MACB_BF(CLK, MACB_CLK_DIV32) | MACB_BIT(BIG));
  
         return 0;
  }
@@ -4214,7 +4222,7 @@ static const struct macb_config sama5d4_config = {
  };
  
  static const struct macb_config emac_config = {
-       .caps = MACB_CAPS_NEEDS_RSTONUBR,
+       .caps = MACB_CAPS_NEEDS_RSTONUBR | MACB_CAPS_MACB_IS_EMAC,
         .clk_init = at91ether_clk_init,
         .init = at91ether_init,
  };
diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
index 17a4110c2e4935d74e8f677ab0405fd73cec0593..8ff28ed04b7fcd0616760f91d727898d98f01f3e 100644
--- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
+++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.c
@@ -410,10 +410,19 @@ void bgx_lmac_rx_tx_enable(int node, int bgx_idx, int lmacid, bool enable)
         lmac = &bgx->lmac[lmacid];
  
         cfg = bgx_reg_read(bgx, lmacid, BGX_CMRX_CFG);
-       if (enable)
+       if (enable) {
                 cfg |= CMR_PKT_RX_EN | CMR_PKT_TX_EN;
-       else
+
+               /* enable TX FIFO Underflow interrupt */
+               bgx_reg_modify(bgx, lmacid, BGX_GMP_GMI_TXX_INT_ENA_W1S,
+                              GMI_TXX_INT_UNDFLW);
+       } else {
                 cfg &= ~(CMR_PKT_RX_EN | CMR_PKT_TX_EN);
+
+               /* Disable TX FIFO Underflow interrupt */
+               bgx_reg_modify(bgx, lmacid, BGX_GMP_GMI_TXX_INT_ENA_W1C,
+                              GMI_TXX_INT_UNDFLW);
+       }
         bgx_reg_write(bgx, lmacid, BGX_CMRX_CFG, cfg);
  
         if (bgx->is_rgx)
@@ -1535,6 +1544,48 @@ static int bgx_init_phy(struct bgx *bgx)
         return bgx_init_of_phy(bgx);
  }
  
+static irqreturn_t bgx_intr_handler(int irq, void *data)
+{
+       struct bgx *bgx = (struct bgx *)data;
+       u64 status, val;
+       int lmac;
+
+       for (lmac = 0; lmac < bgx->lmac_count; lmac++) {
+               status = bgx_reg_read(bgx, lmac, BGX_GMP_GMI_TXX_INT);
+               if (status & GMI_TXX_INT_UNDFLW) {
+                       pci_err(bgx->pdev, "BGX%d lmac%d UNDFLW\n",
+                               bgx->bgx_id, lmac);
+                       val = bgx_reg_read(bgx, lmac, BGX_CMRX_CFG);
+                       val &= ~CMR_EN;
+                       bgx_reg_write(bgx, lmac, BGX_CMRX_CFG, val);
+                       val |= CMR_EN;
+                       bgx_reg_write(bgx, lmac, BGX_CMRX_CFG, val);
+               }
+               /* clear interrupts */
+               bgx_reg_write(bgx, lmac, BGX_GMP_GMI_TXX_INT, status);
+       }
+
+       return IRQ_HANDLED;
+}
+
+static void bgx_register_intr(struct pci_dev *pdev)
+{
+       struct bgx *bgx = pci_get_drvdata(pdev);
+       int ret;
+
+       ret = pci_alloc_irq_vectors(pdev, BGX_LMAC_VEC_OFFSET,
+                                   BGX_LMAC_VEC_OFFSET, PCI_IRQ_ALL_TYPES);
+       if (ret < 0) {
+               pci_err(pdev, "Req for #%d msix vectors failed\n",
+                       BGX_LMAC_VEC_OFFSET);
+               return;
+       }
+       ret = pci_request_irq(pdev, GMPX_GMI_TX_INT, bgx_intr_handler, NULL,
+                             bgx, "BGX%d", bgx->bgx_id);
+       if (ret)
+               pci_free_irq(pdev, GMPX_GMI_TX_INT, bgx);
+}
+
  static int bgx_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
  {
         int err;
@@ -1550,7 +1601,7 @@ static int bgx_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
  
         pci_set_drvdata(pdev, bgx);
  
-       err = pci_enable_device(pdev);
+       err = pcim_enable_device(pdev);
         if (err) {
                 dev_err(dev, "Failed to enable PCI device\n");
                 pci_set_drvdata(pdev, NULL);
@@ -1604,6 +1655,8 @@ static int bgx_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
  
         bgx_init_hw(bgx);
  
+       bgx_register_intr(pdev);
+
         /* Enable all LMACs */
         for (lmac = 0; lmac < bgx->lmac_count; lmac++) {
                 err = bgx_lmac_enable(bgx, lmac);
@@ -1620,6 +1673,7 @@ static int bgx_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
  
  err_enable:
         bgx_vnic[bgx->bgx_id] = NULL;
+       pci_free_irq(pdev, GMPX_GMI_TX_INT, bgx);
  err_release_regions:
         pci_release_regions(pdev);
  err_disable_device:
@@ -1637,6 +1691,8 @@ static void bgx_remove(struct pci_dev *pdev)
         for (lmac = 0; lmac < bgx->lmac_count; lmac++)
                 bgx_lmac_disable(bgx, lmac);
  
+       pci_free_irq(pdev, GMPX_GMI_TX_INT, bgx);
+
         bgx_vnic[bgx->bgx_id] = NULL;
         pci_release_regions(pdev);
         pci_disable_device(pdev);
diff --git a/drivers/net/ethernet/cavium/thunder/thunder_bgx.h b/drivers/net/ethernet/cavium/thunder/thunder_bgx.h
index 25888706bdcd1115e2824c88df23aff273b62171..cdea4939218578cd8424ad632702ce4e00104130 100644
--- a/drivers/net/ethernet/cavium/thunder/thunder_bgx.h
+++ b/drivers/net/ethernet/cavium/thunder/thunder_bgx.h
@@ -180,6 +180,15 @@
  #define BGX_GMP_GMI_TXX_BURST          0x38228
  #define BGX_GMP_GMI_TXX_MIN_PKT                0x38240
  #define BGX_GMP_GMI_TXX_SGMII_CTL      0x38300
+#define BGX_GMP_GMI_TXX_INT            0x38500
+#define BGX_GMP_GMI_TXX_INT_W1S                0x38508
+#define BGX_GMP_GMI_TXX_INT_ENA_W1C    0x38510
+#define BGX_GMP_GMI_TXX_INT_ENA_W1S    0x38518
+#define  GMI_TXX_INT_PTP_LOST                  BIT_ULL(4)
+#define  GMI_TXX_INT_LATE_COL                  BIT_ULL(3)
+#define  GMI_TXX_INT_XSDEF                     BIT_ULL(2)
+#define  GMI_TXX_INT_XSCOL                     BIT_ULL(1)
+#define  GMI_TXX_INT_UNDFLW                    BIT_ULL(0)
  
  #define BGX_MSIX_VEC_0_29_ADDR         0x400000 /* +(0..29) << 4 */
  #define BGX_MSIX_VEC_0_29_CTL          0x400008
diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c b/drivers/net/ethernet/cisco/enic/enic_main.c
index bbd7b3175f09ed35094487c3b055f3c38e08a1ce..ddf60dc9ad167d913f9d2c454aacc2da1779e947 100644
--- a/drivers/net/ethernet/cisco/enic/enic_main.c
+++ b/drivers/net/ethernet/cisco/enic/enic_main.c
@@ -2013,10 +2013,10 @@ static int enic_stop(struct net_device *netdev)
                 napi_disable(&enic->napi[i]);
  
         netif_carrier_off(netdev);
-       netif_tx_disable(netdev);
         if (vnic_dev_get_intr_mode(enic->vdev) == VNIC_DEV_INTR_MODE_MSIX)
                 for (i = 0; i < enic->wq_count; i++)
                         napi_disable(&enic->napi[enic_cq_wq(enic, i)]);
+       netif_tx_disable(netdev);
  
         if (!enic_is_dynamic(enic) && !enic_is_sriov_vf(enic))
                 enic_dev_del_station_addr(enic);
diff --git a/drivers/net/ethernet/davicom/dm9000.c b/drivers/net/ethernet/davicom/dm9000.c
index 1ea3372775e6daa39c300e2624f921742cb5f29e..e94ae9b94dbfceea2c44065ad58d624676b5dd8f 100644
--- a/drivers/net/ethernet/davicom/dm9000.c
+++ b/drivers/net/ethernet/davicom/dm9000.c
@@ -1405,6 +1405,8 @@ static struct dm9000_plat_data *dm9000_parse_dt(struct device *dev)
         mac_addr = of_get_mac_address(np);
         if (!IS_ERR(mac_addr))
                 ether_addr_copy(pdata->dev_addr, mac_addr);
+       else if (PTR_ERR(mac_addr) == -EPROBE_DEFER)
+               return ERR_CAST(mac_addr);
  
         return pdata;
  }
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index ec5f6eeb639b68b0b94020be857647701ce02ea8..492bc944646372bea3db8212884e9d3914dd2a5e 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -6113,6 +6113,9 @@ static int hclge_get_all_rules(struct hnae3_handle *handle,
  static void hclge_fd_get_flow_tuples(const struct flow_keys *fkeys,
                                      struct hclge_fd_rule_tuples *tuples)
  {
+#define flow_ip6_src fkeys->addrs.v6addrs.src.in6_u.u6_addr32
+#define flow_ip6_dst fkeys->addrs.v6addrs.dst.in6_u.u6_addr32
+
         tuples->ether_proto = be16_to_cpu(fkeys->basic.n_proto);
         tuples->ip_proto = fkeys->basic.ip_proto;
         tuples->dst_port = be16_to_cpu(fkeys->ports.dst);
@@ -6121,12 +6124,12 @@ static void hclge_fd_get_flow_tuples(const struct flow_keys *fkeys,
                 tuples->src_ip[3] = be32_to_cpu(fkeys->addrs.v4addrs.src);
                 tuples->dst_ip[3] = be32_to_cpu(fkeys->addrs.v4addrs.dst);
         } else {
-               memcpy(tuples->src_ip,
-                      fkeys->addrs.v6addrs.src.in6_u.u6_addr32,
-                      sizeof(tuples->src_ip));
-               memcpy(tuples->dst_ip,
-                      fkeys->addrs.v6addrs.dst.in6_u.u6_addr32,
-                      sizeof(tuples->dst_ip));
+               int i;
+
+               for (i = 0; i < IPV6_SIZE; i++) {
+                       tuples->src_ip[i] = be32_to_cpu(flow_ip6_src[i]);
+                       tuples->dst_ip[i] = be32_to_cpu(flow_ip6_dst[i]);
+               }
         }
  }
  
@@ -9834,6 +9837,13 @@ static int hclge_reset_ae_dev(struct hnae3_ae_dev *ae_dev)
                 return ret;
         }
  
+       ret = init_mgr_tbl(hdev);
+       if (ret) {
+               dev_err(&pdev->dev,
+                       "failed to reinit manager table, ret = %d\n", ret);
+               return ret;
+       }
+
         ret = hclge_init_fd_config(hdev);
         if (ret) {
                 dev_err(&pdev->dev, "fd table init fail, ret=%d\n", ret);
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c
index 180224eab1ca4a46c34e9c62061c7087cd22bfb4..28db13253a5e762ae373cfeaf4cb9c5412dd4c2f 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_tm.c
@@ -566,7 +566,7 @@ static void hclge_tm_vport_tc_info_update(struct hclge_vport *vport)
          */
         kinfo->num_tc = vport->vport_id ? 1 :
                         min_t(u16, vport->alloc_tqps, hdev->tm_info.num_tc);
-       vport->qs_offset = (vport->vport_id ? hdev->tm_info.num_tc : 0) +
+       vport->qs_offset = (vport->vport_id ? HNAE3_MAX_TC : 0) +
                                 (vport->vport_id ? (vport->vport_id - 1) : 0);
  
         max_rss_size = min_t(u16, hdev->rss_size_max,
diff --git a/drivers/net/ethernet/huawei/hinic/hinic_hw_dev.c b/drivers/net/ethernet/huawei/hinic/hinic_hw_dev.c
index 6f2cf569a283cb8f53ce653d7ab2c72fb2ca0799..79b3d53f2fbfa73d89ecd1f5490072da9503c0e9 100644
--- a/drivers/net/ethernet/huawei/hinic/hinic_hw_dev.c
+++ b/drivers/net/ethernet/huawei/hinic/hinic_hw_dev.c
@@ -297,6 +297,7 @@ static int set_hw_ioctxt(struct hinic_hwdev *hwdev, unsigned int rq_depth,
         }
  
         hw_ioctxt.func_idx = HINIC_HWIF_FUNC_IDX(hwif);
+       hw_ioctxt.ppf_idx = HINIC_HWIF_PPF_IDX(hwif);
  
         hw_ioctxt.set_cmdq_depth = HW_IOCTXT_SET_CMDQ_DEPTH_DEFAULT;
         hw_ioctxt.cmdq_depth = 0;
diff --git a/drivers/net/ethernet/huawei/hinic/hinic_hw_dev.h b/drivers/net/ethernet/huawei/hinic/hinic_hw_dev.h
index b069045de416c582660e06b5f98db3869e09478b..66fd2340d44795bc3bc56071f801e50c26d7e1ea 100644
--- a/drivers/net/ethernet/huawei/hinic/hinic_hw_dev.h
+++ b/drivers/net/ethernet/huawei/hinic/hinic_hw_dev.h
@@ -151,8 +151,8 @@ struct hinic_cmd_hw_ioctxt {
  
         u8      lro_en;
         u8      rsvd3;
+       u8      ppf_idx;
         u8      rsvd4;
-       u8      rsvd5;
  
         u16     rq_depth;
         u16     rx_buf_sz_idx;
diff --git a/drivers/net/ethernet/huawei/hinic/hinic_hw_if.h b/drivers/net/ethernet/huawei/hinic/hinic_hw_if.h

index 517794509eb295cb0217e2df4ba43c1767ccaa9a..c7bb9ceca72cac90d85a6a11966c5771e7eee486 100644
--- a/drivers/net/ethernet/huawei/hinic/hinic_hw_if.h
+++ b/drivers/net/ethernet/huawei/hinic/hinic_hw_if.h
@@ -137,6 +137,7 @@
  #define HINIC_HWIF_FUNC_IDX(hwif)       ((hwif)->attr.func_idx)
  #define HINIC_HWIF_PCI_INTF(hwif)       ((hwif)->attr.pci_intf_idx)
  #define HINIC_HWIF_PF_IDX(hwif)         ((hwif)->attr.pf_idx)
+#define HINIC_HWIF_PPF_IDX(hwif)        ((hwif)->attr.ppf_idx)
  
  #define HINIC_FUNC_TYPE(hwif)           ((hwif)->attr.func_type)
  #define HINIC_IS_PF(hwif)               (HINIC_FUNC_TYPE(hwif) == HINIC_PF)
diff --git a/drivers/net/ethernet/huawei/hinic/hinic_hw_qp.h b/drivers/net/ethernet/huawei/hinic/hinic_hw_qp.h

index f4a339b10b10b55c86da6080adf12b3248b58534..79091e1314181e3493c9e54760ca4671936337e1 100644
--- a/drivers/net/ethernet/huawei/hinic/hinic_hw_qp.h
+++ b/drivers/net/ethernet/huawei/hinic/hinic_hw_qp.h
@@ -94,6 +94,7 @@ struct hinic_rq {
  
         struct hinic_wq         *wq;
  
+       struct cpumask          affinity_mask;
         u32                     irq;
         u16                     msix_entry;
  
diff --git a/drivers/net/ethernet/huawei/hinic/hinic_main.c b/drivers/net/ethernet/huawei/hinic/hinic_main.c

index 02a14f5e7fe31ddc28fe5a3720e081d5346db653..13560975c103a29438b4e00542561a4b796abb1f 100644
--- a/drivers/net/ethernet/huawei/hinic/hinic_main.c
+++ b/drivers/net/ethernet/huawei/hinic/hinic_main.c
@@ -356,7 +356,8 @@ static void hinic_enable_rss(struct hinic_dev *nic_dev)
         if (!num_cpus)
                 num_cpus = num_online_cpus();
  
-       nic_dev->num_qps = min_t(u16, nic_dev->max_qps, num_cpus);
+       nic_dev->num_qps = hinic_hwdev_num_qps(hwdev);
+       nic_dev->num_qps = min_t(u16, nic_dev->num_qps, num_cpus);
  
         nic_dev->rss_limit = nic_dev->num_qps;
         nic_dev->num_rss = nic_dev->num_qps;
diff --git a/drivers/net/ethernet/huawei/hinic/hinic_rx.c b/drivers/net/ethernet/huawei/hinic/hinic_rx.c

index 56ea6d692f1c3dda7fd329c555d1a70696c0dd1a..2695ad69fca600c469762643ebd804f6bdfade1d 100644
--- a/drivers/net/ethernet/huawei/hinic/hinic_rx.c
+++ b/drivers/net/ethernet/huawei/hinic/hinic_rx.c
@@ -475,7 +475,6 @@ static int rx_request_irq(struct hinic_rxq *rxq)
         struct hinic_hwdev *hwdev = nic_dev->hwdev;
         struct hinic_rq *rq = rxq->rq;
         struct hinic_qp *qp;
-       struct cpumask mask;
         int err;
  
         rx_add_napi(rxq);
@@ -492,8 +491,8 @@ static int rx_request_irq(struct hinic_rxq *rxq)
         }
  
         qp = container_of(rq, struct hinic_qp, rq);
-       cpumask_set_cpu(qp->q_id % num_online_cpus(), &mask);
-       return irq_set_affinity_hint(rq->irq, &mask);
+       cpumask_set_cpu(qp->q_id % num_online_cpus(), &rq->affinity_mask);
+       return irq_set_affinity_hint(rq->irq, &rq->affinity_mask);
  }
  
  static void rx_free_irq(struct hinic_rxq *rxq)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c

index 69523ac85639ef2204223352e86c5bedb62a9fbb..56b9e445732ba500a88edcca938a1e658104a6fe 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -2362,7 +2362,7 @@ static int i40e_vc_enable_queues_msg(struct i40e_vf *vf, u8 *msg)
                 goto error_param;
         }
  
-       if (i40e_vc_validate_vqs_bitmaps(vqs)) {
+       if (!i40e_vc_validate_vqs_bitmaps(vqs)) {
                 aq_ret = I40E_ERR_PARAM;
                 goto error_param;
         }
@@ -2424,7 +2424,7 @@ static int i40e_vc_disable_queues_msg(struct i40e_vf *vf, u8 *msg)
                 goto error_param;
         }
  
-       if (i40e_vc_validate_vqs_bitmaps(vqs)) {
+       if (!i40e_vc_validate_vqs_bitmaps(vqs)) {
                 aq_ret = I40E_ERR_PARAM;
                 goto error_param;
         }
diff --git a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h

index 4459bc564b11f2bfd59831c6c881b417150cd6fe..6873998cf14547b44ef54aa926227d8f2ba7002c 100644
--- a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
+++ b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
@@ -1660,6 +1660,7 @@ struct ice_aqc_get_pkg_info_resp {
         __le32 count;
         struct ice_aqc_get_pkg_info pkg_info[1];
  };
+
  /**
   * struct ice_aq_desc - Admin Queue (AQ) descriptor
   * @flags: ICE_AQ_FLAG_* flags
diff --git a/drivers/net/ethernet/intel/ice/ice_base.c b/drivers/net/ethernet/intel/ice/ice_base.c

index d8e975cceb211322dc71f7a419245a2707f0e877..81885efadc7a6472b82da6c42f1602de76526acb 100644
--- a/drivers/net/ethernet/intel/ice/ice_base.c
+++ b/drivers/net/ethernet/intel/ice/ice_base.c
@@ -324,7 +324,7 @@ int ice_setup_rx_ctx(struct ice_ring *ring)
                         if (err)
                                 return err;
  
-                       dev_info(&vsi->back->pdev->dev, "Registered XDP mem model MEM_TYPE_ZERO_COPY on Rx ring %d\n",
+                       dev_info(ice_pf_to_dev(vsi->back), "Registered XDP mem model MEM_TYPE_ZERO_COPY on Rx ring %d\n",
                                  ring->q_index);
                 } else {
                         ring->zca.free = NULL;
@@ -405,8 +405,7 @@ int ice_setup_rx_ctx(struct ice_ring *ring)
         /* Absolute queue number out of 2K needs to be passed */
         err = ice_write_rxq_ctx(hw, &rlan_ctx, pf_q);
         if (err) {
-               dev_err(&vsi->back->pdev->dev,
-                       "Failed to set LAN Rx queue context for absolute Rx queue %d error: %d\n",
+               dev_err(ice_pf_to_dev(vsi->back), "Failed to set LAN Rx queue context for absolute Rx queue %d error: %d\n",
                         pf_q, err);
                 return -EIO;
         }
@@ -428,8 +427,7 @@ int ice_setup_rx_ctx(struct ice_ring *ring)
               ice_alloc_rx_bufs_slow_zc(ring, ICE_DESC_UNUSED(ring)) :
               ice_alloc_rx_bufs(ring, ICE_DESC_UNUSED(ring));
         if (err)
-               dev_info(&vsi->back->pdev->dev,
-                        "Failed allocate some buffers on %sRx ring %d (pf_q %d)\n",
+               dev_info(ice_pf_to_dev(vsi->back), "Failed allocate some buffers on %sRx ring %d (pf_q %d)\n",
                          ring->xsk_umem ? "UMEM enabled " : "",
                          ring->q_index, pf_q);
  
@@ -490,8 +488,7 @@ int ice_vsi_ctrl_rx_ring(struct ice_vsi *vsi, bool ena, u16 rxq_idx)
         /* wait for the change to finish */
         ret = ice_pf_rxq_wait(pf, pf_q, ena);
         if (ret)
-               dev_err(ice_pf_to_dev(pf),
-                       "VSI idx %d Rx ring %d %sable timeout\n",
+               dev_err(ice_pf_to_dev(pf), "VSI idx %d Rx ring %d %sable timeout\n",
                         vsi->idx, pf_q, (ena ? "en" : "dis"));
  
         return ret;
@@ -506,20 +503,15 @@ int ice_vsi_ctrl_rx_ring(struct ice_vsi *vsi, bool ena, u16 rxq_idx)
   */
  int ice_vsi_alloc_q_vectors(struct ice_vsi *vsi)
  {
-       struct ice_pf *pf = vsi->back;
-       int v_idx = 0, num_q_vectors;
-       struct device *dev;
-       int err;
+       struct device *dev = ice_pf_to_dev(vsi->back);
+       int v_idx, err;
  
-       dev = ice_pf_to_dev(pf);
         if (vsi->q_vectors[0]) {
                 dev_dbg(dev, "VSI %d has existing q_vectors\n", vsi->vsi_num);
                 return -EEXIST;
         }
  
-       num_q_vectors = vsi->num_q_vectors;
-
-       for (v_idx = 0; v_idx < num_q_vectors; v_idx++) {
+       for (v_idx = 0; v_idx < vsi->num_q_vectors; v_idx++) {
                 err = ice_vsi_alloc_q_vector(vsi, v_idx);
                 if (err)
                         goto err_out;
@@ -648,8 +640,7 @@ ice_vsi_cfg_txq(struct ice_vsi *vsi, struct ice_ring *ring,
         status = ice_ena_vsi_txq(vsi->port_info, vsi->idx, tc, ring->q_handle,
                                  1, qg_buf, buf_len, NULL);
         if (status) {
-               dev_err(ice_pf_to_dev(pf),
-                       "Failed to set LAN Tx queue context, error: %d\n",
+               dev_err(ice_pf_to_dev(pf), "Failed to set LAN Tx queue context, error: %d\n",
                         status);
                 return -ENODEV;
         }
@@ -815,14 +806,12 @@ ice_vsi_stop_tx_ring(struct ice_vsi *vsi, enum ice_disq_rst_src rst_src,
          * queues at the hardware level anyway.
          */
         if (status == ICE_ERR_RESET_ONGOING) {
-               dev_dbg(&vsi->back->pdev->dev,
-                       "Reset in progress. LAN Tx queues already disabled\n");
+               dev_dbg(ice_pf_to_dev(vsi->back), "Reset in progress. LAN Tx queues already disabled\n");
         } else if (status == ICE_ERR_DOES_NOT_EXIST) {
-               dev_dbg(&vsi->back->pdev->dev,
-                       "LAN Tx queues do not exist, nothing to disable\n");
+               dev_dbg(ice_pf_to_dev(vsi->back), "LAN Tx queues do not exist, nothing to disable\n");
         } else if (status) {
-               dev_err(&vsi->back->pdev->dev,
-                       "Failed to disable LAN Tx queues, error: %d\n", status);
+               dev_err(ice_pf_to_dev(vsi->back), "Failed to disable LAN Tx queues, error: %d\n",
+                       status);
                 return -ENODEV;
         }
  
diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c

index 0207e28c26827c0b10f4fe3c290898cccc895286..04d5db0a25bfb521cabc435fcd489e1d8fce4d33 100644
--- a/drivers/net/ethernet/intel/ice/ice_common.c
+++ b/drivers/net/ethernet/intel/ice/ice_common.c
@@ -24,20 +24,6 @@ static enum ice_status ice_set_mac_type(struct ice_hw *hw)
         return 0;
  }
  
-/**
- * ice_dev_onetime_setup - Temporary HW/FW workarounds
- * @hw: pointer to the HW structure
- *
- * This function provides temporary workarounds for certain issues
- * that are expected to be fixed in the HW/FW.
- */
-void ice_dev_onetime_setup(struct ice_hw *hw)
-{
-#define MBX_PF_VT_PFALLOC      0x00231E80
-       /* set VFs per PF */
-       wr32(hw, MBX_PF_VT_PFALLOC, rd32(hw, PF_VT_PFALLOC_HIF));
-}
-
  /**
   * ice_clear_pf_cfg - Clear PF configuration
   * @hw: pointer to the hardware structure
@@ -602,10 +588,10 @@ void ice_output_fw_log(struct ice_hw *hw, struct ice_aq_desc *desc, void *buf)
  }
  
  /**
- * ice_get_itr_intrl_gran - determine int/intrl granularity
+ * ice_get_itr_intrl_gran
   * @hw: pointer to the HW struct
   *
- * Determines the ITR/intrl granularities based on the maximum aggregate
+ * Determines the ITR/INTRL granularities based on the maximum aggregate
   * bandwidth according to the device's configuration during power-on.
   */
  static void ice_get_itr_intrl_gran(struct ice_hw *hw)
@@ -763,8 +749,6 @@ enum ice_status ice_init_hw(struct ice_hw *hw)
         if (status)
                 goto err_unroll_sched;
  
-       ice_dev_onetime_setup(hw);
-
         /* Get MAC information */
         /* A single port can report up to two (LAN and WoL) addresses */
         mac_buf = devm_kcalloc(ice_hw_to_dev(hw), 2,
@@ -834,7 +818,7 @@ void ice_deinit_hw(struct ice_hw *hw)
   */
  enum ice_status ice_check_reset(struct ice_hw *hw)
  {
-       u32 cnt, reg = 0, grst_delay;
+       u32 cnt, reg = 0, grst_delay, uld_mask;
  
         /* Poll for Device Active state in case a recent CORER, GLOBR,
          * or EMPR has occurred. The grst delay value is in 100ms units.
@@ -856,13 +840,20 @@ enum ice_status ice_check_reset(struct ice_hw *hw)
                 return ICE_ERR_RESET_FAILED;
         }
  
-#define ICE_RESET_DONE_MASK    (GLNVM_ULD_CORER_DONE_M | \
-                                GLNVM_ULD_GLOBR_DONE_M)
+#define ICE_RESET_DONE_MASK    (GLNVM_ULD_PCIER_DONE_M |\
+                                GLNVM_ULD_PCIER_DONE_1_M |\
+                                GLNVM_ULD_CORER_DONE_M |\
+                                GLNVM_ULD_GLOBR_DONE_M |\
+                                GLNVM_ULD_POR_DONE_M |\
+                                GLNVM_ULD_POR_DONE_1_M |\
+                                GLNVM_ULD_PCIER_DONE_2_M)
+
+       uld_mask = ICE_RESET_DONE_MASK;
  
         /* Device is Active; check Global Reset processes are done */
         for (cnt = 0; cnt < ICE_PF_RESET_WAIT_COUNT; cnt++) {
-               reg = rd32(hw, GLNVM_ULD) & ICE_RESET_DONE_MASK;
-               if (reg == ICE_RESET_DONE_MASK) {
+               reg = rd32(hw, GLNVM_ULD) & uld_mask;
+               if (reg == uld_mask) {
                         ice_debug(hw, ICE_DBG_INIT,
                                   "Global reset processes done. %d\n", cnt);
                         break;
diff --git a/drivers/net/ethernet/intel/ice/ice_common.h b/drivers/net/ethernet/intel/ice/ice_common.h

index b5c013fdaaf972ac0de3264de38fbe01e0a96c22..f9fc005d35a78dfc58b284b1be4c9a45367a2a54 100644
--- a/drivers/net/ethernet/intel/ice/ice_common.h
+++ b/drivers/net/ethernet/intel/ice/ice_common.h
@@ -54,8 +54,6 @@ enum ice_status ice_get_caps(struct ice_hw *hw);
  
  void ice_set_safe_mode_caps(struct ice_hw *hw);
  
-void ice_dev_onetime_setup(struct ice_hw *hw);
-
  enum ice_status
  ice_write_rxq_ctx(struct ice_hw *hw, struct ice_rlan_ctx *rlan_ctx,
                   u32 rxq_index);
diff --git a/drivers/net/ethernet/intel/ice/ice_dcb.c b/drivers/net/ethernet/intel/ice/ice_dcb.c

index 713e8a892e149ec6578a889655ca71a61101fb3e..adb8dab765c8fae0a39f8c4360eedbadfb0e6715 100644
--- a/drivers/net/ethernet/intel/ice/ice_dcb.c
+++ b/drivers/net/ethernet/intel/ice/ice_dcb.c
@@ -1323,13 +1323,13 @@ enum ice_status ice_set_dcb_cfg(struct ice_port_info *pi)
  }
  
  /**
- * ice_aq_query_port_ets - query port ets configuration
+ * ice_aq_query_port_ets - query port ETS configuration
   * @pi: port information structure
   * @buf: pointer to buffer
   * @buf_size: buffer size in bytes
   * @cd: pointer to command details structure or NULL
   *
- * query current port ets configuration
+ * query current port ETS configuration
   */
  static enum ice_status
  ice_aq_query_port_ets(struct ice_port_info *pi,
@@ -1416,13 +1416,13 @@ ice_update_port_tc_tree_cfg(struct ice_port_info *pi,
  }
  
  /**
- * ice_query_port_ets - query port ets configuration
+ * ice_query_port_ets - query port ETS configuration
   * @pi: port information structure
   * @buf: pointer to buffer
   * @buf_size: buffer size in bytes
   * @cd: pointer to command details structure or NULL
   *
- * query current port ets configuration and update the
+ * query current port ETS configuration and update the
   * SW DB with the TC changes
   */
  enum ice_status
diff --git a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c

index 0664e5b8d130a1c2c51b12accd7f129a0e26971c..7108fb41b604296b24d7314467ff9147122324be 100644
--- a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c
@@ -315,9 +315,9 @@ ice_dcb_need_recfg(struct ice_pf *pf, struct ice_dcbx_cfg *old_cfg,
   */
  void ice_dcb_rebuild(struct ice_pf *pf)
  {
-       struct ice_dcbx_cfg *local_dcbx_cfg, *desired_dcbx_cfg, *prev_cfg;
         struct ice_aqc_port_ets_elem buf = { 0 };
         struct device *dev = ice_pf_to_dev(pf);
+       struct ice_dcbx_cfg *err_cfg;
         enum ice_status ret;
  
         ret = ice_query_port_ets(pf->hw.port_info, &buf, sizeof(buf), NULL);
@@ -330,53 +330,25 @@ void ice_dcb_rebuild(struct ice_pf *pf)
         if (!test_bit(ICE_FLAG_DCB_ENA, pf->flags))
                 return;
  
-       local_dcbx_cfg = &pf->hw.port_info->local_dcbx_cfg;
-       desired_dcbx_cfg = &pf->hw.port_info->desired_dcbx_cfg;
+       mutex_lock(&pf->tc_mutex);
  
-       /* Save current willing state and force FW to unwilling */
-       local_dcbx_cfg->etscfg.willing = 0x0;
-       local_dcbx_cfg->pfc.willing = 0x0;
-       local_dcbx_cfg->app_mode = ICE_DCBX_APPS_NON_WILLING;
+       if (!pf->hw.port_info->is_sw_lldp)
+               ice_cfg_etsrec_defaults(pf->hw.port_info);
  
-       ice_cfg_etsrec_defaults(pf->hw.port_info);
         ret = ice_set_dcb_cfg(pf->hw.port_info);
         if (ret) {
-               dev_err(dev, "Failed to set DCB to unwilling\n");
+               dev_err(dev, "Failed to set DCB config in rebuild\n");
                 goto dcb_error;
         }
  
-       /* Retrieve DCB config and ensure same as current in SW */
-       prev_cfg = kmemdup(local_dcbx_cfg, sizeof(*prev_cfg), GFP_KERNEL);
-       if (!prev_cfg)
-               goto dcb_error;
-
-       ice_init_dcb(&pf->hw, true);
-       if (pf->hw.port_info->dcbx_status == ICE_DCBX_STATUS_DIS)
-               pf->hw.port_info->is_sw_lldp = true;
-       else
-               pf->hw.port_info->is_sw_lldp = false;
-
-       if (ice_dcb_need_recfg(pf, prev_cfg, local_dcbx_cfg)) {
-               /* difference in cfg detected - disable DCB till next MIB */
-               dev_err(dev, "Set local MIB not accurate\n");
-               kfree(prev_cfg);
-               goto dcb_error;
+       if (!pf->hw.port_info->is_sw_lldp) {
+               ret = ice_cfg_lldp_mib_change(&pf->hw, true);
+               if (ret && !pf->hw.port_info->is_sw_lldp) {
+                       dev_err(dev, "Failed to register for MIB changes\n");
+                       goto dcb_error;
+               }
         }
  
-       /* fetched config congruent to previous configuration */
-       kfree(prev_cfg);
-
-       /* Set the local desired config */
-       if (local_dcbx_cfg->dcbx_mode == ICE_DCBX_MODE_CEE)
-               memcpy(local_dcbx_cfg, desired_dcbx_cfg,
-                      sizeof(*local_dcbx_cfg));
-
-       ice_cfg_etsrec_defaults(pf->hw.port_info);
-       ret = ice_set_dcb_cfg(pf->hw.port_info);
-       if (ret) {
-               dev_err(dev, "Failed to set desired config\n");
-               goto dcb_error;
-       }
         dev_info(dev, "DCB restored after reset\n");
         ret = ice_query_port_ets(pf->hw.port_info, &buf, sizeof(buf), NULL);
         if (ret) {
@@ -384,26 +356,32 @@ void ice_dcb_rebuild(struct ice_pf *pf)
                 goto dcb_error;
         }
  
+       mutex_unlock(&pf->tc_mutex);
+
         return;
  
  dcb_error:
         dev_err(dev, "Disabling DCB until new settings occur\n");
-       prev_cfg = kzalloc(sizeof(*prev_cfg), GFP_KERNEL);
-       if (!prev_cfg)
+       err_cfg = kzalloc(sizeof(*err_cfg), GFP_KERNEL);
+       if (!err_cfg) {
+               mutex_unlock(&pf->tc_mutex);
                 return;
+       }
  
-       prev_cfg->etscfg.willing = true;
-       prev_cfg->etscfg.tcbwtable[0] = ICE_TC_MAX_BW;
-       prev_cfg->etscfg.tsatable[0] = ICE_IEEE_TSA_ETS;
-       memcpy(&prev_cfg->etsrec, &prev_cfg->etscfg, sizeof(prev_cfg->etsrec));
+       err_cfg->etscfg.willing = true;
+       err_cfg->etscfg.tcbwtable[0] = ICE_TC_MAX_BW;
+       err_cfg->etscfg.tsatable[0] = ICE_IEEE_TSA_ETS;
+       memcpy(&err_cfg->etsrec, &err_cfg->etscfg, sizeof(err_cfg->etsrec));
         /* Coverity warns the return code of ice_pf_dcb_cfg() is not checked
          * here as is done for other calls to that function. That check is
          * not necessary since this is in this function's error cleanup path.
          * Suppress the Coverity warning with the following comment...
          */
         /* coverity[check_return] */
-       ice_pf_dcb_cfg(pf, prev_cfg, false);
-       kfree(prev_cfg);
+       ice_pf_dcb_cfg(pf, err_cfg, false);
+       kfree(err_cfg);
+
+       mutex_unlock(&pf->tc_mutex);
  }
  
  /**
@@ -434,9 +412,9 @@ static int ice_dcb_init_cfg(struct ice_pf *pf, bool locked)
  }
  
  /**
- * ice_dcb_sw_default_config - Apply a default DCB config
+ * ice_dcb_sw_dflt_cfg - Apply a default DCB config
   * @pf: PF to apply config to
- * @ets_willing: configure ets willing
+ * @ets_willing: configure ETS willing
   * @locked: was this function called with RTNL held
   */
  static int ice_dcb_sw_dflt_cfg(struct ice_pf *pf, bool ets_willing, bool locked)
@@ -599,8 +577,7 @@ int ice_init_pf_dcb(struct ice_pf *pf, bool locked)
                 goto dcb_init_err;
         }
  
-       dev_info(dev,
-                "DCB is enabled in the hardware, max number of TCs supported on this port are %d\n",
+       dev_info(dev, "DCB is enabled in the hardware, max number of TCs supported on this port are %d\n",
                  pf->hw.func_caps.common_cap.maxtc);
         if (err) {
                 struct ice_vsi *pf_vsi;
@@ -610,8 +587,8 @@ int ice_init_pf_dcb(struct ice_pf *pf, bool locked)
                 clear_bit(ICE_FLAG_FW_LLDP_AGENT, pf->flags);
                 err = ice_dcb_sw_dflt_cfg(pf, true, locked);
                 if (err) {
-                       dev_err(dev,
-                               "Failed to set local DCB config %d\n", err);
+                       dev_err(dev, "Failed to set local DCB config %d\n",
+                               err);
                         err = -EIO;
                         goto dcb_init_err;
                 }
@@ -777,6 +754,8 @@ ice_dcb_process_lldp_set_mib_change(struct ice_pf *pf,
                 }
         }
  
+       mutex_lock(&pf->tc_mutex);
+
         /* store the old configuration */
         tmp_dcbx_cfg = pf->hw.port_info->local_dcbx_cfg;
  
@@ -787,20 +766,20 @@ ice_dcb_process_lldp_set_mib_change(struct ice_pf *pf,
         ret = ice_get_dcb_cfg(pf->hw.port_info);
         if (ret) {
                 dev_err(dev, "Failed to get DCB config\n");
-               return;
+               goto out;
         }
  
         /* No change detected in DCBX configs */
         if (!memcmp(&tmp_dcbx_cfg, &pi->local_dcbx_cfg, sizeof(tmp_dcbx_cfg))) {
                 dev_dbg(dev, "No change detected in DCBX configuration.\n");
-               return;
+               goto out;
         }
  
         need_reconfig = ice_dcb_need_recfg(pf, &tmp_dcbx_cfg,
                                            &pi->local_dcbx_cfg);
         ice_dcbnl_flush_apps(pf, &tmp_dcbx_cfg, &pi->local_dcbx_cfg);
         if (!need_reconfig)
-               return;
+               goto out;
  
         /* Enable DCB tagging only when more than one TC */
         if (ice_dcb_get_num_tc(&pi->local_dcbx_cfg) > 1) {
@@ -814,7 +793,7 @@ ice_dcb_process_lldp_set_mib_change(struct ice_pf *pf,
         pf_vsi = ice_get_main_vsi(pf);
         if (!pf_vsi) {
                 dev_dbg(dev, "PF VSI doesn't exist\n");
-               return;
+               goto out;
         }
  
         rtnl_lock();
@@ -823,13 +802,15 @@ ice_dcb_process_lldp_set_mib_change(struct ice_pf *pf,
         ret = ice_query_port_ets(pf->hw.port_info, &buf, sizeof(buf), NULL);
         if (ret) {
                 dev_err(dev, "Query Port ETS failed\n");
-               rtnl_unlock();
-               return;
+               goto unlock_rtnl;
         }
  
         /* changes in configuration update VSI */
         ice_pf_dcb_recfg(pf);
  
         ice_ena_vsi(pf_vsi, true);
+unlock_rtnl:
         rtnl_unlock();
+out:
+       mutex_unlock(&pf->tc_mutex);
  }
diff --git a/drivers/net/ethernet/intel/ice/ice_dcb_nl.c b/drivers/net/ethernet/intel/ice/ice_dcb_nl.c

index d870c1aedc1709b79ba75a6183c80fed65a53222..b61aba428adbfa44ed2fc53f445da2a2bf4355fa 100644
--- a/drivers/net/ethernet/intel/ice/ice_dcb_nl.c
+++ b/drivers/net/ethernet/intel/ice/ice_dcb_nl.c
@@ -297,8 +297,7 @@ ice_dcbnl_get_pfc_cfg(struct net_device *netdev, int prio, u8 *setting)
                 return;
  
         *setting = (pi->local_dcbx_cfg.pfc.pfcena >> prio) & 0x1;
-       dev_dbg(ice_pf_to_dev(pf),
-               "Get PFC Config up=%d, setting=%d, pfcenable=0x%x\n",
+       dev_dbg(ice_pf_to_dev(pf), "Get PFC Config up=%d, setting=%d, pfcenable=0x%x\n",
                 prio, *setting, pi->local_dcbx_cfg.pfc.pfcena);
  }
  
@@ -418,8 +417,8 @@ ice_dcbnl_get_pg_tc_cfg_tx(struct net_device *netdev, int prio,
                 return;
  
         *pgid = pi->local_dcbx_cfg.etscfg.prio_table[prio];
-       dev_dbg(ice_pf_to_dev(pf),
-               "Get PG config prio=%d tc=%d\n", prio, *pgid);
+       dev_dbg(ice_pf_to_dev(pf), "Get PG config prio=%d tc=%d\n", prio,
+               *pgid);
  }
  
  /**
@@ -713,13 +712,13 @@ static int ice_dcbnl_delapp(struct net_device *netdev, struct dcb_app *app)
                 return -EINVAL;
  
         mutex_lock(&pf->tc_mutex);
-       ret = dcb_ieee_delapp(netdev, app);
-       if (ret)
-               goto delapp_out;
-
         old_cfg = &pf->hw.port_info->local_dcbx_cfg;
  
-       if (old_cfg->numapps == 1)
+       if (old_cfg->numapps <= 1)
+               goto delapp_out;
+
+       ret = dcb_ieee_delapp(netdev, app);
+       if (ret)
                 goto delapp_out;
  
         new_cfg = &pf->hw.port_info->desired_dcbx_cfg;
@@ -882,8 +881,7 @@ ice_dcbnl_vsi_del_app(struct ice_vsi *vsi,
         sapp.protocol = app->prot_id;
         sapp.priority = app->priority;
         err = ice_dcbnl_delapp(vsi->netdev, &sapp);
-       dev_dbg(&vsi->back->pdev->dev,
-               "Deleting app for VSI idx=%d err=%d sel=%d proto=0x%x, prio=%d\n",
+       dev_dbg(ice_pf_to_dev(vsi->back), "Deleting app for VSI idx=%d err=%d sel=%d proto=0x%x, prio=%d\n",
                 vsi->idx, err, app->selector, app->prot_id, app->priority);
  }
  
diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c

index 90c6a3ca20c99beb67024e9473773000b43132c5..77c412a7e7a47f4842dfd3c0ec96c3415aa39178 100644
--- a/drivers/net/ethernet/intel/ice/ice_ethtool.c
+++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c
@@ -166,13 +166,24 @@ static void
  ice_get_drvinfo(struct net_device *netdev, struct ethtool_drvinfo *drvinfo)
  {
         struct ice_netdev_priv *np = netdev_priv(netdev);
+       u8 oem_ver, oem_patch, nvm_ver_hi, nvm_ver_lo;
         struct ice_vsi *vsi = np->vsi;
         struct ice_pf *pf = vsi->back;
+       struct ice_hw *hw = &pf->hw;
+       u16 oem_build;
  
         strlcpy(drvinfo->driver, KBUILD_MODNAME, sizeof(drvinfo->driver));
         strlcpy(drvinfo->version, ice_drv_ver, sizeof(drvinfo->version));
-       strlcpy(drvinfo->fw_version, ice_nvm_version_str(&pf->hw),
-               sizeof(drvinfo->fw_version));
+
+       /* Display NVM version (from which the firmware version can be
+        * determined) which contains more pertinent information.
+        */
+       ice_get_nvm_version(hw, &oem_ver, &oem_build, &oem_patch,
+                           &nvm_ver_hi, &nvm_ver_lo);
+       snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version),
+                "%x.%02x 0x%x %d.%d.%d", nvm_ver_hi, nvm_ver_lo,
+                hw->nvm.eetrack, oem_ver, oem_build, oem_patch);
+
         strlcpy(drvinfo->bus_info, pci_name(pf->pdev),
                 sizeof(drvinfo->bus_info));
         drvinfo->n_priv_flags = ICE_PRIV_FLAG_ARRAY_SIZE;
@@ -363,8 +374,7 @@ static int ice_reg_pattern_test(struct ice_hw *hw, u32 reg, u32 mask)
                 val = rd32(hw, reg);
                 if (val == pattern)
                         continue;
-               dev_err(dev,
-                       "%s: reg pattern test failed - reg 0x%08x pat 0x%08x val 0x%08x\n"
+               dev_err(dev, "%s: reg pattern test failed - reg 0x%08x pat 0x%08x val 0x%08x\n"
                         , __func__, reg, pattern, val);
                 return 1;
         }
@@ -372,8 +382,7 @@ static int ice_reg_pattern_test(struct ice_hw *hw, u32 reg, u32 mask)
         wr32(hw, reg, orig_val);
         val = rd32(hw, reg);
         if (val != orig_val) {
-               dev_err(dev,
-                       "%s: reg restore test failed - reg 0x%08x orig 0x%08x val 0x%08x\n"
+               dev_err(dev, "%s: reg restore test failed - reg 0x%08x orig 0x%08x val 0x%08x\n"
                         , __func__, reg, orig_val, val);
                 return 1;
         }
@@ -791,8 +800,7 @@ ice_self_test(struct net_device *netdev, struct ethtool_test *eth_test,
                 set_bit(__ICE_TESTING, pf->state);
  
                 if (ice_active_vfs(pf)) {
-                       dev_warn(dev,
-                                "Please take active VFs and Netqueues offline and restart the adapter before running NIC diagnostics\n");
+                       dev_warn(dev, "Please take active VFs and Netqueues offline and restart the adapter before running NIC diagnostics\n");
                         data[ICE_ETH_TEST_REG] = 1;
                         data[ICE_ETH_TEST_EEPROM] = 1;
                         data[ICE_ETH_TEST_INTR] = 1;
@@ -1047,7 +1055,7 @@ ice_set_fecparam(struct net_device *netdev, struct ethtool_fecparam *fecparam)
                 fec = ICE_FEC_NONE;
                 break;
         default:
-               dev_warn(&vsi->back->pdev->dev, "Unsupported FEC mode: %d\n",
+               dev_warn(ice_pf_to_dev(vsi->back), "Unsupported FEC mode: %d\n",
                          fecparam->fec);
                 return -EINVAL;
         }
@@ -1200,8 +1208,7 @@ static int ice_set_priv_flags(struct net_device *netdev, u32 flags)
                          * events to respond to.
                          */
                         if (status)
-                               dev_info(dev,
-                                        "Failed to unreg for LLDP events\n");
+                               dev_info(dev, "Failed to unreg for LLDP events\n");
  
                         /* The AQ call to stop the FW LLDP agent will generate
                          * an error if the agent is already stopped.
@@ -1256,8 +1263,7 @@ static int ice_set_priv_flags(struct net_device *netdev, u32 flags)
                         /* Register for MIB change events */
                         status = ice_cfg_lldp_mib_change(&pf->hw, true);
                         if (status)
-                               dev_dbg(dev,
-                                       "Fail to enable MIB change events\n");
+                               dev_dbg(dev, "Fail to enable MIB change events\n");
                 }
         }
         if (test_bit(ICE_FLAG_LEGACY_RX, change_flags)) {
@@ -1710,291 +1716,13 @@ ice_get_settings_link_up(struct ethtool_link_ksettings *ks,
  {
         struct ice_netdev_priv *np = netdev_priv(netdev);
         struct ice_port_info *pi = np->vsi->port_info;
-       struct ethtool_link_ksettings cap_ksettings;
         struct ice_link_status *link_info;
         struct ice_vsi *vsi = np->vsi;
-       bool unrecog_phy_high = false;
-       bool unrecog_phy_low = false;
  
         link_info = &vsi->port_info->phy.link_info;
  
-       /* Initialize supported and advertised settings based on PHY settings */
-       switch (link_info->phy_type_low) {
-       case ICE_PHY_TYPE_LOW_100BASE_TX:
-               ethtool_link_ksettings_add_link_mode(ks, supported, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    100baseT_Full);
-               ethtool_link_ksettings_add_link_mode(ks, advertising, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, advertising,
-                                                    100baseT_Full);
-               break;
-       case ICE_PHY_TYPE_LOW_100M_SGMII:
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    100baseT_Full);
-               break;
-       case ICE_PHY_TYPE_LOW_1000BASE_T:
-               ethtool_link_ksettings_add_link_mode(ks, supported, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    1000baseT_Full);
-               ethtool_link_ksettings_add_link_mode(ks, advertising, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, advertising,
-                                                    1000baseT_Full);
-               break;
-       case ICE_PHY_TYPE_LOW_1G_SGMII:
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    1000baseT_Full);
-               break;
-       case ICE_PHY_TYPE_LOW_1000BASE_SX:
-       case ICE_PHY_TYPE_LOW_1000BASE_LX:
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    1000baseX_Full);
-               break;
-       case ICE_PHY_TYPE_LOW_1000BASE_KX:
-               ethtool_link_ksettings_add_link_mode(ks, supported, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    1000baseKX_Full);
-               ethtool_link_ksettings_add_link_mode(ks, advertising, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, advertising,
-                                                    1000baseKX_Full);
-               break;
-       case ICE_PHY_TYPE_LOW_2500BASE_T:
-               ethtool_link_ksettings_add_link_mode(ks, supported, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, advertising, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    2500baseT_Full);
-               ethtool_link_ksettings_add_link_mode(ks, advertising,
-                                                    2500baseT_Full);
-               break;
-       case ICE_PHY_TYPE_LOW_2500BASE_X:
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    2500baseX_Full);
-               break;
-       case ICE_PHY_TYPE_LOW_2500BASE_KX:
-               ethtool_link_ksettings_add_link_mode(ks, supported, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    2500baseX_Full);
-               ethtool_link_ksettings_add_link_mode(ks, advertising, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, advertising,
-                                                    2500baseX_Full);
-               break;
-       case ICE_PHY_TYPE_LOW_5GBASE_T:
-       case ICE_PHY_TYPE_LOW_5GBASE_KR:
-               ethtool_link_ksettings_add_link_mode(ks, supported, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    5000baseT_Full);
-               ethtool_link_ksettings_add_link_mode(ks, advertising, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, advertising,
-                                                    5000baseT_Full);
-               break;
-       case ICE_PHY_TYPE_LOW_10GBASE_T:
-               ethtool_link_ksettings_add_link_mode(ks, supported, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    10000baseT_Full);
-               ethtool_link_ksettings_add_link_mode(ks, advertising, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, advertising,
-                                                    10000baseT_Full);
-               break;
-       case ICE_PHY_TYPE_LOW_10G_SFI_DA:
-       case ICE_PHY_TYPE_LOW_10G_SFI_AOC_ACC:
-       case ICE_PHY_TYPE_LOW_10G_SFI_C2C:
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    10000baseT_Full);
-               break;
-       case ICE_PHY_TYPE_LOW_10GBASE_SR:
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    10000baseSR_Full);
-               break;
-       case ICE_PHY_TYPE_LOW_10GBASE_LR:
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    10000baseLR_Full);
-               break;
-       case ICE_PHY_TYPE_LOW_10GBASE_KR_CR1:
-               ethtool_link_ksettings_add_link_mode(ks, supported, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    10000baseKR_Full);
-               ethtool_link_ksettings_add_link_mode(ks, advertising, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, advertising,
-                                                    10000baseKR_Full);
-               break;
-       case ICE_PHY_TYPE_LOW_25GBASE_T:
-       case ICE_PHY_TYPE_LOW_25GBASE_CR:
-       case ICE_PHY_TYPE_LOW_25GBASE_CR_S:
-       case ICE_PHY_TYPE_LOW_25GBASE_CR1:
-               ethtool_link_ksettings_add_link_mode(ks, supported, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    25000baseCR_Full);
-               ethtool_link_ksettings_add_link_mode(ks, advertising, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, advertising,
-                                                    25000baseCR_Full);
-               break;
-       case ICE_PHY_TYPE_LOW_25G_AUI_AOC_ACC:
-       case ICE_PHY_TYPE_LOW_25G_AUI_C2C:
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    25000baseCR_Full);
-               break;
-       case ICE_PHY_TYPE_LOW_25GBASE_SR:
-       case ICE_PHY_TYPE_LOW_25GBASE_LR:
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    25000baseSR_Full);
-               break;
-       case ICE_PHY_TYPE_LOW_25GBASE_KR:
-       case ICE_PHY_TYPE_LOW_25GBASE_KR1:
-       case ICE_PHY_TYPE_LOW_25GBASE_KR_S:
-               ethtool_link_ksettings_add_link_mode(ks, supported, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    25000baseKR_Full);
-               ethtool_link_ksettings_add_link_mode(ks, advertising, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, advertising,
-                                                    25000baseKR_Full);
-               break;
-       case ICE_PHY_TYPE_LOW_40GBASE_CR4:
-               ethtool_link_ksettings_add_link_mode(ks, supported, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    40000baseCR4_Full);
-               ethtool_link_ksettings_add_link_mode(ks, advertising, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, advertising,
-                                                    40000baseCR4_Full);
-               break;
-       case ICE_PHY_TYPE_LOW_40G_XLAUI_AOC_ACC:
-       case ICE_PHY_TYPE_LOW_40G_XLAUI:
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    40000baseCR4_Full);
-               break;
-       case ICE_PHY_TYPE_LOW_40GBASE_SR4:
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    40000baseSR4_Full);
-               break;
-       case ICE_PHY_TYPE_LOW_40GBASE_LR4:
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    40000baseLR4_Full);
-               break;
-       case ICE_PHY_TYPE_LOW_40GBASE_KR4:
-               ethtool_link_ksettings_add_link_mode(ks, supported, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    40000baseKR4_Full);
-               ethtool_link_ksettings_add_link_mode(ks, advertising, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, advertising,
-                                                    40000baseKR4_Full);
-               break;
-       case ICE_PHY_TYPE_LOW_50GBASE_CR2:
-       case ICE_PHY_TYPE_LOW_50GBASE_CP:
-               ethtool_link_ksettings_add_link_mode(ks, supported, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    50000baseCR2_Full);
-               ethtool_link_ksettings_add_link_mode(ks, advertising, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, advertising,
-                                                    50000baseCR2_Full);
-               break;
-       case ICE_PHY_TYPE_LOW_50G_LAUI2_AOC_ACC:
-       case ICE_PHY_TYPE_LOW_50G_LAUI2:
-       case ICE_PHY_TYPE_LOW_50G_AUI2_AOC_ACC:
-       case ICE_PHY_TYPE_LOW_50G_AUI2:
-       case ICE_PHY_TYPE_LOW_50GBASE_SR:
-       case ICE_PHY_TYPE_LOW_50G_AUI1_AOC_ACC:
-       case ICE_PHY_TYPE_LOW_50G_AUI1:
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    50000baseCR2_Full);
-               break;
-       case ICE_PHY_TYPE_LOW_50GBASE_KR2:
-       case ICE_PHY_TYPE_LOW_50GBASE_KR_PAM4:
-               ethtool_link_ksettings_add_link_mode(ks, supported, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    50000baseKR2_Full);
-               ethtool_link_ksettings_add_link_mode(ks, advertising, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, advertising,
-                                                    50000baseKR2_Full);
-               break;
-       case ICE_PHY_TYPE_LOW_50GBASE_SR2:
-       case ICE_PHY_TYPE_LOW_50GBASE_LR2:
-       case ICE_PHY_TYPE_LOW_50GBASE_FR:
-       case ICE_PHY_TYPE_LOW_50GBASE_LR:
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    50000baseSR2_Full);
-               break;
-       case ICE_PHY_TYPE_LOW_100GBASE_CR4:
-               ethtool_link_ksettings_add_link_mode(ks, supported, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    100000baseCR4_Full);
-               ethtool_link_ksettings_add_link_mode(ks, advertising, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, advertising,
-                                                    100000baseCR4_Full);
-               break;
-       case ICE_PHY_TYPE_LOW_100G_CAUI4_AOC_ACC:
-       case ICE_PHY_TYPE_LOW_100G_CAUI4:
-       case ICE_PHY_TYPE_LOW_100G_AUI4_AOC_ACC:
-       case ICE_PHY_TYPE_LOW_100G_AUI4:
-       case ICE_PHY_TYPE_LOW_100GBASE_CR_PAM4:
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    100000baseCR4_Full);
-               break;
-       case ICE_PHY_TYPE_LOW_100GBASE_CP2:
-               ethtool_link_ksettings_add_link_mode(ks, supported, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    100000baseCR4_Full);
-               ethtool_link_ksettings_add_link_mode(ks, advertising, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, advertising,
-                                                    100000baseCR4_Full);
-               break;
-       case ICE_PHY_TYPE_LOW_100GBASE_SR4:
-       case ICE_PHY_TYPE_LOW_100GBASE_SR2:
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    100000baseSR4_Full);
-               break;
-       case ICE_PHY_TYPE_LOW_100GBASE_LR4:
-       case ICE_PHY_TYPE_LOW_100GBASE_DR:
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    100000baseLR4_ER4_Full);
-               break;
-       case ICE_PHY_TYPE_LOW_100GBASE_KR4:
-       case ICE_PHY_TYPE_LOW_100GBASE_KR_PAM4:
-               ethtool_link_ksettings_add_link_mode(ks, supported, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    100000baseKR4_Full);
-               ethtool_link_ksettings_add_link_mode(ks, advertising, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, advertising,
-                                                    100000baseKR4_Full);
-               break;
-       default:
-               unrecog_phy_low = true;
-       }
-
-       switch (link_info->phy_type_high) {
-       case ICE_PHY_TYPE_HIGH_100GBASE_KR2_PAM4:
-               ethtool_link_ksettings_add_link_mode(ks, supported, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    100000baseKR4_Full);
-               ethtool_link_ksettings_add_link_mode(ks, advertising, Autoneg);
-               ethtool_link_ksettings_add_link_mode(ks, advertising,
-                                                    100000baseKR4_Full);
-               break;
-       case ICE_PHY_TYPE_HIGH_100G_CAUI2_AOC_ACC:
-       case ICE_PHY_TYPE_HIGH_100G_CAUI2:
-       case ICE_PHY_TYPE_HIGH_100G_AUI2_AOC_ACC:
-       case ICE_PHY_TYPE_HIGH_100G_AUI2:
-               ethtool_link_ksettings_add_link_mode(ks, supported,
-                                                    100000baseCR4_Full);
-               break;
-       default:
-               unrecog_phy_high = true;
-       }
-
-       if (unrecog_phy_low && unrecog_phy_high) {
-               /* if we got here and link is up something bad is afoot */
-               netdev_info(netdev,
-                           "WARNING: Unrecognized PHY_Low (0x%llx).\n",
-                           (u64)link_info->phy_type_low);
-               netdev_info(netdev,
-                           "WARNING: Unrecognized PHY_High (0x%llx).\n",
-                           (u64)link_info->phy_type_high);
-       }
-
-       /* Now that we've worked out everything that could be supported by the
-        * current PHY type, get what is supported by the NVM and intersect
-        * them to get what is truly supported
-        */
-       memset(&cap_ksettings, 0, sizeof(cap_ksettings));
-       ice_phy_type_to_ethtool(netdev, &cap_ksettings);
-       ethtool_intersect_link_masks(ks, &cap_ksettings);
+       /* Get supported and advertised settings from PHY ability with media */
+       ice_phy_type_to_ethtool(netdev, ks);
  
         switch (link_info->link_speed) {
         case ICE_AQ_LINK_SPEED_100GB:
@@ -2028,8 +1756,7 @@ ice_get_settings_link_up(struct ethtool_link_ksettings *ks,
                 ks->base.speed = SPEED_100;
                 break;
         default:
-               netdev_info(netdev,
-                           "WARNING: Unrecognized link_speed (0x%x).\n",
+               netdev_info(netdev, "WARNING: Unrecognized link_speed (0x%x).\n",
                             link_info->link_speed);
                 break;
         }
@@ -2845,13 +2572,11 @@ ice_set_ringparam(struct net_device *netdev, struct ethtool_ringparam *ring)
  
         new_tx_cnt = ALIGN(ring->tx_pending, ICE_REQ_DESC_MULTIPLE);
         if (new_tx_cnt != ring->tx_pending)
-               netdev_info(netdev,
-                           "Requested Tx descriptor count rounded up to %d\n",
+               netdev_info(netdev, "Requested Tx descriptor count rounded up to %d\n",
                             new_tx_cnt);
         new_rx_cnt = ALIGN(ring->rx_pending, ICE_REQ_DESC_MULTIPLE);
         if (new_rx_cnt != ring->rx_pending)
-               netdev_info(netdev,
-                           "Requested Rx descriptor count rounded up to %d\n",
+               netdev_info(netdev, "Requested Rx descriptor count rounded up to %d\n",
                             new_rx_cnt);
  
         /* if nothing to do return success */
@@ -3211,13 +2936,6 @@ ice_set_pauseparam(struct net_device *netdev, struct ethtool_pauseparam *pause)
         else
                 return -EINVAL;
  
-       /* Tell the OS link is going down, the link will go back up when fw
-        * says it is ready asynchronously
-        */
-       ice_print_link_msg(vsi, false);
-       netif_carrier_off(netdev);
-       netif_tx_stop_all_queues(netdev);
-
         /* Set the FC mode and only restart AN if link is up */
         status = ice_set_fc(pi, &aq_failures, link_up);
  
@@ -3718,8 +3436,7 @@ ice_set_rc_coalesce(enum ice_container_type c_type, struct ethtool_coalesce *ec,
                 if (ec->rx_coalesce_usecs_high > ICE_MAX_INTRL ||
                     (ec->rx_coalesce_usecs_high &&
                      ec->rx_coalesce_usecs_high < pf->hw.intrl_gran)) {
-                       netdev_info(vsi->netdev,
-                                   "Invalid value, %s-usecs-high valid values are 0 (disabled), %d-%d\n",
+                       netdev_info(vsi->netdev, "Invalid value, %s-usecs-high valid values are 0 (disabled), %d-%d\n",
                                     c_type_str, pf->hw.intrl_gran,
                                     ICE_MAX_INTRL);
                         return -EINVAL;
@@ -3737,8 +3454,7 @@ ice_set_rc_coalesce(enum ice_container_type c_type, struct ethtool_coalesce *ec,
                 break;
         case ICE_TX_CONTAINER:
                 if (ec->tx_coalesce_usecs_high) {
-                       netdev_info(vsi->netdev,
-                                   "setting %s-usecs-high is not supported\n",
+                       netdev_info(vsi->netdev, "setting %s-usecs-high is not supported\n",
                                     c_type_str);
                         return -EINVAL;
                 }
@@ -3755,35 +3471,24 @@ ice_set_rc_coalesce(enum ice_container_type c_type, struct ethtool_coalesce *ec,
  
         itr_setting = rc->itr_setting & ~ICE_ITR_DYNAMIC;
         if (coalesce_usecs != itr_setting && use_adaptive_coalesce) {
-               netdev_info(vsi->netdev,
-                           "%s interrupt throttling cannot be changed if adaptive-%s is enabled\n",
+               netdev_info(vsi->netdev, "%s interrupt throttling cannot be changed if adaptive-%s is enabled\n",
                             c_type_str, c_type_str);
                 return -EINVAL;
         }
  
         if (coalesce_usecs > ICE_ITR_MAX) {
-               netdev_info(vsi->netdev,
-                           "Invalid value, %s-usecs range is 0-%d\n",
+               netdev_info(vsi->netdev, "Invalid value, %s-usecs range is 0-%d\n",
                             c_type_str, ICE_ITR_MAX);
                 return -EINVAL;
         }
  
-       /* hardware only supports an ITR granularity of 2us */
-       if (coalesce_usecs % 2 != 0) {
-               netdev_info(vsi->netdev,
-                           "Invalid value, %s-usecs must be even\n",
-                           c_type_str);
-               return -EINVAL;
-       }
-
         if (use_adaptive_coalesce) {
                 rc->itr_setting |= ICE_ITR_DYNAMIC;
         } else {
-               /* store user facing value how it was set */
+               /* save the user set usecs */
                 rc->itr_setting = coalesce_usecs;
-               /* set to static and convert to value HW understands */
-               rc->target_itr =
-                       ITR_TO_REG(ITR_REG_ALIGN(rc->itr_setting));
+               /* device ITR granularity is in 2 usec increments */
+               rc->target_itr = ITR_REG_ALIGN(rc->itr_setting);
         }
  
         return 0;
@@ -3876,6 +3581,30 @@ ice_is_coalesce_param_invalid(struct net_device *netdev,
         return 0;
  }
  
+/**
+ * ice_print_if_odd_usecs - print message if user tries to set odd [tx|rx]-usecs
+ * @netdev: netdev used for print
+ * @itr_setting: previous user setting
+ * @use_adaptive_coalesce: if adaptive coalesce is enabled or being enabled
+ * @coalesce_usecs: requested value of [tx|rx]-usecs
+ * @c_type_str: either "rx" or "tx" to match user set field of [tx|rx]-usecs
+ */
+static void
+ice_print_if_odd_usecs(struct net_device *netdev, u16 itr_setting,
+                      u32 use_adaptive_coalesce, u32 coalesce_usecs,
+                      const char *c_type_str)
+{
+       if (use_adaptive_coalesce)
+               return;
+
+       itr_setting = ITR_TO_REG(itr_setting);
+
+       if (itr_setting != coalesce_usecs && (coalesce_usecs % 2))
+               netdev_info(netdev, "User set %s-usecs to %d, device only supports even values. Rounding down and attempting to set %s-usecs to %d\n",
+                           c_type_str, coalesce_usecs, c_type_str,
+                           ITR_REG_ALIGN(coalesce_usecs));
+}
+
  /**
   * __ice_set_coalesce - set ITR/INTRL values for the device
   * @netdev: pointer to the netdev associated with this query
@@ -3896,8 +3625,19 @@ __ice_set_coalesce(struct net_device *netdev, struct ethtool_coalesce *ec,
                 return -EINVAL;
  
         if (q_num < 0) {
+               struct ice_q_vector *q_vector = vsi->q_vectors[0];
                 int v_idx;
  
+               if (q_vector) {
+                       ice_print_if_odd_usecs(netdev, q_vector->rx.itr_setting,
+                                              ec->use_adaptive_rx_coalesce,
+                                              ec->rx_coalesce_usecs, "rx");
+
+                       ice_print_if_odd_usecs(netdev, q_vector->tx.itr_setting,
+                                              ec->use_adaptive_tx_coalesce,
+                                              ec->tx_coalesce_usecs, "tx");
+               }
+
                 ice_for_each_q_vector(vsi, v_idx) {
                         /* In some cases if DCB is configured the num_[rx|tx]q
                          * can be less than vsi->num_q_vectors. This check
@@ -4012,8 +3752,7 @@ ice_get_module_info(struct net_device *netdev,
                 }
                 break;
         default:
-               netdev_warn(netdev,
-                           "SFF Module Type not recognized.\n");
+               netdev_warn(netdev, "SFF Module Type not recognized.\n");
                 return -EINVAL;
         }
         return 0;
@@ -4081,11 +3820,11 @@ ice_get_module_eeprom(struct net_device *netdev,
  static const struct ethtool_ops ice_ethtool_ops = {
         .get_link_ksettings     = ice_get_link_ksettings,
         .set_link_ksettings     = ice_set_link_ksettings,
-       .get_drvinfo            = ice_get_drvinfo,
-       .get_regs_len           = ice_get_regs_len,
-       .get_regs               = ice_get_regs,
-       .get_msglevel           = ice_get_msglevel,
-       .set_msglevel           = ice_set_msglevel,
+       .get_drvinfo            = ice_get_drvinfo,
+       .get_regs_len           = ice_get_regs_len,
+       .get_regs               = ice_get_regs,
+       .get_msglevel           = ice_get_msglevel,
+       .set_msglevel           = ice_set_msglevel,
         .self_test              = ice_self_test,
         .get_link               = ethtool_op_get_link,
         .get_eeprom_len         = ice_get_eeprom_len,
@@ -4112,8 +3851,8 @@ static const struct ethtool_ops ice_ethtool_ops = {
         .get_channels           = ice_get_channels,
         .set_channels           = ice_set_channels,
         .get_ts_info            = ethtool_op_get_ts_info,
-       .get_per_queue_coalesce = ice_get_per_q_coalesce,
-       .set_per_queue_coalesce = ice_set_per_q_coalesce,
+       .get_per_queue_coalesce = ice_get_per_q_coalesce,
+       .set_per_queue_coalesce = ice_set_per_q_coalesce,
         .get_fecparam           = ice_get_fecparam,
         .set_fecparam           = ice_set_fecparam,
         .get_module_info        = ice_get_module_info,
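
Note on the coalesce hunks above: an odd [tx|rx]-usecs request is no longer rejected with -EINVAL. It is rounded down to the device's 2 usec granularity, and ice_print_if_odd_usecs() prints a notice. The standalone sketch below illustrates that behavior outside the kernel. The names sketch_itr_align() and sketch_should_warn_odd() are hypothetical and are not the driver's ITR_REG_ALIGN/ITR_TO_REG macros; the sketch also assumes the previous setting is passed in with any flag bits already stripped.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the driver's macro): round a requested usecs
 * value down to the nearest even value, matching the "2 usec increments"
 * granularity mentioned in the patch comment.
 */
static uint32_t sketch_itr_align(uint32_t usecs)
{
	return usecs & ~UINT32_C(1);
}

/*
 * Hypothetical predicate mirroring the condition in ice_print_if_odd_usecs():
 * no notice when adaptive coalescing is on, otherwise a notice only when the
 * request differs from the previous setting and is odd.
 * Assumption: prev_usecs is already a plain usecs value without flag bits.
 */
static bool sketch_should_warn_odd(uint32_t prev_usecs, bool adaptive,
				   uint32_t req_usecs)
{
	if (adaptive)
		return false;

	return prev_usecs != req_usecs && (req_usecs % 2) != 0;
}
```

Under these assumptions, a request of 51 usecs with a previous setting of 50 would trigger the notice and be stored as 50, while an even request or an adaptive configuration would pass silently.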
diff --git a/drivers/net/ethernet/intel/ice/ice_hw_autogen.h b/drivers/net/ethernet/intel/ice/ice_hw_autogen.h

index f2cababf256130a1de99d6f5900ba830426fbf75..6db3d0494127638098a7a4a2c02b8f15939d31a8 100644
--- a/drivers/net/ethernet/intel/ice/ice_hw_autogen.h
+++ b/drivers/net/ethernet/intel/ice/ice_hw_autogen.h
@@ -267,8 +267,14 @@
  #define GLNVM_GENS_SR_SIZE_S                   5
  #define GLNVM_GENS_SR_SIZE_M                   ICE_M(0x7, 5)
  #define GLNVM_ULD                              0x000B6008
+#define GLNVM_ULD_PCIER_DONE_M                 BIT(0)
+#define GLNVM_ULD_PCIER_DONE_1_M               BIT(1)
  #define GLNVM_ULD_CORER_DONE_M                 BIT(3)
  #define GLNVM_ULD_GLOBR_DONE_M                 BIT(4)
+#define GLNVM_ULD_POR_DONE_M                   BIT(5)
+#define GLNVM_ULD_POR_DONE_1_M                 BIT(8)
+#define GLNVM_ULD_PCIER_DONE_2_M               BIT(9)
+#define GLNVM_ULD_PE_DONE_M                    BIT(10)
  #define GLPCI_CNF2                             0x000BE004
  #define GLPCI_CNF2_CACHELINE_SIZE_M            BIT(1)
  #define PF_FUNC_RID                            0x0009E880
@@ -331,7 +337,6 @@
  #define GLV_TEPC(_VSI)                         (0x00312000 + ((_VSI) * 4))
  #define GLV_UPRCL(_i)                          (0x003B2000 + ((_i) * 8))
  #define GLV_UPTCL(_i)                          (0x0030A000 + ((_i) * 8))
-#define PF_VT_PFALLOC_HIF                      0x0009DD80
  #define VSIQF_HKEY_MAX_INDEX                   12
  #define VSIQF_HLUT_MAX_INDEX                   15
  #define VFINT_DYN_CTLN(_i)                     (0x00003800 + ((_i) * 4))
diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c

index 1874c9f51a3223a6c2585b333f59528b38672c43..d974e2fa3e63816a3e35030fb2618abaea434294 100644
--- a/drivers/net/ethernet/intel/ice/ice_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_lib.c
@@ -117,8 +117,7 @@ static void ice_vsi_set_num_desc(struct ice_vsi *vsi)
                 vsi->num_tx_desc = ICE_DFLT_NUM_TX_DESC;
                 break;
         default:
-               dev_dbg(&vsi->back->pdev->dev,
-                       "Not setting number of Tx/Rx descriptors for VSI type %d\n",
+               dev_dbg(ice_pf_to_dev(vsi->back), "Not setting number of Tx/Rx descriptors for VSI type %d\n",
                         vsi->type);
                 break;
         }
@@ -724,7 +723,7 @@ static void ice_vsi_setup_q_map(struct ice_vsi *vsi, struct ice_vsi_ctx *ctxt)
         vsi->num_txq = tx_count;
  
         if (vsi->type == ICE_VSI_VF && vsi->num_txq != vsi->num_rxq) {
-               dev_dbg(&vsi->back->pdev->dev, "VF VSI should have same number of Tx and Rx queues. Hence making them equal\n");
+               dev_dbg(ice_pf_to_dev(vsi->back), "VF VSI should have same number of Tx and Rx queues. Hence making them equal\n");
                 /* since there is a chance that num_rxq could have been changed
                  * in the above for loop, make num_txq equal to num_rxq.
                  */
@@ -929,8 +928,7 @@ static int ice_vsi_setup_vector_base(struct ice_vsi *vsi)
         vsi->base_vector = ice_get_res(pf, pf->irq_tracker, num_q_vectors,
                                        vsi->idx);
         if (vsi->base_vector < 0) {
-               dev_err(dev,
-                       "Failed to get tracking for %d vectors for VSI %d, err=%d\n",
+               dev_err(dev, "Failed to get tracking for %d vectors for VSI %d, err=%d\n",
                         num_q_vectors, vsi->vsi_num, vsi->base_vector);
                 return -ENOENT;
         }
@@ -1232,8 +1230,9 @@ static void ice_vsi_set_rss_flow_fld(struct ice_vsi *vsi)
   *
   * Returns 0 on success or ENOMEM on failure.
   */
-int ice_add_mac_to_list(struct ice_vsi *vsi, struct list_head *add_list,
-                       const u8 *macaddr)
+int
+ice_add_mac_to_list(struct ice_vsi *vsi, struct list_head *add_list,
+                   const u8 *macaddr)
  {
         struct ice_fltr_list_entry *tmp;
         struct ice_pf *pf = vsi->back;
@@ -1392,12 +1391,10 @@ int ice_vsi_kill_vlan(struct ice_vsi *vsi, u16 vid)
  
         status = ice_remove_vlan(&pf->hw, &tmp_add_list);
         if (status == ICE_ERR_DOES_NOT_EXIST) {
-               dev_dbg(dev,
-                       "Failed to remove VLAN %d on VSI %i, it does not exist, status: %d\n",
+               dev_dbg(dev, "Failed to remove VLAN %d on VSI %i, it does not exist, status: %d\n",
                         vid, vsi->vsi_num, status);
         } else if (status) {
-               dev_err(dev,
-                       "Error removing VLAN %d on vsi %i error: %d\n",
+               dev_err(dev, "Error removing VLAN %d on vsi %i error: %d\n",
                         vid, vsi->vsi_num, status);
                 err = -EIO;
         }
@@ -1453,8 +1450,7 @@ setup_rings:
  
                 err = ice_setup_rx_ctx(vsi->rx_rings[i]);
                 if (err) {
-                       dev_err(&vsi->back->pdev->dev,
-                               "ice_setup_rx_ctx failed for RxQ %d, err %d\n",
+                       dev_err(ice_pf_to_dev(vsi->back), "ice_setup_rx_ctx failed for RxQ %d, err %d\n",
                                 i, err);
                         return err;
                 }
@@ -1623,7 +1619,7 @@ int ice_vsi_manage_vlan_insertion(struct ice_vsi *vsi)
  
         status = ice_update_vsi(hw, vsi->idx, ctxt, NULL);
         if (status) {
-               dev_err(&vsi->back->pdev->dev, "update VSI for VLAN insert failed, err %d aq_err %d\n",
+               dev_err(ice_pf_to_dev(vsi->back), "update VSI for VLAN insert failed, err %d aq_err %d\n",
                         status, hw->adminq.sq_last_status);
                 ret = -EIO;
                 goto out;
@@ -1669,7 +1665,7 @@ int ice_vsi_manage_vlan_stripping(struct ice_vsi *vsi, bool ena)
  
         status = ice_update_vsi(hw, vsi->idx, ctxt, NULL);
         if (status) {
-               dev_err(&vsi->back->pdev->dev, "update VSI for VLAN strip failed, ena = %d err %d aq_err %d\n",
+               dev_err(ice_pf_to_dev(vsi->back), "update VSI for VLAN strip failed, ena = %d err %d aq_err %d\n",
                         ena, status, hw->adminq.sq_last_status);
                 ret = -EIO;
                 goto out;
@@ -1834,8 +1830,7 @@ ice_vsi_set_q_vectors_reg_idx(struct ice_vsi *vsi)
                 struct ice_q_vector *q_vector = vsi->q_vectors[i];
  
                 if (!q_vector) {
-                       dev_err(&vsi->back->pdev->dev,
-                               "Failed to set reg_idx on q_vector %d VSI %d\n",
+                       dev_err(ice_pf_to_dev(vsi->back), "Failed to set reg_idx on q_vector %d VSI %d\n",
                                 i, vsi->vsi_num);
                         goto clear_reg_idx;
                 }
@@ -1898,8 +1893,7 @@ ice_vsi_add_rem_eth_mac(struct ice_vsi *vsi, bool add_rule)
                 status = ice_remove_eth_mac(&pf->hw, &tmp_add_list);
  
         if (status)
-               dev_err(dev,
-                       "Failure Adding or Removing Ethertype on VSI %i error: %d\n",
+               dev_err(dev, "Failure Adding or Removing Ethertype on VSI %i error: %d\n",
                         vsi->vsi_num, status);
  
         ice_free_fltr_list(dev, &tmp_add_list);
@@ -2384,8 +2378,7 @@ ice_get_res(struct ice_pf *pf, struct ice_res_tracker *res, u16 needed, u16 id)
                 return -EINVAL;
  
         if (!needed || needed > res->num_entries || id >= ICE_RES_VALID_BIT) {
-               dev_err(ice_pf_to_dev(pf),
-                       "param err: needed=%d, num_entries = %d id=0x%04x\n",
+               dev_err(ice_pf_to_dev(pf), "param err: needed=%d, num_entries = %d id=0x%04x\n",
                         needed, res->num_entries, id);
                 return -EINVAL;
         }
@@ -2686,7 +2679,6 @@ int ice_vsi_rebuild(struct ice_vsi *vsi, bool init_vsi)
         ice_vsi_put_qs(vsi);
         ice_vsi_clear_rings(vsi);
         ice_vsi_free_arrays(vsi);
-       ice_dev_onetime_setup(&pf->hw);
         if (vsi->type == ICE_VSI_VF)
                 ice_vsi_set_num_qs(vsi, vf->vf_id);
         else
@@ -2765,8 +2757,7 @@ int ice_vsi_rebuild(struct ice_vsi *vsi, bool init_vsi)
         status = ice_cfg_vsi_lan(vsi->port_info, vsi->idx, vsi->tc_cfg.ena_tc,
                                  max_txqs);
         if (status) {
-               dev_err(ice_pf_to_dev(pf),
-                       "VSI %d failed lan queue config, error %d\n",
+               dev_err(ice_pf_to_dev(pf), "VSI %d failed lan queue config, error %d\n",
                         vsi->vsi_num, status);
                 if (init_vsi) {
                         ret = -EIO;
@@ -2834,8 +2825,8 @@ static void ice_vsi_update_q_map(struct ice_vsi *vsi, struct ice_vsi_ctx *ctx)
  int ice_vsi_cfg_tc(struct ice_vsi *vsi, u8 ena_tc)
  {
         u16 max_txqs[ICE_MAX_TRAFFIC_CLASS] = { 0 };
-       struct ice_vsi_ctx *ctx;
         struct ice_pf *pf = vsi->back;
+       struct ice_vsi_ctx *ctx;
         enum ice_status status;
         struct device *dev;
         int i, ret = 0;
@@ -2891,25 +2882,6 @@ out:
  }
  #endif /* CONFIG_DCB */
  
-/**
- * ice_nvm_version_str - format the NVM version strings
- * @hw: ptr to the hardware info
- */
-char *ice_nvm_version_str(struct ice_hw *hw)
-{
-       u8 oem_ver, oem_patch, ver_hi, ver_lo;
-       static char buf[ICE_NVM_VER_LEN];
-       u16 oem_build;
-
-       ice_get_nvm_version(hw, &oem_ver, &oem_build, &oem_patch, &ver_hi,
-                           &ver_lo);
-
-       snprintf(buf, sizeof(buf), "%x.%02x 0x%x %d.%d.%d", ver_hi, ver_lo,
-                hw->nvm.eetrack, oem_ver, oem_build, oem_patch);
-
-       return buf;
-}
-
  /**
   * ice_update_ring_stats - Update ring statistics
   * @ring: ring to update
@@ -2981,7 +2953,7 @@ ice_vsi_cfg_mac_fltr(struct ice_vsi *vsi, const u8 *macaddr, bool set)
                 status = ice_remove_mac(&vsi->back->hw, &tmp_add_list);
  
  cfg_mac_fltr_exit:
-       ice_free_fltr_list(&vsi->back->pdev->dev, &tmp_add_list);
+       ice_free_fltr_list(ice_pf_to_dev(vsi->back), &tmp_add_list);
         return status;
  }
  
@@ -3043,16 +3015,14 @@ int ice_set_dflt_vsi(struct ice_sw *sw, struct ice_vsi *vsi)
  
         /* another VSI is already the default VSI for this switch */
         if (ice_is_dflt_vsi_in_use(sw)) {
-               dev_err(dev,
-                       "Default forwarding VSI %d already in use, disable it and try again\n",
+               dev_err(dev, "Default forwarding VSI %d already in use, disable it and try again\n",
                         sw->dflt_vsi->vsi_num);
                 return -EEXIST;
         }
  
         status = ice_cfg_dflt_vsi(&vsi->back->hw, vsi->idx, true, ICE_FLTR_RX);
         if (status) {
-               dev_err(dev,
-                       "Failed to set VSI %d as the default forwarding VSI, error %d\n",
+               dev_err(dev, "Failed to set VSI %d as the default forwarding VSI, error %d\n",
                         vsi->vsi_num, status);
                 return -EIO;
         }
@@ -3091,8 +3061,7 @@ int ice_clear_dflt_vsi(struct ice_sw *sw)
         status = ice_cfg_dflt_vsi(&dflt_vsi->back->hw, dflt_vsi->idx, false,
                                   ICE_FLTR_RX);
         if (status) {
-               dev_err(dev,
-                       "Failed to clear the default forwarding VSI %d, error %d\n",
+               dev_err(dev, "Failed to clear the default forwarding VSI %d, error %d\n",
                         dflt_vsi->vsi_num, status);
                 return -EIO;
         }
diff --git a/drivers/net/ethernet/intel/ice/ice_lib.h b/drivers/net/ethernet/intel/ice/ice_lib.h

index 68fd0d4505c26c6b200e1cb6ed3ac8625028611d..e2c0dadce9204d8310128198c0d41f461a10a159 100644
--- a/drivers/net/ethernet/intel/ice/ice_lib.h
+++ b/drivers/net/ethernet/intel/ice/ice_lib.h
@@ -97,8 +97,6 @@ void ice_vsi_cfg_frame_size(struct ice_vsi *vsi);
  
  u32 ice_intrl_usec_to_reg(u8 intrl, u8 gran);
  
-char *ice_nvm_version_str(struct ice_hw *hw);
-
  enum ice_status
  ice_vsi_cfg_mac_fltr(struct ice_vsi *vsi, const u8 *macaddr, bool set);
  
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c

index 5ae671609f98a0ac532ef5f967d7ab0b0ca8a203..5ef28052c0f8be19636a93b4a7ba7b0fca2bdb65 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -162,8 +162,7 @@ unregister:
          * had an error
          */
         if (status && vsi->netdev->reg_state == NETREG_REGISTERED) {
-               dev_err(ice_pf_to_dev(pf),
-                       "Could not add MAC filters error %d. Unregistering device\n",
+               dev_err(ice_pf_to_dev(pf), "Could not add MAC filters error %d. Unregistering device\n",
                         status);
                 unregister_netdev(vsi->netdev);
                 free_netdev(vsi->netdev);
@@ -269,7 +268,7 @@ static int ice_cfg_promisc(struct ice_vsi *vsi, u8 promisc_m, bool set_promisc)
   */
  static int ice_vsi_sync_fltr(struct ice_vsi *vsi)
  {
-       struct device *dev = &vsi->back->pdev->dev;
+       struct device *dev = ice_pf_to_dev(vsi->back);
         struct net_device *netdev = vsi->netdev;
         bool promisc_forced_on = false;
         struct ice_pf *pf = vsi->back;
@@ -335,8 +334,7 @@ static int ice_vsi_sync_fltr(struct ice_vsi *vsi)
                     !test_and_set_bit(__ICE_FLTR_OVERFLOW_PROMISC,
                                       vsi->state)) {
                         promisc_forced_on = true;
-                       netdev_warn(netdev,
-                                   "Reached MAC filter limit, forcing promisc mode on VSI %d\n",
+                       netdev_warn(netdev, "Reached MAC filter limit, forcing promisc mode on VSI %d\n",
                                     vsi->vsi_num);
                 } else {
                         err = -EIO;
@@ -382,8 +380,7 @@ static int ice_vsi_sync_fltr(struct ice_vsi *vsi)
                         if (!ice_is_dflt_vsi_in_use(pf->first_sw)) {
                                 err = ice_set_dflt_vsi(pf->first_sw, vsi);
                                 if (err && err != -EEXIST) {
-                                       netdev_err(netdev,
-                                                  "Error %d setting default VSI %i Rx rule\n",
+                                       netdev_err(netdev, "Error %d setting default VSI %i Rx rule\n",
                                                    err, vsi->vsi_num);
                                         vsi->current_netdev_flags &=
                                                 ~IFF_PROMISC;
@@ -395,8 +392,7 @@ static int ice_vsi_sync_fltr(struct ice_vsi *vsi)
                         if (ice_is_vsi_dflt_vsi(pf->first_sw, vsi)) {
                                 err = ice_clear_dflt_vsi(pf->first_sw);
                                 if (err) {
-                                       netdev_err(netdev,
-                                                  "Error %d clearing default VSI %i Rx rule\n",
+                                       netdev_err(netdev, "Error %d clearing default VSI %i Rx rule\n",
                                                    err, vsi->vsi_num);
                                         vsi->current_netdev_flags |=
                                                 IFF_PROMISC;
@@ -752,7 +748,7 @@ void ice_print_link_msg(struct ice_vsi *vsi, bool isup)
         kfree(caps);
  
  done:
-       netdev_info(vsi->netdev, "NIC Link is up %sbps, Requested FEC: %s, FEC: %s, Autoneg: %s, Flow Control: %s\n",
+       netdev_info(vsi->netdev, "NIC Link is up %sbps Full Duplex, Requested FEC: %s, Negotiated FEC: %s, Autoneg: %s, Flow Control: %s\n",
                     speed, fec_req, fec, an, fc);
         ice_print_topo_conflict(vsi);
  }
@@ -815,8 +811,7 @@ ice_link_event(struct ice_pf *pf, struct ice_port_info *pi, bool link_up,
          */
         result = ice_update_link_info(pi);
         if (result)
-               dev_dbg(dev,
-                       "Failed to update link status and re-enable link events for port %d\n",
+               dev_dbg(dev, "Failed to update link status and re-enable link events for port %d\n",
                         pi->lport);
  
         /* if the old link up/down and speed is the same as the new */
@@ -834,13 +829,13 @@ ice_link_event(struct ice_pf *pf, struct ice_port_info *pi, bool link_up,
  
                 result = ice_aq_set_link_restart_an(pi, false, NULL);
                 if (result) {
-                       dev_dbg(dev,
-                               "Failed to set link down, VSI %d error %d\n",
+                       dev_dbg(dev, "Failed to set link down, VSI %d error %d\n",
                                 vsi->vsi_num, result);
                         return result;
                 }
         }
  
+       ice_dcb_rebuild(pf);
         ice_vsi_link_event(vsi, link_up);
         ice_print_link_msg(vsi, link_up);
  
@@ -892,15 +887,13 @@ static int ice_init_link_events(struct ice_port_info *pi)
                        ICE_AQ_LINK_EVENT_MODULE_QUAL_FAIL));
  
         if (ice_aq_set_event_mask(pi->hw, pi->lport, mask, NULL)) {
-               dev_dbg(ice_hw_to_dev(pi->hw),
-                       "Failed to set link event mask for port %d\n",
+               dev_dbg(ice_hw_to_dev(pi->hw), "Failed to set link event mask for port %d\n",
                         pi->lport);
                 return -EIO;
         }
  
         if (ice_aq_get_link_info(pi, true, NULL, NULL)) {
-               dev_dbg(ice_hw_to_dev(pi->hw),
-                       "Failed to enable link events for port %d\n",
+               dev_dbg(ice_hw_to_dev(pi->hw), "Failed to enable link events for port %d\n",
                         pi->lport);
                 return -EIO;
         }
@@ -929,8 +922,8 @@ ice_handle_link_event(struct ice_pf *pf, struct ice_rq_event_info *event)
                                 !!(link_data->link_info & ICE_AQ_LINK_UP),
                                 le16_to_cpu(link_data->link_speed));
         if (status)
-               dev_dbg(ice_pf_to_dev(pf),
-                       "Could not process link event, error %d\n", status);
+               dev_dbg(ice_pf_to_dev(pf), "Could not process link event, error %d\n",
+                       status);
  
         return status;
  }
@@ -979,13 +972,11 @@ static int __ice_clean_ctrlq(struct ice_pf *pf, enum ice_ctl_q q_type)
                         dev_dbg(dev, "%s Receive Queue VF Error detected\n",
                                 qtype);
                 if (val & PF_FW_ARQLEN_ARQOVFL_M) {
-                       dev_dbg(dev,
-                               "%s Receive Queue Overflow Error detected\n",
+                       dev_dbg(dev, "%s Receive Queue Overflow Error detected\n",
                                 qtype);
                 }
                 if (val & PF_FW_ARQLEN_ARQCRIT_M)
-                       dev_dbg(dev,
-                               "%s Receive Queue Critical Error detected\n",
+                       dev_dbg(dev, "%s Receive Queue Critical Error detected\n",
                                 qtype);
                 val &= ~(PF_FW_ARQLEN_ARQVFE_M | PF_FW_ARQLEN_ARQOVFL_M |
                          PF_FW_ARQLEN_ARQCRIT_M);
@@ -998,8 +989,8 @@ static int __ice_clean_ctrlq(struct ice_pf *pf, enum ice_ctl_q q_type)
                    PF_FW_ATQLEN_ATQCRIT_M)) {
                 oldval = val;
                 if (val & PF_FW_ATQLEN_ATQVFE_M)
-                       dev_dbg(dev,
-                               "%s Send Queue VF Error detected\n", qtype);
+                       dev_dbg(dev, "%s Send Queue VF Error detected\n",
+                               qtype);
                 if (val & PF_FW_ATQLEN_ATQOVFL_M) {
                         dev_dbg(dev, "%s Send Queue Overflow Error detected\n",
                                 qtype);
@@ -1048,8 +1039,7 @@ static int __ice_clean_ctrlq(struct ice_pf *pf, enum ice_ctl_q q_type)
                         ice_dcb_process_lldp_set_mib_change(pf, &event);
                         break;
                 default:
-                       dev_dbg(dev,
-                               "%s Receive Queue unknown event 0x%04x ignored\n",
+                       dev_dbg(dev, "%s Receive Queue unknown event 0x%04x ignored\n",
                                 qtype, opcode);
                         break;
                 }
@@ -1238,7 +1228,7 @@ static void ice_handle_mdd_event(struct ice_pf *pf)
                 u16 queue = ((reg & GL_MDET_TX_TCLAN_QNUM_M) >>
                                 GL_MDET_TX_TCLAN_QNUM_S);
  
-               if (netif_msg_rx_err(pf))
+               if (netif_msg_tx_err(pf))
                         dev_info(dev, "Malicious Driver Detection event %d on TX queue %d PF# %d VF# %d\n",
                                  event, queue, pf_num, vf_num);
                 wr32(hw, GL_MDET_TX_TCLAN, 0xffffffff);
@@ -1335,8 +1325,7 @@ static void ice_handle_mdd_event(struct ice_pf *pf)
                         vf->num_mdd_events++;
                         if (vf->num_mdd_events &&
                             vf->num_mdd_events <= ICE_MDD_EVENTS_THRESHOLD)
-                               dev_info(dev,
-                                        "VF %d has had %llu MDD events since last boot, Admin might need to reload AVF driver with this number of events\n",
+                               dev_info(dev, "VF %d has had %llu MDD events since last boot, Admin might need to reload AVF driver with this number of events\n",
                                          i, vf->num_mdd_events);
                 }
         }
@@ -1367,7 +1356,7 @@ static int ice_force_phys_link_state(struct ice_vsi *vsi, bool link_up)
         if (vsi->type != ICE_VSI_PF)
                 return 0;
  
-       dev = &vsi->back->pdev->dev;
+       dev = ice_pf_to_dev(vsi->back);
  
         pi = vsi->port_info;
  
@@ -1378,8 +1367,7 @@ static int ice_force_phys_link_state(struct ice_vsi *vsi, bool link_up)
         retcode = ice_aq_get_phy_caps(pi, false, ICE_AQC_REPORT_SW_CFG, pcaps,
                                       NULL);
         if (retcode) {
-               dev_err(dev,
-                       "Failed to get phy capabilities, VSI %d error %d\n",
+               dev_err(dev, "Failed to get phy capabilities, VSI %d error %d\n",
                         vsi->vsi_num, retcode);
                 retcode = -EIO;
                 goto out;
@@ -1649,8 +1637,8 @@ static int ice_vsi_req_irq_msix(struct ice_vsi *vsi, char *basename)
                 err = devm_request_irq(dev, irq_num, vsi->irq_handler, 0,
                                        q_vector->name, q_vector);
                 if (err) {
-                       netdev_err(vsi->netdev,
-                                  "MSIX request_irq failed, error: %d\n", err);
+                       netdev_err(vsi->netdev, "MSIX request_irq failed, error: %d\n",
+                                  err);
                         goto free_q_irqs;
                 }
  
@@ -1685,7 +1673,7 @@ free_q_irqs:
   */
  static int ice_xdp_alloc_setup_rings(struct ice_vsi *vsi)
  {
-       struct device *dev = &vsi->back->pdev->dev;
+       struct device *dev = ice_pf_to_dev(vsi->back);
         int i;
  
         for (i = 0; i < vsi->num_xdp_txq; i++) {
@@ -2664,14 +2652,12 @@ static void ice_set_pf_caps(struct ice_pf *pf)
         clear_bit(ICE_FLAG_DCB_CAPABLE, pf->flags);
         if (func_caps->common_cap.dcb)
                 set_bit(ICE_FLAG_DCB_CAPABLE, pf->flags);
-#ifdef CONFIG_PCI_IOV
         clear_bit(ICE_FLAG_SRIOV_CAPABLE, pf->flags);
         if (func_caps->common_cap.sr_iov_1_1) {
                 set_bit(ICE_FLAG_SRIOV_CAPABLE, pf->flags);
                 pf->num_vfs_supported = min_t(int, func_caps->num_allocd_vfs,
                                               ICE_MAX_VF_COUNT);
         }
-#endif /* CONFIG_PCI_IOV */
         clear_bit(ICE_FLAG_RSS_ENA, pf->flags);
         if (func_caps->common_cap.rss_table_size)
                 set_bit(ICE_FLAG_RSS_ENA, pf->flags);
@@ -2764,8 +2750,7 @@ static int ice_ena_msix_range(struct ice_pf *pf)
         }
  
         if (v_actual < v_budget) {
-               dev_warn(dev,
-                        "not enough OS MSI-X vectors. requested = %d, obtained = %d\n",
+               dev_warn(dev, "not enough OS MSI-X vectors. requested = %d, obtained = %d\n",
                          v_budget, v_actual);
  /* 2 vectors for LAN (traffic + OICR) */
  #define ICE_MIN_LAN_VECS 2
@@ -2787,8 +2772,7 @@ msix_err:
         goto exit_err;
  
  no_hw_vecs_left_err:
-       dev_err(dev,
-               "not enough device MSI-X vectors. requested = %d, available = %d\n",
+       dev_err(dev, "not enough device MSI-X vectors. requested = %d, available = %d\n",
                 needed, v_left);
         err = -ERANGE;
  exit_err:
@@ -2921,16 +2905,14 @@ ice_log_pkg_init(struct ice_hw *hw, enum ice_status *status)
                     !memcmp(hw->pkg_name, hw->active_pkg_name,
                             sizeof(hw->pkg_name))) {
                         if (hw->pkg_dwnld_status == ICE_AQ_RC_EEXIST)
-                               dev_info(dev,
-                                        "DDP package already present on device: %s version %d.%d.%d.%d\n",
+                               dev_info(dev, "DDP package already present on device: %s version %d.%d.%d.%d\n",
                                          hw->active_pkg_name,
                                          hw->active_pkg_ver.major,
                                          hw->active_pkg_ver.minor,
                                          hw->active_pkg_ver.update,
                                          hw->active_pkg_ver.draft);
                         else
-                               dev_info(dev,
-                                        "The DDP package was successfully loaded: %s version %d.%d.%d.%d\n",
+                               dev_info(dev, "The DDP package was successfully loaded: %s version %d.%d.%d.%d\n",
                                          hw->active_pkg_name,
                                          hw->active_pkg_ver.major,
                                          hw->active_pkg_ver.minor,
@@ -2938,8 +2920,7 @@ ice_log_pkg_init(struct ice_hw *hw, enum ice_status *status)
                                          hw->active_pkg_ver.draft);
                 } else if (hw->active_pkg_ver.major != ICE_PKG_SUPP_VER_MAJ ||
                            hw->active_pkg_ver.minor != ICE_PKG_SUPP_VER_MNR) {
-                       dev_err(dev,
-                               "The device has a DDP package that is not supported by the driver.  The device has package '%s' version %d.%d.x.x.  The driver requires version %d.%d.x.x.  Entering Safe Mode.\n",
+                       dev_err(dev, "The device has a DDP package that is not supported by the driver.  The device has package '%s' version %d.%d.x.x.  The driver requires version %d.%d.x.x.  Entering Safe Mode.\n",
                                 hw->active_pkg_name,
                                 hw->active_pkg_ver.major,
                                 hw->active_pkg_ver.minor,
@@ -2947,8 +2928,7 @@ ice_log_pkg_init(struct ice_hw *hw, enum ice_status *status)
                         *status = ICE_ERR_NOT_SUPPORTED;
                 } else if (hw->active_pkg_ver.major == ICE_PKG_SUPP_VER_MAJ &&
                            hw->active_pkg_ver.minor == ICE_PKG_SUPP_VER_MNR) {
-                       dev_info(dev,
-                                "The driver could not load the DDP package file because a compatible DDP package is already present on the device.  The device has package '%s' version %d.%d.%d.%d.  The package file found by the driver: '%s' version %d.%d.%d.%d.\n",
+                       dev_info(dev, "The driver could not load the DDP package file because a compatible DDP package is already present on the device.  The device has package '%s' version %d.%d.%d.%d.  The package file found by the driver: '%s' version %d.%d.%d.%d.\n",
                                  hw->active_pkg_name,
                                  hw->active_pkg_ver.major,
                                  hw->active_pkg_ver.minor,
@@ -2960,54 +2940,46 @@ ice_log_pkg_init(struct ice_hw *hw, enum ice_status *status)
                                  hw->pkg_ver.update,
                                  hw->pkg_ver.draft);
                 } else {
-                       dev_err(dev,
-                               "An unknown error occurred when loading the DDP package, please reboot the system.  If the problem persists, update the NVM.  Entering Safe Mode.\n");
+                       dev_err(dev, "An unknown error occurred when loading the DDP package, please reboot the system.  If the problem persists, update the NVM.  Entering Safe Mode.\n");
                         *status = ICE_ERR_NOT_SUPPORTED;
                 }
                 break;
         case ICE_ERR_BUF_TOO_SHORT:
                 /* fall-through */
         case ICE_ERR_CFG:
-               dev_err(dev,
-                       "The DDP package file is invalid. Entering Safe Mode.\n");
+               dev_err(dev, "The DDP package file is invalid. Entering Safe Mode.\n");
                 break;
         case ICE_ERR_NOT_SUPPORTED:
                 /* Package File version not supported */
                 if (hw->pkg_ver.major > ICE_PKG_SUPP_VER_MAJ ||
                     (hw->pkg_ver.major == ICE_PKG_SUPP_VER_MAJ &&
                      hw->pkg_ver.minor > ICE_PKG_SUPP_VER_MNR))
-                       dev_err(dev,
-                               "The DDP package file version is higher than the driver supports.  Please use an updated driver.  Entering Safe Mode.\n");
+                       dev_err(dev, "The DDP package file version is higher than the driver supports.  Please use an updated driver.  Entering Safe Mode.\n");
                 else if (hw->pkg_ver.major < ICE_PKG_SUPP_VER_MAJ ||
                          (hw->pkg_ver.major == ICE_PKG_SUPP_VER_MAJ &&
                           hw->pkg_ver.minor < ICE_PKG_SUPP_VER_MNR))
-                       dev_err(dev,
-                               "The DDP package file version is lower than the driver supports.  The driver requires version %d.%d.x.x.  Please use an updated DDP Package file.  Entering Safe Mode.\n",
+                       dev_err(dev, "The DDP package file version is lower than the driver supports.  The driver requires version %d.%d.x.x.  Please use an updated DDP Package file.  Entering Safe Mode.\n",
                                 ICE_PKG_SUPP_VER_MAJ, ICE_PKG_SUPP_VER_MNR);
                 break;
         case ICE_ERR_AQ_ERROR:
                 switch (hw->pkg_dwnld_status) {
                 case ICE_AQ_RC_ENOSEC:
                 case ICE_AQ_RC_EBADSIG:
-                       dev_err(dev,
-                               "The DDP package could not be loaded because its signature is not valid.  Please use a valid DDP Package.  Entering Safe Mode.\n");
+                       dev_err(dev, "The DDP package could not be loaded because its signature is not valid.  Please use a valid DDP Package.  Entering Safe Mode.\n");
                         return;
                 case ICE_AQ_RC_ESVN:
-                       dev_err(dev,
-                               "The DDP Package could not be loaded because its security revision is too low.  Please use an updated DDP Package.  Entering Safe Mode.\n");
+                       dev_err(dev, "The DDP Package could not be loaded because its security revision is too low.  Please use an updated DDP Package.  Entering Safe Mode.\n");
                         return;
                 case ICE_AQ_RC_EBADMAN:
                 case ICE_AQ_RC_EBADBUF:
-                       dev_err(dev,
-                               "An error occurred on the device while loading the DDP package.  The device will be reset.\n");
+                       dev_err(dev, "An error occurred on the device while loading the DDP package.  The device will be reset.\n");
                         return;
                 default:
                         break;
                 }
                 /* fall-through */
         default:
-               dev_err(dev,
-                       "An unknown error (%d) occurred when loading the DDP package.  Entering Safe Mode.\n",
+               dev_err(dev, "An unknown error (%d) occurred when loading the DDP package.  Entering Safe Mode.\n",
                         *status);
                 break;
         }
@@ -3038,8 +3010,7 @@ ice_load_pkg(const struct firmware *firmware, struct ice_pf *pf)
                 status = ice_init_pkg(hw, hw->pkg_copy, hw->pkg_size);
                 ice_log_pkg_init(hw, &status);
         } else {
-               dev_err(dev,
-                       "The DDP package file failed to load. Entering Safe Mode.\n");
+               dev_err(dev, "The DDP package file failed to load. Entering Safe Mode.\n");
         }
  
         if (status) {
@@ -3065,8 +3036,7 @@ ice_load_pkg(const struct firmware *firmware, struct ice_pf *pf)
  static void ice_verify_cacheline_size(struct ice_pf *pf)
  {
         if (rd32(&pf->hw, GLPCI_CNF2) & GLPCI_CNF2_CACHELINE_SIZE_M)
-               dev_warn(ice_pf_to_dev(pf),
-                        "%d Byte cache line assumption is invalid, driver may have Tx timeouts!\n",
+               dev_warn(ice_pf_to_dev(pf), "%d Byte cache line assumption is invalid, driver may have Tx timeouts!\n",
                          ICE_CACHE_LINE_BYTES);
  }
  
@@ -3159,8 +3129,7 @@ static void ice_request_fw(struct ice_pf *pf)
  dflt_pkg_load:
         err = request_firmware(&firmware, ICE_DDP_PKG_FILE, dev);
         if (err) {
-               dev_err(dev,
-                       "The DDP package file was not found or could not be read. Entering Safe Mode\n");
+               dev_err(dev, "The DDP package file was not found or could not be read. Entering Safe Mode\n");
                 return;
         }
  
@@ -3184,7 +3153,9 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
         struct ice_hw *hw;
         int err;
  
-       /* this driver uses devres, see Documentation/driver-api/driver-model/devres.rst */
+       /* this driver uses devres, see
+        * Documentation/driver-api/driver-model/devres.rst
+        */
         err = pcim_enable_device(pdev);
         if (err)
                 return err;
@@ -3245,11 +3216,6 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
                 goto err_exit_unroll;
         }
  
-       dev_info(dev, "firmware %d.%d.%d api %d.%d.%d nvm %s build 0x%08x\n",
-                hw->fw_maj_ver, hw->fw_min_ver, hw->fw_patch,
-                hw->api_maj_ver, hw->api_min_ver, hw->api_patch,
-                ice_nvm_version_str(hw), hw->fw_build);
-
         ice_request_fw(pf);
  
         /* if ice_request_fw fails, ICE_FLAG_ADV_FEATURES bit won't be
@@ -3257,8 +3223,7 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
          * true
          */
         if (ice_is_safe_mode(pf)) {
-               dev_err(dev,
-                       "Package download failed. Advanced features disabled - Device now in Safe Mode\n");
+               dev_err(dev, "Package download failed. Advanced features disabled - Device now in Safe Mode\n");
                 /* we already got function/device capabilities but these don't
                  * reflect what the driver needs to do in safe mode. Instead of
                  * adding conditional logic everywhere to ignore these
@@ -3335,8 +3300,7 @@ ice_probe(struct pci_dev *pdev, const struct pci_device_id __always_unused *ent)
         /* tell the firmware we are up */
         err = ice_send_version(pf);
         if (err) {
-               dev_err(dev,
-                       "probe failed sending driver version %s. error: %d\n",
+               dev_err(dev, "probe failed sending driver version %s. error: %d\n",
                         ice_drv_ver, err);
                 goto err_alloc_sw_unroll;
         }
@@ -3477,8 +3441,7 @@ static pci_ers_result_t ice_pci_err_slot_reset(struct pci_dev *pdev)
  
         err = pci_enable_device_mem(pdev);
         if (err) {
-               dev_err(&pdev->dev,
-                       "Cannot re-enable PCI device after reset, error %d\n",
+               dev_err(&pdev->dev, "Cannot re-enable PCI device after reset, error %d\n",
                         err);
                 result = PCI_ERS_RESULT_DISCONNECT;
         } else {
@@ -3497,8 +3460,7 @@ static pci_ers_result_t ice_pci_err_slot_reset(struct pci_dev *pdev)
  
         err = pci_cleanup_aer_uncorrect_error_status(pdev);
         if (err)
-               dev_dbg(&pdev->dev,
-                       "pci_cleanup_aer_uncorrect_error_status failed, error %d\n",
+               dev_dbg(&pdev->dev, "pci_cleanup_aer_uncorrect_error_status failed, error %d\n",
                         err);
                 /* non-fatal, continue */
  
@@ -3517,8 +3479,8 @@ static void ice_pci_err_resume(struct pci_dev *pdev)
         struct ice_pf *pf = pci_get_drvdata(pdev);
  
         if (!pf) {
-               dev_err(&pdev->dev,
-                       "%s failed, device is unrecoverable\n", __func__);
+               dev_err(&pdev->dev, "%s failed, device is unrecoverable\n",
+                       __func__);
                 return;
         }
  
@@ -3766,8 +3728,7 @@ ice_set_tx_maxrate(struct net_device *netdev, int queue_index, u32 maxrate)
  
         /* Validate maxrate requested is within permitted range */
         if (maxrate && (maxrate > (ICE_SCHED_MAX_BW / 1000))) {
-               netdev_err(netdev,
-                          "Invalid max rate %d specified for the queue %d\n",
+               netdev_err(netdev, "Invalid max rate %d specified for the queue %d\n",
                            maxrate, queue_index);
                 return -EINVAL;
         }
@@ -3783,8 +3744,8 @@ ice_set_tx_maxrate(struct net_device *netdev, int queue_index, u32 maxrate)
                 status = ice_cfg_q_bw_lmt(vsi->port_info, vsi->idx, tc,
                                           q_handle, ICE_MAX_BW, maxrate * 1000);
         if (status) {
-               netdev_err(netdev,
-                          "Unable to set Tx max rate, error %d\n", status);
+               netdev_err(netdev, "Unable to set Tx max rate, error %d\n",
+                          status);
                 return -EIO;
         }
  
@@ -3876,15 +3837,13 @@ ice_set_features(struct net_device *netdev, netdev_features_t features)
  
         /* Don't set any netdev advanced features with device in Safe Mode */
         if (ice_is_safe_mode(vsi->back)) {
-               dev_err(&vsi->back->pdev->dev,
-                       "Device is in Safe Mode - not enabling advanced netdev features\n");
+               dev_err(ice_pf_to_dev(vsi->back), "Device is in Safe Mode - not enabling advanced netdev features\n");
                 return ret;
         }
  
         /* Do not change setting during reset */
         if (ice_is_reset_in_progress(pf->state)) {
-               dev_err(&vsi->back->pdev->dev,
-                       "Device is resetting, changing advanced netdev features temporarily unavailable.\n");
+               dev_err(ice_pf_to_dev(vsi->back), "Device is resetting, changing advanced netdev features temporarily unavailable.\n");
                 return -EBUSY;
         }
  
@@ -4372,21 +4331,18 @@ int ice_down(struct ice_vsi *vsi)
  
         tx_err = ice_vsi_stop_lan_tx_rings(vsi, ICE_NO_RESET, 0);
         if (tx_err)
-               netdev_err(vsi->netdev,
-                          "Failed stop Tx rings, VSI %d error %d\n",
+               netdev_err(vsi->netdev, "Failed stop Tx rings, VSI %d error %d\n",
                            vsi->vsi_num, tx_err);
         if (!tx_err && ice_is_xdp_ena_vsi(vsi)) {
                 tx_err = ice_vsi_stop_xdp_tx_rings(vsi);
                 if (tx_err)
-                       netdev_err(vsi->netdev,
-                                  "Failed stop XDP rings, VSI %d error %d\n",
+                       netdev_err(vsi->netdev, "Failed stop XDP rings, VSI %d error %d\n",
                                    vsi->vsi_num, tx_err);
         }
  
         rx_err = ice_vsi_stop_rx_rings(vsi);
         if (rx_err)
-               netdev_err(vsi->netdev,
-                          "Failed stop Rx rings, VSI %d error %d\n",
+               netdev_err(vsi->netdev, "Failed stop Rx rings, VSI %d error %d\n",
                            vsi->vsi_num, rx_err);
  
         ice_napi_disable_all(vsi);
@@ -4394,8 +4350,7 @@ int ice_down(struct ice_vsi *vsi)
         if (test_bit(ICE_FLAG_LINK_DOWN_ON_CLOSE_ENA, vsi->back->flags)) {
                 link_err = ice_force_phys_link_state(vsi, false);
                 if (link_err)
-                       netdev_err(vsi->netdev,
-                                  "Failed to set physical link down, VSI %d error %d\n",
+                       netdev_err(vsi->netdev, "Failed to set physical link down, VSI %d error %d\n",
                                    vsi->vsi_num, link_err);
         }
  
@@ -4406,8 +4361,7 @@ int ice_down(struct ice_vsi *vsi)
                 ice_clean_rx_ring(vsi->rx_rings[i]);
  
         if (tx_err || rx_err || link_err) {
-               netdev_err(vsi->netdev,
-                          "Failed to close VSI 0x%04X on switch 0x%04X\n",
+               netdev_err(vsi->netdev, "Failed to close VSI 0x%04X on switch 0x%04X\n",
                            vsi->vsi_num, vsi->vsw->sw_id);
                 return -EIO;
         }
@@ -4426,7 +4380,7 @@ int ice_vsi_setup_tx_rings(struct ice_vsi *vsi)
         int i, err = 0;
  
         if (!vsi->num_txq) {
-               dev_err(&vsi->back->pdev->dev, "VSI %d has 0 Tx queues\n",
+               dev_err(ice_pf_to_dev(vsi->back), "VSI %d has 0 Tx queues\n",
                         vsi->vsi_num);
                 return -EINVAL;
         }
@@ -4457,7 +4411,7 @@ int ice_vsi_setup_rx_rings(struct ice_vsi *vsi)
         int i, err = 0;
  
         if (!vsi->num_rxq) {
-               dev_err(&vsi->back->pdev->dev, "VSI %d has 0 Rx queues\n",
+               dev_err(ice_pf_to_dev(vsi->back), "VSI %d has 0 Rx queues\n",
                         vsi->vsi_num);
                 return -EINVAL;
         }
@@ -4554,8 +4508,7 @@ static void ice_vsi_release_all(struct ice_pf *pf)
  
                 err = ice_vsi_release(pf->vsi[i]);
                 if (err)
-                       dev_dbg(ice_pf_to_dev(pf),
-                               "Failed to release pf->vsi[%d], err %d, vsi_num = %d\n",
+                       dev_dbg(ice_pf_to_dev(pf), "Failed to release pf->vsi[%d], err %d, vsi_num = %d\n",
                                 i, err, pf->vsi[i]->vsi_num);
         }
  }
@@ -4582,8 +4535,7 @@ static int ice_vsi_rebuild_by_type(struct ice_pf *pf, enum ice_vsi_type type)
                 /* rebuild the VSI */
                 err = ice_vsi_rebuild(vsi, true);
                 if (err) {
-                       dev_err(dev,
-                               "rebuild VSI failed, err %d, VSI index %d, type %s\n",
+                       dev_err(dev, "rebuild VSI failed, err %d, VSI index %d, type %s\n",
                                 err, vsi->idx, ice_vsi_type_str(type));
                         return err;
                 }
@@ -4591,8 +4543,7 @@ static int ice_vsi_rebuild_by_type(struct ice_pf *pf, enum ice_vsi_type type)
                 /* replay filters for the VSI */
                 status = ice_replay_vsi(&pf->hw, vsi->idx);
                 if (status) {
-                       dev_err(dev,
-                               "replay VSI failed, status %d, VSI index %d, type %s\n",
+                       dev_err(dev, "replay VSI failed, status %d, VSI index %d, type %s\n",
                                 status, vsi->idx, ice_vsi_type_str(type));
                         return -EIO;
                 }
@@ -4605,8 +4556,7 @@ static int ice_vsi_rebuild_by_type(struct ice_pf *pf, enum ice_vsi_type type)
                 /* enable the VSI */
                 err = ice_ena_vsi(vsi, false);
                 if (err) {
-                       dev_err(dev,
-                               "enable VSI failed, err %d, VSI index %d, type %s\n",
+                       dev_err(dev, "enable VSI failed, err %d, VSI index %d, type %s\n",
                                 err, vsi->idx, ice_vsi_type_str(type));
                         return err;
                 }
@@ -4684,8 +4634,7 @@ static void ice_rebuild(struct ice_pf *pf, enum ice_reset_req reset_type)
         }
  
         if (pf->first_sw->dflt_vsi_ena)
-               dev_info(dev,
-                        "Clearing default VSI, re-enable after reset completes\n");
+               dev_info(dev, "Clearing default VSI, re-enable after reset completes\n");
         /* clear the default VSI configuration if it exists */
         pf->first_sw->dflt_vsi = NULL;
         pf->first_sw->dflt_vsi_ena = false;
@@ -4736,8 +4685,7 @@ static void ice_rebuild(struct ice_pf *pf, enum ice_reset_req reset_type)
         /* tell the firmware we are up */
         ret = ice_send_version(pf);
         if (ret) {
-               dev_err(dev,
-                       "Rebuild failed due to error sending driver version: %d\n",
+               dev_err(dev, "Rebuild failed due to error sending driver version: %d\n",
                         ret);
                 goto err_vsi_rebuild;
         }
@@ -4993,7 +4941,7 @@ static int ice_vsi_update_bridge_mode(struct ice_vsi *vsi, u16 bmode)
  
         status = ice_update_vsi(hw, vsi->idx, ctxt, NULL);
         if (status) {
-               dev_err(&vsi->back->pdev->dev, "update VSI for bridge mode failed, bmode = %d err %d aq_err %d\n",
+               dev_err(ice_pf_to_dev(vsi->back), "update VSI for bridge mode failed, bmode = %d err %d aq_err %d\n",
                         bmode, status, hw->adminq.sq_last_status);
                 ret = -EIO;
                 goto out;
@@ -5185,8 +5133,7 @@ int ice_open(struct net_device *netdev)
         if (pi->phy.link_info.link_info & ICE_AQ_MEDIA_AVAILABLE) {
                 err = ice_force_phys_link_state(vsi, true);
                 if (err) {
-                       netdev_err(netdev,
-                                  "Failed to set physical link up, error %d\n",
+                       netdev_err(netdev, "Failed to set physical link up, error %d\n",
                                    err);
                         return err;
                 }
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index fd17ace6b226edb2b2523ba398b5d2c38371dc04..4de61dbedd36d97fac722d62374287da5651fd03 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -644,7 +644,7 @@ static bool ice_page_is_reserved(struct page *page)
   * Update the offset within page so that Rx buf will be ready to be reused.
   * For systems with PAGE_SIZE < 8192 this function will flip the page offset
   * so the second half of page assigned to Rx buffer will be used, otherwise
- * the offset is moved by the @size bytes
+ * the offset is moved by "size" bytes
   */
  static void
  ice_rx_buf_adjust_pg_offset(struct ice_rx_buf *rx_buf, unsigned int size)
@@ -1078,8 +1078,6 @@ construct_skb:
                                 skb = ice_build_skb(rx_ring, rx_buf, &xdp);
                         else
                                 skb = ice_construct_skb(rx_ring, rx_buf, &xdp);
-               } else {
-                       skb = ice_construct_skb(rx_ring, rx_buf, &xdp);
                 }
                 /* exit if we failed to retrieve a buffer */
                 if (!skb) {
@@ -1621,11 +1619,11 @@ ice_tx_map(struct ice_ring *tx_ring, struct ice_tx_buf *first,
  {
         u64 td_offset, td_tag, td_cmd;
         u16 i = tx_ring->next_to_use;
-       skb_frag_t *frag;
         unsigned int data_len, size;
         struct ice_tx_desc *tx_desc;
         struct ice_tx_buf *tx_buf;
         struct sk_buff *skb;
+       skb_frag_t *frag;
         dma_addr_t dma;
  
         td_tag = off->td_l2tag1;
@@ -1738,9 +1736,8 @@ ice_tx_map(struct ice_ring *tx_ring, struct ice_tx_buf *first,
         ice_maybe_stop_tx(tx_ring, DESC_NEEDED);
  
         /* notify HW of packet */
-       if (netif_xmit_stopped(txring_txq(tx_ring)) || !netdev_xmit_more()) {
+       if (netif_xmit_stopped(txring_txq(tx_ring)) || !netdev_xmit_more())
                 writel(i, tx_ring->tail);
-       }
  
         return;
  
@@ -2078,7 +2075,7 @@ static bool __ice_chk_linearize(struct sk_buff *skb)
         frag = &skb_shinfo(skb)->frags[0];
  
         /* Initialize size to the negative value of gso_size minus 1. We
-        * use this as the worst case scenerio in which the frag ahead
+        * use this as the worst case scenario in which the frag ahead
          * of us only provides one byte which is why we are limited to 6
          * descriptors for a single transmit as the header and previous
          * fragment are already consuming 2 descriptors.
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.h b/drivers/net/ethernet/intel/ice/ice_txrx.h
index a86270696df1c027d9070f620e741829bace25d9..7ee00a1286634a5eed591f5d22c5a29ebbc79cf3 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.h
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.h
@@ -33,8 +33,8 @@
   * frame.
   *
   * Note: For cache line sizes 256 or larger this value is going to end
- *       up negative.  In these cases we should fall back to the legacy
- *       receive path.
+ *      up negative.  In these cases we should fall back to the legacy
+ *      receive path.
   */
  #if (PAGE_SIZE < 8192)
  #define ICE_2K_TOO_SMALL_WITH_PADDING \
@@ -222,7 +222,7 @@ enum ice_rx_dtype {
  #define ICE_ITR_GRAN_S         1       /* ITR granularity is always 2us */
  #define ICE_ITR_GRAN_US                BIT(ICE_ITR_GRAN_S)
  #define ICE_ITR_MASK           0x1FFE  /* ITR register value alignment mask */
-#define ITR_REG_ALIGN(setting) __ALIGN_MASK(setting, ~ICE_ITR_MASK)
+#define ITR_REG_ALIGN(setting) ((setting) & ICE_ITR_MASK)
  
  #define ICE_ITR_ADAPTIVE_MIN_INC       0x0002
  #define ICE_ITR_ADAPTIVE_MIN_USECS     0x0002
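Note on the ITR_REG_ALIGN hunk above: the two definitions are not equivalent. The sketch below compares them in user space. It assumes unsigned arithmetic, and it reproduces the kernel's __ALIGN_MASK expansion, (((x) + (mask)) & ~(mask)), under the local name ALIGN_MASK. The helper names itr_reg_align_old, itr_reg_align_new and itr_reg_align_check are hypothetical and exist only for this illustration.

```c
#include <assert.h>

#define ICE_ITR_MASK 0x1FFEu /* value from the hunk, made unsigned here */

/* Local copy of the kernel's __ALIGN_MASK(x, mask) expansion. */
#define ALIGN_MASK(x, mask) (((x) + (mask)) & ~(mask))

/* Removed definition: __ALIGN_MASK(setting, ~ICE_ITR_MASK). */
static unsigned int itr_reg_align_old(unsigned int setting)
{
	return ALIGN_MASK(setting, ~ICE_ITR_MASK);
}

/* Added definition: (setting) & ICE_ITR_MASK. */
static unsigned int itr_reg_align_new(unsigned int setting)
{
	return setting & ICE_ITR_MASK;
}

/* Hypothetical self-check, not part of the driver. */
static void itr_reg_align_check(void)
{
	/* Even values inside the mask pass through both versions. */
	assert(itr_reg_align_old(50) == 50);
	assert(itr_reg_align_new(50) == 50);

	/* Odd values: the old form rounds up, the new form clears bit 0. */
	assert(itr_reg_align_old(51) == 52);
	assert(itr_reg_align_new(51) == 50);

	/* At the top of the range the old form wraps to 0. */
	assert(itr_reg_align_old(0x1FFF) == 0);
	assert(itr_reg_align_new(0x1FFF) == 0x1FFE);
}
```

With these inputs the new macro never returns a value larger than its argument, whereas the old one could round up or wrap. Whether that was the motivation for the change is not stated in this hunk.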
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
index 35bbc4ff603cddc4a3f2c5e951d414ff60d41c4a..6da048a6ca7c1b302ae281694807e126cfb85537 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
@@ -10,7 +10,7 @@
   */
  void ice_release_rx_desc(struct ice_ring *rx_ring, u32 val)
  {
-       u16 prev_ntu = rx_ring->next_to_use;
+       u16 prev_ntu = rx_ring->next_to_use & ~0x7;
  
         rx_ring->next_to_use = val;
  
diff --git a/drivers/net/ethernet/intel/ice/ice_type.h b/drivers/net/ethernet/intel/ice/ice_type.h
index b361ffabb0ca5ea409b3d214449e594dbd3c2854..db0ef6ba907f4930a70280c6a68a388e51e4907e 100644
--- a/drivers/net/ethernet/intel/ice/ice_type.h
+++ b/drivers/net/ethernet/intel/ice/ice_type.h
@@ -517,7 +517,7 @@ struct ice_hw {
         struct ice_fw_log_cfg fw_log;
  
  /* Device max aggregate bandwidths corresponding to the GL_PWR_MODE_CTL
- * register. Used for determining the ITR/intrl granularity during
+ * register. Used for determining the ITR/INTRL granularity during
   * initialization.
   */
  #define ICE_MAX_AGG_BW_200G    0x0
diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c b/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c
index 82b1e7a4cb920e1a4ee59bb1724f9f5007ee4da5..75c70d432c7245e1c76f1b0a39cdfa4be5d62dc9 100644
--- a/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c
@@ -199,8 +199,7 @@ static void ice_dis_vf_mappings(struct ice_vf *vf)
         if (vsi->rx_mapping_mode == ICE_VSI_MAP_CONTIG)
                 wr32(hw, VPLAN_RX_QBASE(vf->vf_id), 0);
         else
-               dev_err(dev,
-                       "Scattered mode for VF Rx queues is not yet implemented\n");
+               dev_err(dev, "Scattered mode for VF Rx queues is not yet implemented\n");
  }
  
  /**
@@ -402,8 +401,7 @@ static void ice_trigger_vf_reset(struct ice_vf *vf, bool is_vflr, bool is_pfr)
                 if ((reg & VF_TRANS_PENDING_M) == 0)
                         break;
  
-               dev_err(dev,
-                       "VF %d PCI transactions stuck\n", vf->vf_id);
+               dev_err(dev, "VF %d PCI transactions stuck\n", vf->vf_id);
                 udelay(ICE_PCI_CIAD_WAIT_DELAY_US);
         }
  }
@@ -462,7 +460,7 @@ static int ice_vsi_manage_pvid(struct ice_vsi *vsi, u16 vid, bool enable)
  
         status = ice_update_vsi(hw, vsi->idx, ctxt, NULL);
         if (status) {
-               dev_info(&vsi->back->pdev->dev, "update VSI for port VLAN failed, err %d aq_err %d\n",
+               dev_info(ice_pf_to_dev(vsi->back), "update VSI for port VLAN failed, err %d aq_err %d\n",
                          status, hw->adminq.sq_last_status);
                 ret = -EIO;
                 goto out;
@@ -1095,7 +1093,6 @@ bool ice_reset_all_vfs(struct ice_pf *pf, bool is_vflr)
          * finished resetting.
          */
         for (i = 0, v = 0; i < 10 && v < pf->num_alloc_vfs; i++) {
-
                 /* Check each VF in sequence */
                 while (v < pf->num_alloc_vfs) {
                         u32 reg;
@@ -1553,8 +1550,7 @@ ice_vc_send_msg_to_vf(struct ice_vf *vf, u32 v_opcode,
                 dev_info(dev, "VF %d failed opcode %d, retval: %d\n", vf->vf_id,
                          v_opcode, v_retval);
                 if (vf->num_inval_msgs > ICE_DFLT_NUM_INVAL_MSGS_ALLOWED) {
-                       dev_err(dev,
-                               "Number of invalid messages exceeded for VF %d\n",
+                       dev_err(dev, "Number of invalid messages exceeded for VF %d\n",
                                 vf->vf_id);
                         dev_err(dev, "Use PF Control I/F to enable the VF\n");
                         set_bit(ICE_VF_STATE_DIS, vf->vf_states);
@@ -1569,8 +1565,7 @@ ice_vc_send_msg_to_vf(struct ice_vf *vf, u32 v_opcode,
         aq_ret = ice_aq_send_msg_to_vf(&pf->hw, vf->vf_id, v_opcode, v_retval,
                                        msg, msglen, NULL);
         if (aq_ret && pf->hw.mailboxq.sq_last_status != ICE_AQ_RC_ENOSYS) {
-               dev_info(dev,
-                        "Unable to send the message to VF %d ret %d aq_err %d\n",
+               dev_info(dev, "Unable to send the message to VF %d ret %d aq_err %d\n",
                          vf->vf_id, aq_ret, pf->hw.mailboxq.sq_last_status);
                 return -EIO;
         }
@@ -1878,6 +1873,48 @@ error_param:
                                      NULL, 0);
  }
  
+/**
+ * ice_wait_on_vf_reset - poll to make sure a given VF is ready after reset
+ * @vf: The VF being resseting
+ *
+ * The max poll time is about ~800ms, which is about the maximum time it takes
+ * for a VF to be reset and/or a VF driver to be removed.
+ */
+static void ice_wait_on_vf_reset(struct ice_vf *vf)
+{
+       int i;
+
+       for (i = 0; i < ICE_MAX_VF_RESET_TRIES; i++) {
+               if (test_bit(ICE_VF_STATE_INIT, vf->vf_states))
+                       break;
+               msleep(ICE_MAX_VF_RESET_SLEEP_MS);
+       }
+}
+
+/**
+ * ice_check_vf_ready_for_cfg - check if VF is ready to be configured/queried
+ * @vf: VF to check if it's ready to be configured/queried
+ *
+ * The purpose of this function is to make sure the VF is not in reset, not
+ * disabled, and initialized so it can be configured and/or queried by a host
+ * administrator.
+ */
+static int ice_check_vf_ready_for_cfg(struct ice_vf *vf)
+{
+       struct ice_pf *pf;
+
+       ice_wait_on_vf_reset(vf);
+
+       if (ice_is_vf_disabled(vf))
+               return -EINVAL;
+
+       pf = vf->pf;
+       if (ice_check_vf_init(pf, vf))
+               return -EBUSY;
+
+       return 0;
+}
+
  /**
   * ice_set_vf_spoofchk
   * @netdev: network interface device structure
@@ -1895,16 +1932,16 @@ int ice_set_vf_spoofchk(struct net_device *netdev, int vf_id, bool ena)
         enum ice_status status;
         struct device *dev;
         struct ice_vf *vf;
-       int ret = 0;
+       int ret;
  
         dev = ice_pf_to_dev(pf);
         if (ice_validate_vf_id(pf, vf_id))
                 return -EINVAL;
  
         vf = &pf->vf[vf_id];
-
-       if (ice_check_vf_init(pf, vf))
-               return -EBUSY;
+       ret = ice_check_vf_ready_for_cfg(vf);
+       if (ret)
+               return ret;
  
         vf_vsi = pf->vsi[vf->lan_vsi_idx];
         if (!vf_vsi) {
@@ -1914,8 +1951,7 @@ int ice_set_vf_spoofchk(struct net_device *netdev, int vf_id, bool ena)
         }
  
         if (vf_vsi->type != ICE_VSI_VF) {
-               netdev_err(netdev,
-                          "Type %d of VSI %d for VF %d is no ICE_VSI_VF\n",
+               netdev_err(netdev, "Type %d of VSI %d for VF %d is no ICE_VSI_VF\n",
                            vf_vsi->type, vf_vsi->vsi_num, vf->vf_id);
                 return -ENODEV;
         }
@@ -1945,8 +1981,7 @@ int ice_set_vf_spoofchk(struct net_device *netdev, int vf_id, bool ena)
  
         status = ice_update_vsi(&pf->hw, vf_vsi->idx, ctx, NULL);
         if (status) {
-               dev_err(dev,
-                       "Failed to %sable spoofchk on VF %d VSI %d\n error %d",
+               dev_err(dev, "Failed to %sable spoofchk on VF %d VSI %d\n error %d",
                         ena ? "en" : "dis", vf->vf_id, vf_vsi->vsi_num, status);
                 ret = -EIO;
                 goto out;
@@ -2063,8 +2098,7 @@ static int ice_vc_ena_qs_msg(struct ice_vf *vf, u8 *msg)
                         continue;
  
                 if (ice_vsi_ctrl_rx_ring(vsi, true, vf_q_id)) {
-                       dev_err(&vsi->back->pdev->dev,
-                               "Failed to enable Rx ring %d on VSI %d\n",
+                       dev_err(ice_pf_to_dev(vsi->back), "Failed to enable Rx ring %d on VSI %d\n",
                                 vf_q_id, vsi->vsi_num);
                         v_ret = VIRTCHNL_STATUS_ERR_PARAM;
                         goto error_param;
@@ -2166,8 +2200,7 @@ static int ice_vc_dis_qs_msg(struct ice_vf *vf, u8 *msg)
  
                         if (ice_vsi_stop_tx_ring(vsi, ICE_NO_RESET, vf->vf_id,
                                                  ring, &txq_meta)) {
-                               dev_err(&vsi->back->pdev->dev,
-                                       "Failed to stop Tx ring %d on VSI %d\n",
+                               dev_err(ice_pf_to_dev(vsi->back), "Failed to stop Tx ring %d on VSI %d\n",
                                         vf_q_id, vsi->vsi_num);
                                 v_ret = VIRTCHNL_STATUS_ERR_PARAM;
                                 goto error_param;
@@ -2193,8 +2226,7 @@ static int ice_vc_dis_qs_msg(struct ice_vf *vf, u8 *msg)
                                 continue;
  
                         if (ice_vsi_ctrl_rx_ring(vsi, false, vf_q_id)) {
-                               dev_err(&vsi->back->pdev->dev,
-                                       "Failed to stop Rx ring %d on VSI %d\n",
+                               dev_err(ice_pf_to_dev(vsi->back), "Failed to stop Rx ring %d on VSI %d\n",
                                         vf_q_id, vsi->vsi_num);
                                 v_ret = VIRTCHNL_STATUS_ERR_PARAM;
                                 goto error_param;
@@ -2357,8 +2389,7 @@ static int ice_vc_cfg_qs_msg(struct ice_vf *vf, u8 *msg)
  
         if (qci->num_queue_pairs > ICE_MAX_BASE_QS_PER_VF ||
             qci->num_queue_pairs > min_t(u16, vsi->alloc_txq, vsi->alloc_rxq)) {
-               dev_err(ice_pf_to_dev(pf),
-                       "VF-%d requesting more than supported number of queues: %d\n",
+               dev_err(ice_pf_to_dev(pf), "VF-%d requesting more than supported number of queues: %d\n",
                         vf->vf_id, min_t(u16, vsi->alloc_txq, vsi->alloc_rxq));
                 v_ret = VIRTCHNL_STATUS_ERR_PARAM;
                 goto error_param;
@@ -2570,8 +2601,7 @@ ice_vc_handle_mac_addr_msg(struct ice_vf *vf, u8 *msg, bool set)
          */
         if (set && !ice_is_vf_trusted(vf) &&
             (vf->num_mac + al->num_elements) > ICE_MAX_MACADDR_PER_VF) {
-               dev_err(ice_pf_to_dev(pf),
-                       "Can't add more MAC addresses, because VF-%d is not trusted, switch the VF to trusted mode in order to add more functionalities\n",
+               dev_err(ice_pf_to_dev(pf), "Can't add more MAC addresses, because VF-%d is not trusted, switch the VF to trusted mode in order to add more functionalities\n",
                         vf->vf_id);
                 v_ret = VIRTCHNL_STATUS_ERR_PARAM;
                 goto handle_mac_exit;
@@ -2648,8 +2678,8 @@ static int ice_vc_request_qs_msg(struct ice_vf *vf, u8 *msg)
         struct ice_pf *pf = vf->pf;
         u16 max_allowed_vf_queues;
         u16 tx_rx_queue_left;
-       u16 cur_queues;
         struct device *dev;
+       u16 cur_queues;
  
         dev = ice_pf_to_dev(pf);
         if (!test_bit(ICE_VF_STATE_ACTIVE, vf->vf_states)) {
@@ -2670,8 +2700,7 @@ static int ice_vc_request_qs_msg(struct ice_vf *vf, u8 *msg)
                 vfres->num_queue_pairs = ICE_MAX_BASE_QS_PER_VF;
         } else if (req_queues > cur_queues &&
                    req_queues - cur_queues > tx_rx_queue_left) {
-               dev_warn(dev,
-                        "VF %d requested %u more queues, but only %u left.\n",
+               dev_warn(dev, "VF %d requested %u more queues, but only %u left.\n",
                          vf->vf_id, req_queues - cur_queues, tx_rx_queue_left);
                 vfres->num_queue_pairs = min_t(u16, max_allowed_vf_queues,
                                                ICE_MAX_BASE_QS_PER_VF);
@@ -2709,7 +2738,7 @@ ice_set_vf_port_vlan(struct net_device *netdev, int vf_id, u16 vlan_id, u8 qos,
         struct ice_vsi *vsi;
         struct device *dev;
         struct ice_vf *vf;
-       int ret = 0;
+       int ret;
  
         dev = ice_pf_to_dev(pf);
         if (ice_validate_vf_id(pf, vf_id))
@@ -2727,13 +2756,15 @@ ice_set_vf_port_vlan(struct net_device *netdev, int vf_id, u16 vlan_id, u8 qos,
  
         vf = &pf->vf[vf_id];
         vsi = pf->vsi[vf->lan_vsi_idx];
-       if (ice_check_vf_init(pf, vf))
-               return -EBUSY;
+
+       ret = ice_check_vf_ready_for_cfg(vf);
+       if (ret)
+               return ret;
  
         if (le16_to_cpu(vsi->info.pvid) == vlanprio) {
                 /* duplicate request, so just return success */
                 dev_dbg(dev, "Duplicate pvid %d request\n", vlanprio);
-               return ret;
+               return 0;
         }
  
         /* If PVID, then remove all filters on the old VLAN */
@@ -2744,7 +2775,7 @@ ice_set_vf_port_vlan(struct net_device *netdev, int vf_id, u16 vlan_id, u8 qos,
         if (vlan_id || qos) {
                 ret = ice_vsi_manage_pvid(vsi, vlanprio, true);
                 if (ret)
-                       goto error_set_pvid;
+                       return ret;
         } else {
                 ice_vsi_manage_pvid(vsi, 0, false);
                 vsi->info.pvid = 0;
@@ -2757,7 +2788,7 @@ ice_set_vf_port_vlan(struct net_device *netdev, int vf_id, u16 vlan_id, u8 qos,
                 /* add new VLAN filter for each MAC */
                 ret = ice_vsi_add_vlan(vsi, vlan_id);
                 if (ret)
-                       goto error_set_pvid;
+                       return ret;
         }
  
         /* The Port VLAN needs to be saved across resets the same as the
@@ -2765,8 +2796,7 @@ ice_set_vf_port_vlan(struct net_device *netdev, int vf_id, u16 vlan_id, u8 qos,
          */
         vf->port_vlan_id = le16_to_cpu(vsi->info.pvid);
  
-error_set_pvid:
-       return ret;
+       return 0;
  }
  
  /**
@@ -2821,8 +2851,8 @@ static int ice_vc_process_vlan_msg(struct ice_vf *vf, u8 *msg, bool add_v)
         for (i = 0; i < vfl->num_elements; i++) {
                 if (vfl->vlan_id[i] > ICE_MAX_VLANID) {
                         v_ret = VIRTCHNL_STATUS_ERR_PARAM;
-                       dev_err(dev,
-                               "invalid VF VLAN id %d\n", vfl->vlan_id[i]);
+                       dev_err(dev, "invalid VF VLAN id %d\n",
+                               vfl->vlan_id[i]);
                         goto error_param;
                 }
         }
@@ -2836,8 +2866,7 @@ static int ice_vc_process_vlan_msg(struct ice_vf *vf, u8 *msg, bool add_v)
  
         if (add_v && !ice_is_vf_trusted(vf) &&
             vsi->num_vlan >= ICE_MAX_VLAN_PER_VF) {
-               dev_info(dev,
-                        "VF-%d is not trusted, switch the VF to trusted mode, in order to add more VLAN addresses\n",
+               dev_info(dev, "VF-%d is not trusted, switch the VF to trusted mode, in order to add more VLAN addresses\n",
                          vf->vf_id);
                 /* There is no need to let VF know about being not trusted,
                  * so we can just return success message here
@@ -2860,8 +2889,7 @@ static int ice_vc_process_vlan_msg(struct ice_vf *vf, u8 *msg, bool add_v)
  
                         if (!ice_is_vf_trusted(vf) &&
                             vsi->num_vlan >= ICE_MAX_VLAN_PER_VF) {
-                               dev_info(dev,
-                                        "VF-%d is not trusted, switch the VF to trusted mode, in order to add more VLAN addresses\n",
+                               dev_info(dev, "VF-%d is not trusted, switch the VF to trusted mode, in order to add more VLAN addresses\n",
                                          vf->vf_id);
                                 /* There is no need to let VF know about being
                                  * not trusted, so we can just return success
@@ -2889,8 +2917,7 @@ static int ice_vc_process_vlan_msg(struct ice_vf *vf, u8 *msg, bool add_v)
                                 status = ice_cfg_vlan_pruning(vsi, true, false);
                                 if (status) {
                                         v_ret = VIRTCHNL_STATUS_ERR_PARAM;
-                                       dev_err(dev,
-                                               "Enable VLAN pruning on VLAN ID: %d failed error-%d\n",
+                                       dev_err(dev, "Enable VLAN pruning on VLAN ID: %d failed error-%d\n",
                                                 vid, status);
                                         goto error_param;
                                 }
@@ -2903,8 +2930,7 @@ static int ice_vc_process_vlan_msg(struct ice_vf *vf, u8 *msg, bool add_v)
                                                              promisc_m, vid);
                                 if (status) {
                                         v_ret = VIRTCHNL_STATUS_ERR_PARAM;
-                                       dev_err(dev,
-                                               "Enable Unicast/multicast promiscuous mode on VLAN ID:%d failed error-%d\n",
+                                       dev_err(dev, "Enable Unicast/multicast promiscuous mode on VLAN ID:%d failed error-%d\n",
                                                 vid, status);
                                 }
                         }
@@ -3140,8 +3166,7 @@ error_handler:
         case VIRTCHNL_OP_GET_VF_RESOURCES:
                 err = ice_vc_get_vf_res_msg(vf, msg);
                 if (ice_vf_init_vlan_stripping(vf))
-                       dev_err(dev,
-                               "Failed to initialize VLAN stripping for VF %d\n",
+                       dev_err(dev, "Failed to initialize VLAN stripping for VF %d\n",
                                 vf->vf_id);
                 ice_vc_notify_vf_link_state(vf);
                 break;
@@ -3254,23 +3279,6 @@ ice_get_vf_cfg(struct net_device *netdev, int vf_id, struct ifla_vf_info *ivi)
         return 0;
  }
  
-/**
- * ice_wait_on_vf_reset
- * @vf: The VF being resseting
- *
- * Poll to make sure a given VF is ready after reset
- */
-static void ice_wait_on_vf_reset(struct ice_vf *vf)
-{
-       int i;
-
-       for (i = 0; i < ICE_MAX_VF_RESET_WAIT; i++) {
-               if (test_bit(ICE_VF_STATE_INIT, vf->vf_states))
-                       break;
-               msleep(20);
-       }
-}
-
  /**
   * ice_set_vf_mac
   * @netdev: network interface device structure
@@ -3283,29 +3291,21 @@ int ice_set_vf_mac(struct net_device *netdev, int vf_id, u8 *mac)
  {
         struct ice_pf *pf = ice_netdev_to_pf(netdev);
         struct ice_vf *vf;
-       int ret = 0;
+       int ret;
  
         if (ice_validate_vf_id(pf, vf_id))
                 return -EINVAL;
  
-       vf = &pf->vf[vf_id];
-       /* Don't set MAC on disabled VF */
-       if (ice_is_vf_disabled(vf))
-               return -EINVAL;
-
-       /* In case VF is in reset mode, wait until it is completed. Depending
-        * on factors like queue disabling routine, this could take ~250ms
-        */
-       ice_wait_on_vf_reset(vf);
-
-       if (ice_check_vf_init(pf, vf))
-               return -EBUSY;
-
         if (is_zero_ether_addr(mac) || is_multicast_ether_addr(mac)) {
                 netdev_err(netdev, "%pM not a valid unicast address\n", mac);
                 return -EINVAL;
         }
  
+       vf = &pf->vf[vf_id];
+       ret = ice_check_vf_ready_for_cfg(vf);
+       if (ret)
+               return ret;
+
         /* copy MAC into dflt_lan_addr and trigger a VF reset. The reset
          * flow will use the updated dflt_lan_addr and add a MAC filter
          * using ice_add_mac. Also set pf_set_mac to indicate that the PF has
@@ -3313,12 +3313,11 @@ int ice_set_vf_mac(struct net_device *netdev, int vf_id, u8 *mac)
          */
         ether_addr_copy(vf->dflt_lan_addr.addr, mac);
         vf->pf_set_mac = true;
-       netdev_info(netdev,
-                   "MAC on VF %d set to %pM. VF driver will be reinitialized\n",
+       netdev_info(netdev, "MAC on VF %d set to %pM. VF driver will be reinitialized\n",
                     vf_id, mac);
  
         ice_vc_reset_vf(vf);
-       return ret;
+       return 0;
  }
  
  /**
@@ -3332,25 +3331,16 @@ int ice_set_vf_mac(struct net_device *netdev, int vf_id, u8 *mac)
  int ice_set_vf_trust(struct net_device *netdev, int vf_id, bool trusted)
  {
         struct ice_pf *pf = ice_netdev_to_pf(netdev);
-       struct device *dev;
         struct ice_vf *vf;
+       int ret;
  
-       dev = ice_pf_to_dev(pf);
         if (ice_validate_vf_id(pf, vf_id))
                 return -EINVAL;
  
         vf = &pf->vf[vf_id];
-       /* Don't set Trusted Mode on disabled VF */
-       if (ice_is_vf_disabled(vf))
-               return -EINVAL;
-
-       /* In case VF is in reset mode, wait until it is completed. Depending
-        * on factors like queue disabling routine, this could take ~250ms
-        */
-       ice_wait_on_vf_reset(vf);
-
-       if (ice_check_vf_init(pf, vf))
-               return -EBUSY;
+       ret = ice_check_vf_ready_for_cfg(vf);
+       if (ret)
+               return ret;
  
         /* Check if already trusted */
         if (trusted == vf->trusted)
@@ -3358,7 +3348,7 @@ int ice_set_vf_trust(struct net_device *netdev, int vf_id, bool trusted)
  
         vf->trusted = trusted;
         ice_vc_reset_vf(vf);
-       dev_info(dev, "VF %u is now %strusted\n",
+       dev_info(ice_pf_to_dev(pf), "VF %u is now %strusted\n",
                  vf_id, trusted ? "" : "un");
  
         return 0;
@@ -3376,13 +3366,15 @@ int ice_set_vf_link_state(struct net_device *netdev, int vf_id, int link_state)
  {
         struct ice_pf *pf = ice_netdev_to_pf(netdev);
         struct ice_vf *vf;
+       int ret;
  
         if (ice_validate_vf_id(pf, vf_id))
                 return -EINVAL;
  
         vf = &pf->vf[vf_id];
-       if (ice_check_vf_init(pf, vf))
-               return -EBUSY;
+       ret = ice_check_vf_ready_for_cfg(vf);
+       if (ret)
+               return ret;
  
         switch (link_state) {
         case IFLA_VF_LINK_STATE_AUTO:
@@ -3418,14 +3410,15 @@ int ice_get_vf_stats(struct net_device *netdev, int vf_id,
         struct ice_eth_stats *stats;
         struct ice_vsi *vsi;
         struct ice_vf *vf;
+       int ret;
  
         if (ice_validate_vf_id(pf, vf_id))
                 return -EINVAL;
  
         vf = &pf->vf[vf_id];
-
-       if (ice_check_vf_init(pf, vf))
-               return -EBUSY;
+       ret = ice_check_vf_ready_for_cfg(vf);
+       if (ret)
+               return ret;
  
         vsi = pf->vsi[vf->lan_vsi_idx];
         if (!vsi)
diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.h b/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.h

index 4647d636ed36e07c921f5e4c903f43effde7bc93..ac67982751dfb517dc050331b44a4c580de96de2 100644
--- a/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.h
+++ b/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.h
@@ -38,7 +38,8 @@
  #define ICE_MAX_POLICY_INTR_PER_VF     33
  #define ICE_MIN_INTR_PER_VF            (ICE_MIN_QS_PER_VF + 1)
  #define ICE_DFLT_INTR_PER_VF           (ICE_DFLT_QS_PER_VF + 1)
-#define ICE_MAX_VF_RESET_WAIT          15
+#define ICE_MAX_VF_RESET_TRIES         40
+#define ICE_MAX_VF_RESET_SLEEP_MS      20
  
  #define ice_for_each_vf(pf, i) \
         for ((i) = 0; (i) < (pf)->num_alloc_vfs; (i)++)
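
Reviewer note: the two new constants replace a single wait count with a retry count and a per-retry sleep, which bounds the total wait at 40 * 20 = 800 ms. The loop that consumes them is not shown in this section, so the following is only a minimal user-space sketch of a bounded poll, assuming a hypothetical `fake_vf` type and `fake_vf_is_ready()` helper in place of the driver's real state check.

```c
#include <stdbool.h>

/* Values taken from the header change above. */
#define ICE_MAX_VF_RESET_TRIES     40
#define ICE_MAX_VF_RESET_SLEEP_MS  20

/* Hypothetical stand-in for a VF whose reset completes after N polls. */
struct fake_vf {
	int polls_until_ready;
};

/* Hypothetical readiness check; the driver would test VF state bits. */
static bool fake_vf_is_ready(struct fake_vf *vf)
{
	if (vf->polls_until_ready > 0) {
		vf->polls_until_ready--;
		return false;
	}
	return true;
}

/*
 * Poll up to ICE_MAX_VF_RESET_TRIES times.
 * Returns the milliseconds that would have been slept, or -1 on timeout.
 */
static int wait_for_vf_reset(struct fake_vf *vf)
{
	int i;

	for (i = 0; i < ICE_MAX_VF_RESET_TRIES; i++) {
		if (fake_vf_is_ready(vf))
			return i * ICE_MAX_VF_RESET_SLEEP_MS;
		/* A driver would sleep here for ICE_MAX_VF_RESET_SLEEP_MS. */
	}
	return -1;
}
```

With this sketch, a VF that becomes ready on the fourth poll reports 60 ms of waiting, and one that is still not ready after 40 polls reports a timeout.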
diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c

index 149dca0012bae3ab1097128c414ab79260d0009d..4d3407bbd4c48ded8c956168e794039dd71d6ebe 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.c
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
@@ -338,8 +338,8 @@ static int ice_xsk_umem_dma_map(struct ice_vsi *vsi, struct xdp_umem *umem)
                                                     DMA_BIDIRECTIONAL,
                                                     ICE_RX_DMA_ATTR);
                 if (dma_mapping_error(dev, dma)) {
-                       dev_dbg(dev,
-                               "XSK UMEM DMA mapping error on page num %d", i);
+                       dev_dbg(dev, "XSK UMEM DMA mapping error on page num %d\n",
+                               i);
                         goto out_unmap;
                 }
  
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/health.c b/drivers/net/ethernet/mellanox/mlx5/core/en/health.c

index 3a975641f902adbb5da9286e39dbe3bb84c71b61..20b907dc1e297ff697cae883f5c8e6978b1e576c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/health.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/health.c
@@ -200,7 +200,7 @@ int mlx5e_health_report(struct mlx5e_priv *priv,
         netdev_err(priv->netdev, err_str);
  
         if (!reporter)
-               return err_ctx->recover(&err_ctx->ctx);
+               return err_ctx->recover(err_ctx->ctx);
  
         return devlink_health_report(reporter, err_str, err_ctx);
  }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h b/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h

index 7c8796d9743fa5a6c66f4aa3e0017b800a121a9b..a226277b09805a75a107e069db46801ffa1c454c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
@@ -179,6 +179,14 @@ mlx5e_tx_dma_unmap(struct device *pdev, struct mlx5e_sq_dma *dma)
         }
  }
  
+static inline void mlx5e_rqwq_reset(struct mlx5e_rq *rq)
+{
+       if (rq->wq_type == MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ)
+               mlx5_wq_ll_reset(&rq->mpwqe.wq);
+       else
+               mlx5_wq_cyc_reset(&rq->wqe.wq);
+}
+
  /* SW parser related functions */
  
  struct mlx5e_swp_spec {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c

index 454d3459bd8b9639f2207dd2b5b0aef08e296aa5..21de4764d4c09b933a39b476b2509c5e3896204b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -712,6 +712,9 @@ int mlx5e_modify_rq_state(struct mlx5e_rq *rq, int curr_state, int next_state)
         if (!in)
                 return -ENOMEM;
  
+       if (curr_state == MLX5_RQC_STATE_RST && next_state == MLX5_RQC_STATE_RDY)
+               mlx5e_rqwq_reset(rq);
+
         rqc = MLX5_ADDR_OF(modify_rq_in, in, ctx);
  
         MLX5_SET(modify_rq_in, in, rq_state, curr_state);
@@ -5144,7 +5147,6 @@ static void mlx5e_nic_enable(struct mlx5e_priv *priv)
  
  static void mlx5e_nic_disable(struct mlx5e_priv *priv)
  {
-       struct net_device *netdev = priv->netdev;
         struct mlx5_core_dev *mdev = priv->mdev;
  
  #ifdef CONFIG_MLX5_CORE_EN_DCB
@@ -5165,7 +5167,7 @@ static void mlx5e_nic_disable(struct mlx5e_priv *priv)
                 mlx5e_monitor_counter_cleanup(priv);
  
         mlx5e_disable_async_events(priv);
-       mlx5_lag_remove(mdev, netdev);
+       mlx5_lag_remove(mdev);
  }
  
  int mlx5e_update_nic_rx(struct mlx5e_priv *priv)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c

index 7b48ccacebe29bdaf7a46f610869bec31e469b34..6ed307d7f191499af3f80d2f4a5a0cdb48878eb6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -1861,7 +1861,6 @@ static void mlx5e_uplink_rep_enable(struct mlx5e_priv *priv)
  
  static void mlx5e_uplink_rep_disable(struct mlx5e_priv *priv)
  {
-       struct net_device *netdev = priv->netdev;
         struct mlx5_core_dev *mdev = priv->mdev;
         struct mlx5e_rep_priv *rpriv = priv->ppriv;
  
@@ -1870,7 +1869,7 @@ static void mlx5e_uplink_rep_disable(struct mlx5e_priv *priv)
  #endif
         mlx5_notifier_unregister(mdev, &priv->events_nb);
         cancel_work_sync(&rpriv->uplink_priv.reoffload_flows_work);
-       mlx5_lag_remove(mdev, netdev);
+       mlx5_lag_remove(mdev);
  }
  
  static MLX5E_DEFINE_STATS_GRP(sw_rep, 0);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c

index 5acf60b1bbfed5ba230fb57ec28c418f16ccab44..e49acd0c5da5cf27ffe78b386da4535f4a36091a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
@@ -459,12 +459,16 @@ static void esw_destroy_legacy_table(struct mlx5_eswitch *esw)
  
  static int esw_legacy_enable(struct mlx5_eswitch *esw)
  {
-       int ret;
+       struct mlx5_vport *vport;
+       int ret, i;
  
         ret = esw_create_legacy_table(esw);
         if (ret)
                 return ret;
  
+       mlx5_esw_for_each_vf_vport(esw, i, vport, esw->esw_funcs.num_vfs)
+               vport->info.link_state = MLX5_VPORT_ADMIN_STATE_AUTO;
+
         ret = mlx5_eswitch_enable_pf_vf_vports(esw, MLX5_LEGACY_SRIOV_VPORT_EVENTS);
         if (ret)
                 esw_destroy_legacy_table(esw);
@@ -2452,25 +2456,17 @@ out:
  
  int mlx5_eswitch_get_vepa(struct mlx5_eswitch *esw, u8 *setting)
  {
-       int err = 0;
-
         if (!esw)
                 return -EOPNOTSUPP;
  
         if (!ESW_ALLOWED(esw))
                 return -EPERM;
  
-       mutex_lock(&esw->state_lock);
-       if (esw->mode != MLX5_ESWITCH_LEGACY) {
-               err = -EOPNOTSUPP;
-               goto out;
-       }
+       if (esw->mode != MLX5_ESWITCH_LEGACY)
+               return -EOPNOTSUPP;
  
         *setting = esw->fdb_table.legacy.vepa_uplink_rule ? 1 : 0;
-
-out:
-       mutex_unlock(&esw->state_lock);
-       return err;
+       return 0;
  }
  
  int mlx5_eswitch_set_vport_trust(struct mlx5_eswitch *esw,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c

index 979f13bdc203a48c2b7c53a454d9e41f1389a790..1a57b2bd74b8650c070f1c0ad80b4c7069ad502b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -1172,7 +1172,7 @@ static int esw_offloads_start(struct mlx5_eswitch *esw,
                 return -EINVAL;
         }
  
-       mlx5_eswitch_disable(esw, true);
+       mlx5_eswitch_disable(esw, false);
         mlx5_eswitch_update_num_of_vfs(esw, esw->dev->priv.sriov.num_vfs);
         err = mlx5_eswitch_enable(esw, MLX5_ESWITCH_OFFLOADS);
         if (err) {
@@ -2065,7 +2065,7 @@ static int esw_offloads_stop(struct mlx5_eswitch *esw,
  {
         int err, err1;
  
-       mlx5_eswitch_disable(esw, true);
+       mlx5_eswitch_disable(esw, false);
         err = mlx5_eswitch_enable(esw, MLX5_ESWITCH_LEGACY);
         if (err) {
                 NL_SET_ERR_MSG_MOD(extack, "Failed setting eswitch to legacy");
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_chains.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_chains.c

index c5a446e295aa7d9b522cb0a145edd199bd120c2e..4276194b633fd1ef25a00982c9bb5f2ea51b18a5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_chains.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_chains.c
@@ -35,7 +35,7 @@
  static const unsigned int ESW_POOLS[] = { 4 * 1024 * 1024,
                                           1 * 1024 * 1024,
                                           64 * 1024,
-                                         4 * 1024, };
+                                         128 };
  
  struct mlx5_esw_chains_priv {
         struct rhashtable chains_ht;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag.c

index b91eabc09fbc187460750f6bc10465ae1b3be944..8e19f6ab8393202c6c15791d9daa220c8536b512 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag.c
@@ -464,9 +464,6 @@ static int mlx5_lag_netdev_event(struct notifier_block *this,
         struct mlx5_lag *ldev;
         int changed = 0;
  
-       if (!net_eq(dev_net(ndev), &init_net))
-               return NOTIFY_DONE;
-
         if ((event != NETDEV_CHANGEUPPER) && (event != NETDEV_CHANGELOWERSTATE))
                 return NOTIFY_DONE;
  
@@ -586,8 +583,7 @@ void mlx5_lag_add(struct mlx5_core_dev *dev, struct net_device *netdev)
  
         if (!ldev->nb.notifier_call) {
                 ldev->nb.notifier_call = mlx5_lag_netdev_event;
-               if (register_netdevice_notifier_dev_net(netdev, &ldev->nb,
-                                                       &ldev->nn)) {
+               if (register_netdevice_notifier_net(&init_net, &ldev->nb)) {
                         ldev->nb.notifier_call = NULL;
                         mlx5_core_err(dev, "Failed to register LAG netdev notifier\n");
                 }
@@ -600,7 +596,7 @@ void mlx5_lag_add(struct mlx5_core_dev *dev, struct net_device *netdev)
  }
  
  /* Must be called with intf_mutex held */
-void mlx5_lag_remove(struct mlx5_core_dev *dev, struct net_device *netdev)
+void mlx5_lag_remove(struct mlx5_core_dev *dev)
  {
         struct mlx5_lag *ldev;
         int i;
@@ -620,8 +616,7 @@ void mlx5_lag_remove(struct mlx5_core_dev *dev, struct net_device *netdev)
  
         if (i == MLX5_MAX_PORTS) {
                 if (ldev->nb.notifier_call)
-                       unregister_netdevice_notifier_dev_net(netdev, &ldev->nb,
-                                                             &ldev->nn);
+                       unregister_netdevice_notifier_net(&init_net, &ldev->nb);
                 mlx5_lag_mp_cleanup(ldev);
                 cancel_delayed_work_sync(&ldev->bond_work);
                 mlx5_lag_dev_free(ldev);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag.h b/drivers/net/ethernet/mellanox/mlx5/core/lag.h

index 316ab09e26645a59b34a60e61a33db3185d3c2fb..f1068aac64067080234ddb89fdaf68631698f7fa 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag.h
@@ -44,7 +44,6 @@ struct mlx5_lag {
         struct workqueue_struct   *wq;
         struct delayed_work       bond_work;
         struct notifier_block     nb;
-       struct netdev_net_notifier      nn;
         struct lag_mp             lag_mp;
  };
  
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h

index fcce9e0fc82c8e3f2d240cfbaa1cfdd6f0a51f3f..da67b28d6e23e621bac2c014b26cf811274ed10f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
@@ -157,7 +157,7 @@ int mlx5_query_qcam_reg(struct mlx5_core_dev *mdev, u32 *qcam,
                         u8 feature_group, u8 access_reg_group);
  
  void mlx5_lag_add(struct mlx5_core_dev *dev, struct net_device *netdev);
-void mlx5_lag_remove(struct mlx5_core_dev *dev, struct net_device *netdev);
+void mlx5_lag_remove(struct mlx5_core_dev *dev);
  
  int mlx5_irq_table_init(struct mlx5_core_dev *dev);
  void mlx5_irq_table_cleanup(struct mlx5_core_dev *dev);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.c

index c6c7d1defbd788ee657f45c8e3914c6e8393243e..aade62a9ee5ce93e59c9ffb139665d9eaf277b5b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.c
@@ -2307,7 +2307,9 @@ static int dr_ste_build_src_gvmi_qpn_tag(struct mlx5dr_match_param *value,
         struct mlx5dr_cmd_vport_cap *vport_cap;
         struct mlx5dr_domain *dmn = sb->dmn;
         struct mlx5dr_cmd_caps *caps;
+       u8 *bit_mask = sb->bit_mask;
         u8 *tag = hw_ste->tag;
+       bool source_gvmi_set;
  
         DR_STE_SET_TAG(src_gvmi_qp, tag, source_qp, misc, source_sqn);
  
@@ -2328,7 +2330,8 @@ static int dr_ste_build_src_gvmi_qpn_tag(struct mlx5dr_match_param *value,
         if (!vport_cap)
                 return -EINVAL;
  
-       if (vport_cap->vport_gvmi)
+       source_gvmi_set = MLX5_GET(ste_src_gvmi_qp, bit_mask, source_gvmi);
+       if (vport_cap->vport_gvmi && source_gvmi_set)
                 MLX5_SET(ste_src_gvmi_qp, tag, source_gvmi, vport_cap->vport_gvmi);
  
         misc->source_eswitch_owner_vhca_id = 0;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c

index 3abfc81259262d94a327b24b34d00b89d5409288..c2027192e21e8c8a298e915e161ea592d17b83f6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.c
@@ -66,15 +66,20 @@ static int mlx5_cmd_dr_create_flow_table(struct mlx5_flow_root_namespace *ns,
                                          struct mlx5_flow_table *next_ft)
  {
         struct mlx5dr_table *tbl;
+       u32 flags;
         int err;
  
         if (mlx5_dr_is_fw_table(ft->flags))
                 return mlx5_fs_cmd_get_fw_cmds()->create_flow_table(ns, ft,
                                                                     log_size,
                                                                     next_ft);
+       flags = ft->flags;
+       /* turn off encap/decap if not supported for sw-str by fw */
+       if (!MLX5_CAP_FLOWTABLE(ns->dev, sw_owner_reformat_supported))
+               flags = ft->flags & ~(MLX5_FLOW_TABLE_TUNNEL_EN_REFORMAT |
+                                     MLX5_FLOW_TABLE_TUNNEL_EN_DECAP);
  
-       tbl = mlx5dr_table_create(ns->fs_dr_domain.dr_domain,
-                                 ft->level, ft->flags);
+       tbl = mlx5dr_table_create(ns->fs_dr_domain.dr_domain, ft->level, flags);
         if (!tbl) {
                 mlx5_core_err(ns->dev, "Failed creating dr flow_table\n");
                 return -EINVAL;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/wq.c b/drivers/net/ethernet/mellanox/mlx5/core/wq.c

index 02f7e4a39578a33f04e030c32e1da47a322d09aa..01f075fac2765d2ad85651fbca2bf59195dcbfe1 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/wq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/wq.c
@@ -94,6 +94,13 @@ void mlx5_wq_cyc_wqe_dump(struct mlx5_wq_cyc *wq, u16 ix, u8 nstrides)
         print_hex_dump(KERN_WARNING, "", DUMP_PREFIX_OFFSET, 16, 1, wqe, len, false);
  }
  
+void mlx5_wq_cyc_reset(struct mlx5_wq_cyc *wq)
+{
+       wq->wqe_ctr = 0;
+       wq->cur_sz = 0;
+       mlx5_wq_cyc_update_db_record(wq);
+}
+
  int mlx5_wq_qp_create(struct mlx5_core_dev *mdev, struct mlx5_wq_param *param,
                       void *qpc, struct mlx5_wq_qp *wq,
                       struct mlx5_wq_ctrl *wq_ctrl)
@@ -192,6 +199,19 @@ err_db_free:
         return err;
  }
  
+static void mlx5_wq_ll_init_list(struct mlx5_wq_ll *wq)
+{
+       struct mlx5_wqe_srq_next_seg *next_seg;
+       int i;
+
+       for (i = 0; i < wq->fbc.sz_m1; i++) {
+               next_seg = mlx5_wq_ll_get_wqe(wq, i);
+               next_seg->next_wqe_index = cpu_to_be16(i + 1);
+       }
+       next_seg = mlx5_wq_ll_get_wqe(wq, i);
+       wq->tail_next = &next_seg->next_wqe_index;
+}
+
  int mlx5_wq_ll_create(struct mlx5_core_dev *mdev, struct mlx5_wq_param *param,
                       void *wqc, struct mlx5_wq_ll *wq,
                       struct mlx5_wq_ctrl *wq_ctrl)
@@ -199,9 +219,7 @@ int mlx5_wq_ll_create(struct mlx5_core_dev *mdev, struct mlx5_wq_param *param,
         u8 log_wq_stride = MLX5_GET(wq, wqc, log_wq_stride);
         u8 log_wq_sz     = MLX5_GET(wq, wqc, log_wq_sz);
         struct mlx5_frag_buf_ctrl *fbc = &wq->fbc;
-       struct mlx5_wqe_srq_next_seg *next_seg;
         int err;
-       int i;
  
         err = mlx5_db_alloc_node(mdev, &wq_ctrl->db, param->db_numa_node);
         if (err) {
@@ -220,13 +238,7 @@ int mlx5_wq_ll_create(struct mlx5_core_dev *mdev, struct mlx5_wq_param *param,
  
         mlx5_init_fbc(wq_ctrl->buf.frags, log_wq_stride, log_wq_sz, fbc);
  
-       for (i = 0; i < fbc->sz_m1; i++) {
-               next_seg = mlx5_wq_ll_get_wqe(wq, i);
-               next_seg->next_wqe_index = cpu_to_be16(i + 1);
-       }
-       next_seg = mlx5_wq_ll_get_wqe(wq, i);
-       wq->tail_next = &next_seg->next_wqe_index;
-
+       mlx5_wq_ll_init_list(wq);
         wq_ctrl->mdev = mdev;
  
         return 0;
@@ -237,6 +249,15 @@ err_db_free:
         return err;
  }
  
+void mlx5_wq_ll_reset(struct mlx5_wq_ll *wq)
+{
+       wq->head = 0;
+       wq->wqe_ctr = 0;
+       wq->cur_sz = 0;
+       mlx5_wq_ll_init_list(wq);
+       mlx5_wq_ll_update_db_record(wq);
+}
+
  void mlx5_wq_destroy(struct mlx5_wq_ctrl *wq_ctrl)
  {
         mlx5_frag_buf_free(wq_ctrl->mdev, &wq_ctrl->buf);
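
Reviewer note: the `wq.c` change factors the linked-list setup into `mlx5_wq_ll_init_list()` so that `mlx5_wq_ll_reset()` can rebuild the chain after an RQ goes back through the reset state. Below is a minimal user-space sketch of that idea, not the driver code: `fake_wq_ll`, `fake_wqe`, `WQ_SZ` and the byte helpers are hypothetical names, and the doorbell-record update is left out.

```c
#include <stdint.h>

#define WQ_SZ 8 /* hypothetical queue size */

/* Each entry stores the index of the next entry as a big-endian u16. */
struct fake_wqe {
	uint8_t next_be[2];
};

struct fake_wq_ll {
	struct fake_wqe wqe[WQ_SZ];
	uint8_t *tail_next;   /* points at the last entry's next-index field */
	uint16_t head;
	uint16_t wqe_ctr;
	uint16_t cur_sz;
};

static void put_be16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)(v & 0xff);
}

static uint16_t get_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* Chain entry i to entry i + 1; the last entry becomes the tail. */
static void fake_wq_ll_init_list(struct fake_wq_ll *wq)
{
	int i;

	for (i = 0; i < WQ_SZ - 1; i++)
		put_be16(wq->wqe[i].next_be, (uint16_t)(i + 1));
	wq->tail_next = wq->wqe[i].next_be;
}

/* Zero the software counters and rebuild the chain from scratch. */
static void fake_wq_ll_reset(struct fake_wq_ll *wq)
{
	wq->head = 0;
	wq->wqe_ctr = 0;
	wq->cur_sz = 0;
	fake_wq_ll_init_list(wq);
}
```

The point of sharing the init helper is that a reset queue ends up in the same state as a freshly created one, regardless of how the chain was reordered while the queue was in use.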
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/wq.h b/drivers/net/ethernet/mellanox/mlx5/core/wq.h

index d9a94bc223c05d470ba1cc58a1bf2a04086e1e27..4cadc336593f1cf76321ff6f85e940ea35297a10 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/wq.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/wq.h
@@ -80,6 +80,7 @@ int mlx5_wq_cyc_create(struct mlx5_core_dev *mdev, struct mlx5_wq_param *param,
                        void *wqc, struct mlx5_wq_cyc *wq,
                        struct mlx5_wq_ctrl *wq_ctrl);
  void mlx5_wq_cyc_wqe_dump(struct mlx5_wq_cyc *wq, u16 ix, u8 nstrides);
+void mlx5_wq_cyc_reset(struct mlx5_wq_cyc *wq);
  
  int mlx5_wq_qp_create(struct mlx5_core_dev *mdev, struct mlx5_wq_param *param,
                       void *qpc, struct mlx5_wq_qp *wq,
@@ -92,6 +93,7 @@ int mlx5_cqwq_create(struct mlx5_core_dev *mdev, struct mlx5_wq_param *param,
  int mlx5_wq_ll_create(struct mlx5_core_dev *mdev, struct mlx5_wq_param *param,
                       void *wqc, struct mlx5_wq_ll *wq,
                       struct mlx5_wq_ctrl *wq_ctrl);
+void mlx5_wq_ll_reset(struct mlx5_wq_ll *wq);
  
  void mlx5_wq_destroy(struct mlx5_wq_ctrl *wq_ctrl);
  
diff --git a/drivers/net/ethernet/mellanox/mlxsw/pci_hw.h b/drivers/net/ethernet/mellanox/mlxsw/pci_hw.h

index e0d7d2d9a0c81c8df137a6b01ea17c0f9066e169..43fa8c85b5d9f85d0892adbfdf28d482bb4c7e39 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/pci_hw.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/pci_hw.h
@@ -28,7 +28,7 @@
  #define MLXSW_PCI_SW_RESET                     0xF0010
  #define MLXSW_PCI_SW_RESET_RST_BIT             BIT(0)
  #define MLXSW_PCI_SW_RESET_TIMEOUT_MSECS       900000
-#define MLXSW_PCI_SW_RESET_WAIT_MSECS          100
+#define MLXSW_PCI_SW_RESET_WAIT_MSECS          200
  #define MLXSW_PCI_FW_READY                     0xA1844
  #define MLXSW_PCI_FW_READY_MASK                        0xFFFF
  #define MLXSW_PCI_FW_READY_MAGIC               0x5E
diff --git a/drivers/net/ethernet/micrel/ks8851_mll.c b/drivers/net/ethernet/micrel/ks8851_mll.c

index a41a90c589db2d2e85863e6da3850307d5e2a488..58579baf3f7a007f65910938b7e3a7f3ddaed718 100644
--- a/drivers/net/ethernet/micrel/ks8851_mll.c
+++ b/drivers/net/ethernet/micrel/ks8851_mll.c
@@ -156,24 +156,6 @@ static int msg_enable;
   * chip is busy transferring packet data (RX/TX FIFO accesses).
   */
  
-/**
- * ks_rdreg8 - read 8 bit register from device
- * @ks   : The chip information
- * @offset: The register address
- *
- * Read a 8bit register from the chip, returning the result
- */
-static u8 ks_rdreg8(struct ks_net *ks, int offset)
-{
-       u16 data;
-       u8 shift_bit = offset & 0x03;
-       u8 shift_data = (offset & 1) << 3;
-       ks->cmd_reg_cache = (u16) offset | (u16)(BE0 << shift_bit);
-       iowrite16(ks->cmd_reg_cache, ks->hw_addr_cmd);
-       data  = ioread16(ks->hw_addr);
-       return (u8)(data >> shift_data);
-}
-
  /**
   * ks_rdreg16 - read 16 bit register from device
   * @ks   : The chip information
@@ -184,27 +166,11 @@ static u8 ks_rdreg8(struct ks_net *ks, int offset)
  
  static u16 ks_rdreg16(struct ks_net *ks, int offset)
  {
-       ks->cmd_reg_cache = (u16)offset | ((BE1 | BE0) << (offset & 0x02));
+       ks->cmd_reg_cache = (u16)offset | ((BE3 | BE2) >> (offset & 0x02));
         iowrite16(ks->cmd_reg_cache, ks->hw_addr_cmd);
         return ioread16(ks->hw_addr);
  }
  
-/**
- * ks_wrreg8 - write 8bit register value to chip
- * @ks: The chip information
- * @offset: The register address
- * @value: The value to write
- *
- */
-static void ks_wrreg8(struct ks_net *ks, int offset, u8 value)
-{
-       u8  shift_bit = (offset & 0x03);
-       u16 value_write = (u16)(value << ((offset & 1) << 3));
-       ks->cmd_reg_cache = (u16)offset | (BE0 << shift_bit);
-       iowrite16(ks->cmd_reg_cache, ks->hw_addr_cmd);
-       iowrite16(value_write, ks->hw_addr);
-}
-
  /**
   * ks_wrreg16 - write 16bit register value to chip
   * @ks: The chip information
@@ -215,7 +181,7 @@ static void ks_wrreg8(struct ks_net *ks, int offset, u8 value)
  
  static void ks_wrreg16(struct ks_net *ks, int offset, u16 value)
  {
-       ks->cmd_reg_cache = (u16)offset | ((BE1 | BE0) << (offset & 0x02));
+       ks->cmd_reg_cache = (u16)offset | ((BE3 | BE2) >> (offset & 0x02));
         iowrite16(ks->cmd_reg_cache, ks->hw_addr_cmd);
         iowrite16(value, ks->hw_addr);
  }
@@ -231,7 +197,7 @@ static inline void ks_inblk(struct ks_net *ks, u16 *wptr, u32 len)
  {
         len >>= 1;
         while (len--)
-               *wptr++ = (u16)ioread16(ks->hw_addr);
+               *wptr++ = be16_to_cpu(ioread16(ks->hw_addr));
  }
  
  /**
@@ -245,7 +211,7 @@ static inline void ks_outblk(struct ks_net *ks, u16 *wptr, u32 len)
  {
         len >>= 1;
         while (len--)
-               iowrite16(*wptr++, ks->hw_addr);
+               iowrite16(cpu_to_be16(*wptr++), ks->hw_addr);
  }
  
  static void ks_disable_int(struct ks_net *ks)
@@ -324,8 +290,7 @@ static void ks_read_config(struct ks_net *ks)
         u16 reg_data = 0;
  
         /* Regardless of bus width, 8 bit read should always work.*/
-       reg_data = ks_rdreg8(ks, KS_CCR) & 0x00FF;
-       reg_data |= ks_rdreg8(ks, KS_CCR+1) << 8;
+       reg_data = ks_rdreg16(ks, KS_CCR);
  
         /* addr/data bus are multiplexed */
         ks->sharedbus = (reg_data & CCR_SHARED) == CCR_SHARED;
@@ -429,7 +394,7 @@ static inline void ks_read_qmu(struct ks_net *ks, u16 *buf, u32 len)
  
         /* 1. set sudo DMA mode */
         ks_wrreg16(ks, KS_RXFDPR, RXFDPR_RXFPAI);
-       ks_wrreg8(ks, KS_RXQCR, (ks->rc_rxqcr | RXQCR_SDA) & 0xff);
+       ks_wrreg16(ks, KS_RXQCR, ks->rc_rxqcr | RXQCR_SDA);
  
         /* 2. read prepend data */
         /**
@@ -446,7 +411,7 @@ static inline void ks_read_qmu(struct ks_net *ks, u16 *buf, u32 len)
         ks_inblk(ks, buf, ALIGN(len, 4));
  
         /* 4. reset sudo DMA Mode */
-       ks_wrreg8(ks, KS_RXQCR, ks->rc_rxqcr);
+       ks_wrreg16(ks, KS_RXQCR, ks->rc_rxqcr);
  }
  
  /**
@@ -548,14 +513,17 @@ static irqreturn_t ks_irq(int irq, void *pw)
  {
         struct net_device *netdev = pw;
         struct ks_net *ks = netdev_priv(netdev);
+       unsigned long flags;
         u16 status;
  
+       spin_lock_irqsave(&ks->statelock, flags);
         /*this should be the first in IRQ handler */
         ks_save_cmd_reg(ks);
  
         status = ks_rdreg16(ks, KS_ISR);
         if (unlikely(!status)) {
                 ks_restore_cmd_reg(ks);
+               spin_unlock_irqrestore(&ks->statelock, flags);
                 return IRQ_NONE;
         }
  
@@ -581,6 +549,7 @@ static irqreturn_t ks_irq(int irq, void *pw)
                 ks->netdev->stats.rx_over_errors++;
         /* this should be the last in IRQ handler*/
         ks_restore_cmd_reg(ks);
+       spin_unlock_irqrestore(&ks->statelock, flags);
         return IRQ_HANDLED;
  }
  
@@ -650,6 +619,7 @@ static int ks_net_stop(struct net_device *netdev)
  
         /* shutdown RX/TX QMU */
         ks_disable_qmu(ks);
+       ks_disable_int(ks);
  
         /* set powermode to soft power down to save power */
         ks_set_powermode(ks, PMECR_PM_SOFTDOWN);
@@ -679,13 +649,13 @@ static void ks_write_qmu(struct ks_net *ks, u8 *pdata, u16 len)
         ks->txh.txw[1] = cpu_to_le16(len);
  
         /* 1. set sudo-DMA mode */
-       ks_wrreg8(ks, KS_RXQCR, (ks->rc_rxqcr | RXQCR_SDA) & 0xff);
+       ks_wrreg16(ks, KS_RXQCR, ks->rc_rxqcr | RXQCR_SDA);
         /* 2. write status/lenth info */
         ks_outblk(ks, ks->txh.txw, 4);
         /* 3. write pkt data */
         ks_outblk(ks, (u16 *)pdata, ALIGN(len, 4));
         /* 4. reset sudo-DMA mode */
-       ks_wrreg8(ks, KS_RXQCR, ks->rc_rxqcr);
+       ks_wrreg16(ks, KS_RXQCR, ks->rc_rxqcr);
         /* 5. Enqueue Tx(move the pkt from TX buffer into TXQ) */
         ks_wrreg16(ks, KS_TXQCR, TXQCR_METFE);
         /* 6. wait until TXQCR_METFE is auto-cleared */
@@ -706,10 +676,9 @@ static netdev_tx_t ks_start_xmit(struct sk_buff *skb, struct net_device *netdev)
  {
         netdev_tx_t retv = NETDEV_TX_OK;
         struct ks_net *ks = netdev_priv(netdev);
+       unsigned long flags;
  
-       disable_irq(netdev->irq);
-       ks_disable_int(ks);
-       spin_lock(&ks->statelock);
+       spin_lock_irqsave(&ks->statelock, flags);
  
         /* Extra space are required:
         *  4 byte for alignment, 4 for status/length, 4 for CRC
@@ -723,9 +692,7 @@ static netdev_tx_t ks_start_xmit(struct sk_buff *skb, struct net_device *netdev)
                 dev_kfree_skb(skb);
         } else
                 retv = NETDEV_TX_BUSY;
-       spin_unlock(&ks->statelock);
-       ks_enable_int(ks);
-       enable_irq(netdev->irq);
+       spin_unlock_irqrestore(&ks->statelock, flags);
         return retv;
  }
  
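
Reviewer note: the `ks_rdreg16()` and `ks_wrreg16()` hunks swap which byte-enable pair is selected for a given register offset. The bit values of `BE0`..`BE3` are not part of this diff, so the values below are an assumption made only for illustration. Under that assumption, the sketch compares the old and new command-word computation.

```c
#include <stdint.h>

/* Assumed byte-enable bit values; their definitions are not in this diff. */
#define BE3 0x8000
#define BE2 0x4000
#define BE1 0x2000
#define BE0 0x1000

/* Command word as computed before the patch. */
static uint16_t cmd_old(int offset)
{
	return (uint16_t)((uint16_t)offset | ((BE1 | BE0) << (offset & 0x02)));
}

/* Command word as computed after the patch. */
static uint16_t cmd_new(int offset)
{
	return (uint16_t)((uint16_t)offset | ((BE3 | BE2) >> (offset & 0x02)));
}
```

With these assumed values, an offset with bit 1 clear (for example 0x08) selects `BE1 | BE0` in the old code and `BE3 | BE2` in the new code, while an offset with bit 1 set (for example 0x0A) selects the opposite pair in each case.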
diff --git a/drivers/net/ethernet/mscc/ocelot_board.c b/drivers/net/ethernet/mscc/ocelot_board.c

index b38820849faab9d23bcd7f06e2a0637819c96647..1135a18019c77faf4ff7c42a90de630644d8364a 100644
--- a/drivers/net/ethernet/mscc/ocelot_board.c
+++ b/drivers/net/ethernet/mscc/ocelot_board.c
@@ -114,6 +114,14 @@ static irqreturn_t ocelot_xtr_irq_handler(int irq, void *arg)
                 if (err != 4)
                         break;
  
+               /* At this point the IFH was read correctly, so it is safe to
+                * presume that there is no error. The err needs to be reset
+                * otherwise a frame could come in CPU queue between the while
+                * condition and the check for error later on. And in that case
+                * the new frame is just removed and not processed.
+                */
+               err = 0;
+
                 ocelot_parse_ifh(ifh, &info);
  
                 ocelot_port = ocelot->ports[info.port];
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_dev.c b/drivers/net/ethernet/pensando/ionic/ionic_dev.c

index 87f82f36812ffaa52d5dae0d191581b61c4fda15..46107de5e6c3bfb0fcc526ef46f642595c0b7304 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_dev.c
+++ b/drivers/net/ethernet/pensando/ionic/ionic_dev.c
@@ -103,7 +103,7 @@ int ionic_heartbeat_check(struct ionic *ionic)
  {
         struct ionic_dev *idev = &ionic->idev;
         unsigned long hb_time;
-       u32 fw_status;
+       u8 fw_status;
         u32 hb;
  
         /* wait a little more than one second before testing again */
@@ -111,9 +111,12 @@ int ionic_heartbeat_check(struct ionic *ionic)
         if (time_before(hb_time, (idev->last_hb_time + ionic->watchdog_period)))
                 return 0;
  
-       /* firmware is useful only if fw_status is non-zero */
-       fw_status = ioread32(&idev->dev_info_regs->fw_status);
-       if (!fw_status)
+       /* firmware is useful only if the running bit is set and
+        * fw_status != 0xff (bad PCI read)
+        */
+       fw_status = ioread8(&idev->dev_info_regs->fw_status);
+       if (fw_status == 0xff ||
+           !(fw_status & IONIC_FW_STS_F_RUNNING))
                 return -ENXIO;
  
         /* early FW has no heartbeat, else FW will return non-zero */
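
Reviewer note: the new `fw_status` test rejects two cases, an all-ones byte (which the comment in the patch treats as a bad PCI read) and a status with the running bit clear. The predicate can be sketched in isolation as follows; `fw_status_usable()` is a hypothetical helper name, and only `IONIC_FW_STS_F_RUNNING` comes from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define IONIC_FW_STS_F_RUNNING 0x1 /* value taken from the ionic_if.h hunk */

/* Hypothetical helper mirroring the condition added to the heartbeat check. */
static bool fw_status_usable(uint8_t fw_status)
{
	/* 0xff is treated as a failed read, not as "all flags set". */
	if (fw_status == 0xff)
		return false;

	return (fw_status & IONIC_FW_STS_F_RUNNING) != 0;
}
```

The order of the two tests matters: 0xff has the running bit set, so checking the bit alone would accept a failed read.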
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_if.h b/drivers/net/ethernet/pensando/ionic/ionic_if.h

index ce07c2931a727a4e2571803f9d986dafe6c2e0b2..54547d53b0f22c62feda507af8a0fb1a14664f85 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_if.h
+++ b/drivers/net/ethernet/pensando/ionic/ionic_if.h
@@ -2445,6 +2445,7 @@ union ionic_dev_info_regs {
                 u8     version;
                 u8     asic_type;
                 u8     asic_rev;
+#define IONIC_FW_STS_F_RUNNING 0x1
                 u8     fw_status;
                 u32    fw_heartbeat;
                 char   fw_version[IONIC_DEVINFO_FWVERS_BUFLEN];
diff --git a/drivers/net/ethernet/qlogic/qede/qede.h b/drivers/net/ethernet/qlogic/qede/qede.h

index e8a1b27db84debab0aad020a869660bc310b2b80..234c6f30effb7ff12ab6d0ea2cbd3e08b4610dec 100644
--- a/drivers/net/ethernet/qlogic/qede/qede.h
+++ b/drivers/net/ethernet/qlogic/qede/qede.h
@@ -163,6 +163,8 @@ struct qede_rdma_dev {
         struct list_head entry;
         struct list_head rdma_event_list;
         struct workqueue_struct *rdma_wq;
+       struct kref refcnt;
+       struct completion event_comp;
         bool exp_recovery;
  };
  
diff --git a/drivers/net/ethernet/qlogic/qede/qede_rdma.c b/drivers/net/ethernet/qlogic/qede/qede_rdma.c

index ffabc2d2f082444e85a546db4a5b10a460a59c16..2d873ae8a234d2520a79febe99d740e3e4b95913 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_rdma.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_rdma.c
@@ -59,6 +59,9 @@ static void _qede_rdma_dev_add(struct qede_dev *edev)
  static int qede_rdma_create_wq(struct qede_dev *edev)
  {
         INIT_LIST_HEAD(&edev->rdma_info.rdma_event_list);
+       kref_init(&edev->rdma_info.refcnt);
+       init_completion(&edev->rdma_info.event_comp);
+
         edev->rdma_info.rdma_wq = create_singlethread_workqueue("rdma_wq");
         if (!edev->rdma_info.rdma_wq) {
                 DP_NOTICE(edev, "qedr: Could not create workqueue\n");
@@ -83,8 +86,23 @@ static void qede_rdma_cleanup_event(struct qede_dev *edev)
         }
  }
  
+static void qede_rdma_complete_event(struct kref *ref)
+{
+       struct qede_rdma_dev *rdma_dev =
+               container_of(ref, struct qede_rdma_dev, refcnt);
+
+       /* no more events will be added after this */
+       complete(&rdma_dev->event_comp);
+}
+
  static void qede_rdma_destroy_wq(struct qede_dev *edev)
  {
+       /* Avoid race with add_event flow, make sure it finishes before
+        * we start accessing the list and cleaning up the work
+        */
+       kref_put(&edev->rdma_info.refcnt, qede_rdma_complete_event);
+       wait_for_completion(&edev->rdma_info.event_comp);
+
         qede_rdma_cleanup_event(edev);
         destroy_workqueue(edev->rdma_info.rdma_wq);
  }
@@ -310,15 +328,24 @@ static void qede_rdma_add_event(struct qede_dev *edev,
         if (!edev->rdma_info.qedr_dev)
                 return;
  
+       /* We don't want the cleanup flow to start while we're allocating and
+        * scheduling the work
+        */
+       if (!kref_get_unless_zero(&edev->rdma_info.refcnt))
+               return; /* already being destroyed */
+
         event_node = qede_rdma_get_free_event_node(edev);
         if (!event_node)
-               return;
+               goto out;
  
         event_node->event = event;
         event_node->ptr = edev;
  
         INIT_WORK(&event_node->work, qede_rdma_handle_event);
         queue_work(edev->rdma_info.rdma_wq, &event_node->work);
+
+out:
+       kref_put(&edev->rdma_info.refcnt, qede_rdma_complete_event);
  }
  
  void qede_rdma_dev_event_open(struct qede_dev *edev)
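The qede_rdma.c hunks above pair a reference count with a completion so that teardown waits for any in-flight qede_rdma_add_event() caller. What follows is a minimal user-space sketch of that idea, assuming C11 atomics in place of the kernel's kref and completion primitives. The names struct event_gate and gate_* are hypothetical and exist only for illustration; the done flag merely records that the last reference was dropped, whereas the driver blocks in wait_for_completion().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the refcnt/event_comp pair added to qede_rdma_dev. */
struct event_gate {
	atomic_int refcnt;	/* starts at 1: the owner's reference */
	atomic_bool done;	/* set when the last reference is dropped */
};

static void gate_init(struct event_gate *g)
{
	atomic_init(&g->refcnt, 1);
	atomic_init(&g->done, false);
}

/* Analogue of kref_get_unless_zero(): take a reference only while the
 * count is still nonzero, i.e. while teardown has not finished. */
static bool gate_get_unless_zero(struct event_gate *g)
{
	int old = atomic_load(&g->refcnt);

	while (old != 0) {
		/* On failure, old is reloaded and the zero check repeats. */
		if (atomic_compare_exchange_weak(&g->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Analogue of kref_put() with a release callback that signals completion. */
static void gate_put(struct event_gate *g)
{
	int prev = atomic_fetch_sub(&g->refcnt, 1);

	assert(prev > 0);
	if (prev == 1)
		atomic_store(&g->done, true);
}

/* Analogue of the add_event flow: queue work only while holding a reference. */
static bool gate_add_event(struct event_gate *g, int *queued)
{
	if (!gate_get_unless_zero(g))
		return false;	/* already being destroyed */
	(*queued)++;		/* stands in for queue_work() */
	gate_put(g);
	return true;
}
```

In this sketch the destroy path would call gate_put() once to drop the initial reference; after that, gate_add_event() refuses new work, which mirrors the early return added to qede_rdma_add_event().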
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c
index 06de59521fc4a5e0d8861672525ab39406f98f18..fbf4cbcf1a6544b7fca722721bb94f1579493447 100644
--- a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.c
@@ -13,25 +13,6 @@
  #include "rmnet_vnd.h"
  #include "rmnet_private.h"
  
-/* Locking scheme -
- * The shared resource which needs to be protected is realdev->rx_handler_data.
- * For the writer path, this is using rtnl_lock(). The writer paths are
- * rmnet_newlink(), rmnet_dellink() and rmnet_force_unassociate_device(). These
- * paths are already called with rtnl_lock() acquired in. There is also an
- * ASSERT_RTNL() to ensure that we are calling with rtnl acquired. For
- * dereference here, we will need to use rtnl_dereference(). Dev list writing
- * needs to happen with rtnl_lock() acquired for netdev_master_upper_dev_link().
- * For the reader path, the real_dev->rx_handler_data is called in the TX / RX
- * path. We only need rcu_read_lock() for these scenarios. In these cases,
- * the rcu_read_lock() is held in __dev_queue_xmit() and
- * netif_receive_skb_internal(), so readers need to use rcu_dereference_rtnl()
- * to get the relevant information. For dev list reading, we again acquire
- * rcu_read_lock() in rmnet_dellink() for netdev_master_upper_dev_get_rcu().
- * We also use unregister_netdevice_many() to free all rmnet devices in
- * rmnet_force_unassociate_device() so we dont lose the rtnl_lock() and free in
- * same context.
- */
-
  /* Local Definitions and Declarations */
  
  static const struct nla_policy rmnet_policy[IFLA_RMNET_MAX + 1] = {
@@ -51,9 +32,10 @@ rmnet_get_port_rtnl(const struct net_device *real_dev)
         return rtnl_dereference(real_dev->rx_handler_data);
  }
  
-static int rmnet_unregister_real_device(struct net_device *real_dev,
-                                       struct rmnet_port *port)
+static int rmnet_unregister_real_device(struct net_device *real_dev)
  {
+       struct rmnet_port *port = rmnet_get_port_rtnl(real_dev);
+
         if (port->nr_rmnet_devs)
                 return -EINVAL;
  
@@ -61,9 +43,6 @@ static int rmnet_unregister_real_device(struct net_device *real_dev,
  
         kfree(port);
  
-       /* release reference on real_dev */
-       dev_put(real_dev);
-
         netdev_dbg(real_dev, "Removed from rmnet\n");
         return 0;
  }
@@ -89,9 +68,6 @@ static int rmnet_register_real_device(struct net_device *real_dev)
                 return -EBUSY;
         }
  
-       /* hold on to real dev for MAP data */
-       dev_hold(real_dev);
-
         for (entry = 0; entry < RMNET_MAX_LOGICAL_EP; entry++)
                 INIT_HLIST_HEAD(&port->muxed_ep[entry]);
  
@@ -99,28 +75,33 @@ static int rmnet_register_real_device(struct net_device *real_dev)
         return 0;
  }
  
-static void rmnet_unregister_bridge(struct net_device *dev,
-                                   struct rmnet_port *port)
+static void rmnet_unregister_bridge(struct rmnet_port *port)
  {
-       struct rmnet_port *bridge_port;
-       struct net_device *bridge_dev;
+       struct net_device *bridge_dev, *real_dev, *rmnet_dev;
+       struct rmnet_port *real_port;
  
         if (port->rmnet_mode != RMNET_EPMODE_BRIDGE)
                 return;
  
-       /* bridge slave handling */
+       rmnet_dev = port->rmnet_dev;
         if (!port->nr_rmnet_devs) {
-               bridge_dev = port->bridge_ep;
+               /* bridge device */
+               real_dev = port->bridge_ep;
+               bridge_dev = port->dev;
  
-               bridge_port = rmnet_get_port_rtnl(bridge_dev);
-               bridge_port->bridge_ep = NULL;
-               bridge_port->rmnet_mode = RMNET_EPMODE_VND;
+               real_port = rmnet_get_port_rtnl(real_dev);
+               real_port->bridge_ep = NULL;
+               real_port->rmnet_mode = RMNET_EPMODE_VND;
         } else {
+               /* real device */
                 bridge_dev = port->bridge_ep;
  
-               bridge_port = rmnet_get_port_rtnl(bridge_dev);
-               rmnet_unregister_real_device(bridge_dev, bridge_port);
+               port->bridge_ep = NULL;
+               port->rmnet_mode = RMNET_EPMODE_VND;
         }
+
+       netdev_upper_dev_unlink(bridge_dev, rmnet_dev);
+       rmnet_unregister_real_device(bridge_dev);
  }
  
  static int rmnet_newlink(struct net *src_net, struct net_device *dev,
@@ -135,6 +116,11 @@ static int rmnet_newlink(struct net *src_net, struct net_device *dev,
         int err = 0;
         u16 mux_id;
  
+       if (!tb[IFLA_LINK]) {
+               NL_SET_ERR_MSG_MOD(extack, "link not specified");
+               return -EINVAL;
+       }
+
         real_dev = __dev_get_by_index(src_net, nla_get_u32(tb[IFLA_LINK]));
         if (!real_dev || !dev)
                 return -ENODEV;
@@ -157,7 +143,12 @@ static int rmnet_newlink(struct net *src_net, struct net_device *dev,
         if (err)
                 goto err1;
  
+       err = netdev_upper_dev_link(real_dev, dev, extack);
+       if (err < 0)
+               goto err2;
+
         port->rmnet_mode = mode;
+       port->rmnet_dev = dev;
  
         hlist_add_head_rcu(&ep->hlnode, &port->muxed_ep[mux_id]);
  
@@ -173,8 +164,11 @@ static int rmnet_newlink(struct net *src_net, struct net_device *dev,
  
         return 0;
  
+err2:
+       unregister_netdevice(dev);
+       rmnet_vnd_dellink(mux_id, port, ep);
  err1:
-       rmnet_unregister_real_device(real_dev, port);
+       rmnet_unregister_real_device(real_dev);
  err0:
         kfree(ep);
         return err;
@@ -183,77 +177,74 @@ err0:
  static void rmnet_dellink(struct net_device *dev, struct list_head *head)
  {
         struct rmnet_priv *priv = netdev_priv(dev);
-       struct net_device *real_dev;
+       struct net_device *real_dev, *bridge_dev;
+       struct rmnet_port *real_port, *bridge_port;
         struct rmnet_endpoint *ep;
-       struct rmnet_port *port;
-       u8 mux_id;
+       u8 mux_id = priv->mux_id;
  
         real_dev = priv->real_dev;
  
-       if (!real_dev || !rmnet_is_real_dev_registered(real_dev))
+       if (!rmnet_is_real_dev_registered(real_dev))
                 return;
  
-       port = rmnet_get_port_rtnl(real_dev);
-
-       mux_id = rmnet_vnd_get_mux(dev);
+       real_port = rmnet_get_port_rtnl(real_dev);
+       bridge_dev = real_port->bridge_ep;
+       if (bridge_dev) {
+               bridge_port = rmnet_get_port_rtnl(bridge_dev);
+               rmnet_unregister_bridge(bridge_port);
+       }
  
-       ep = rmnet_get_endpoint(port, mux_id);
+       ep = rmnet_get_endpoint(real_port, mux_id);
         if (ep) {
                 hlist_del_init_rcu(&ep->hlnode);
-               rmnet_unregister_bridge(dev, port);
-               rmnet_vnd_dellink(mux_id, port, ep);
+               rmnet_vnd_dellink(mux_id, real_port, ep);
                 kfree(ep);
         }
-       rmnet_unregister_real_device(real_dev, port);
  
+       netdev_upper_dev_unlink(real_dev, dev);
+       rmnet_unregister_real_device(real_dev);
         unregister_netdevice_queue(dev, head);
  }
  
-static void rmnet_force_unassociate_device(struct net_device *dev)
+static void rmnet_force_unassociate_device(struct net_device *real_dev)
  {
-       struct net_device *real_dev = dev;
         struct hlist_node *tmp_ep;
         struct rmnet_endpoint *ep;
         struct rmnet_port *port;
         unsigned long bkt_ep;
         LIST_HEAD(list);
  
-       if (!rmnet_is_real_dev_registered(real_dev))
-               return;
-
-       ASSERT_RTNL();
-
-       port = rmnet_get_port_rtnl(dev);
-
-       rcu_read_lock();
-       rmnet_unregister_bridge(dev, port);
-
-       hash_for_each_safe(port->muxed_ep, bkt_ep, tmp_ep, ep, hlnode) {
-               unregister_netdevice_queue(ep->egress_dev, &list);
-               rmnet_vnd_dellink(ep->mux_id, port, ep);
+       port = rmnet_get_port_rtnl(real_dev);
  
-               hlist_del_init_rcu(&ep->hlnode);
-               kfree(ep);
+       if (port->nr_rmnet_devs) {
+               /* real device */
+               rmnet_unregister_bridge(port);
+               hash_for_each_safe(port->muxed_ep, bkt_ep, tmp_ep, ep, hlnode) {
+                       unregister_netdevice_queue(ep->egress_dev, &list);
+                       netdev_upper_dev_unlink(real_dev, ep->egress_dev);
+                       rmnet_vnd_dellink(ep->mux_id, port, ep);
+                       hlist_del_init_rcu(&ep->hlnode);
+                       kfree(ep);
+               }
+               rmnet_unregister_real_device(real_dev);
+               unregister_netdevice_many(&list);
+       } else {
+               rmnet_unregister_bridge(port);
         }
-
-       rcu_read_unlock();
-       unregister_netdevice_many(&list);
-
-       rmnet_unregister_real_device(real_dev, port);
  }
  
  static int rmnet_config_notify_cb(struct notifier_block *nb,
                                   unsigned long event, void *data)
  {
-       struct net_device *dev = netdev_notifier_info_to_dev(data);
+       struct net_device *real_dev = netdev_notifier_info_to_dev(data);
  
-       if (!dev)
+       if (!rmnet_is_real_dev_registered(real_dev))
                 return NOTIFY_DONE;
  
         switch (event) {
         case NETDEV_UNREGISTER:
-               netdev_dbg(dev, "Kernel unregister\n");
-               rmnet_force_unassociate_device(dev);
+               netdev_dbg(real_dev, "Kernel unregister\n");
+               rmnet_force_unassociate_device(real_dev);
                 break;
  
         default:
@@ -295,16 +286,18 @@ static int rmnet_changelink(struct net_device *dev, struct nlattr *tb[],
         if (!dev)
                 return -ENODEV;
  
-       real_dev = __dev_get_by_index(dev_net(dev),
-                                     nla_get_u32(tb[IFLA_LINK]));
-
-       if (!real_dev || !rmnet_is_real_dev_registered(real_dev))
+       real_dev = priv->real_dev;
+       if (!rmnet_is_real_dev_registered(real_dev))
                 return -ENODEV;
  
         port = rmnet_get_port_rtnl(real_dev);
  
         if (data[IFLA_RMNET_MUX_ID]) {
                 mux_id = nla_get_u16(data[IFLA_RMNET_MUX_ID]);
+               if (rmnet_get_endpoint(port, mux_id)) {
+                       NL_SET_ERR_MSG_MOD(extack, "MUX ID already exists");
+                       return -EINVAL;
+               }
                 ep = rmnet_get_endpoint(port, priv->mux_id);
                 if (!ep)
                         return -ENODEV;
@@ -379,11 +372,10 @@ struct rtnl_link_ops rmnet_link_ops __read_mostly = {
         .fill_info      = rmnet_fill_info,
  };
  
-/* Needs either rcu_read_lock() or rtnl lock */
-struct rmnet_port *rmnet_get_port(struct net_device *real_dev)
+struct rmnet_port *rmnet_get_port_rcu(struct net_device *real_dev)
  {
         if (rmnet_is_real_dev_registered(real_dev))
-               return rcu_dereference_rtnl(real_dev->rx_handler_data);
+               return rcu_dereference_bh(real_dev->rx_handler_data);
         else
                 return NULL;
  }
@@ -409,7 +401,7 @@ int rmnet_add_bridge(struct net_device *rmnet_dev,
         struct rmnet_port *port, *slave_port;
         int err;
  
-       port = rmnet_get_port(real_dev);
+       port = rmnet_get_port_rtnl(real_dev);
  
         /* If there is more than one rmnet dev attached, its probably being
          * used for muxing. Skip the briding in that case
@@ -417,6 +409,9 @@ int rmnet_add_bridge(struct net_device *rmnet_dev,
         if (port->nr_rmnet_devs > 1)
                 return -EINVAL;
  
+       if (port->rmnet_mode != RMNET_EPMODE_VND)
+               return -EINVAL;
+
         if (rmnet_is_real_dev_registered(slave_dev))
                 return -EBUSY;
  
@@ -424,9 +419,17 @@ int rmnet_add_bridge(struct net_device *rmnet_dev,
         if (err)
                 return -EBUSY;
  
-       slave_port = rmnet_get_port(slave_dev);
+       err = netdev_master_upper_dev_link(slave_dev, rmnet_dev, NULL, NULL,
+                                          extack);
+       if (err) {
+               rmnet_unregister_real_device(slave_dev);
+               return err;
+       }
+
+       slave_port = rmnet_get_port_rtnl(slave_dev);
         slave_port->rmnet_mode = RMNET_EPMODE_BRIDGE;
         slave_port->bridge_ep = real_dev;
+       slave_port->rmnet_dev = rmnet_dev;
  
         port->rmnet_mode = RMNET_EPMODE_BRIDGE;
         port->bridge_ep = slave_dev;
@@ -438,16 +441,9 @@ int rmnet_add_bridge(struct net_device *rmnet_dev,
  int rmnet_del_bridge(struct net_device *rmnet_dev,
                      struct net_device *slave_dev)
  {
-       struct rmnet_priv *priv = netdev_priv(rmnet_dev);
-       struct net_device *real_dev = priv->real_dev;
-       struct rmnet_port *port, *slave_port;
+       struct rmnet_port *port = rmnet_get_port_rtnl(slave_dev);
  
-       port = rmnet_get_port(real_dev);
-       port->rmnet_mode = RMNET_EPMODE_VND;
-       port->bridge_ep = NULL;
-
-       slave_port = rmnet_get_port(slave_dev);
-       rmnet_unregister_real_device(slave_dev, slave_port);
+       rmnet_unregister_bridge(port);
  
         netdev_dbg(slave_dev, "removed from rmnet as slave\n");
         return 0;
@@ -473,8 +469,8 @@ static int __init rmnet_init(void)
  
  static void __exit rmnet_exit(void)
  {
-       unregister_netdevice_notifier(&rmnet_dev_notifier);
         rtnl_link_unregister(&rmnet_link_ops);
+       unregister_netdevice_notifier(&rmnet_dev_notifier);
  }
  
  module_init(rmnet_init)
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h
index cd0a6bcbe74ade8247a428345615240faa6d9169..be515982d6286e8d9ca676e8a316b1f8e4606765 100644
--- a/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_config.h
@@ -28,6 +28,7 @@ struct rmnet_port {
         u8 rmnet_mode;
         struct hlist_head muxed_ep[RMNET_MAX_LOGICAL_EP];
         struct net_device *bridge_ep;
+       struct net_device *rmnet_dev;
  };
  
  extern struct rtnl_link_ops rmnet_link_ops;
@@ -65,7 +66,7 @@ struct rmnet_priv {
         struct rmnet_priv_stats stats;
  };
  
-struct rmnet_port *rmnet_get_port(struct net_device *real_dev);
+struct rmnet_port *rmnet_get_port_rcu(struct net_device *real_dev);
  struct rmnet_endpoint *rmnet_get_endpoint(struct rmnet_port *port, u8 mux_id);
  int rmnet_add_bridge(struct net_device *rmnet_dev,
                      struct net_device *slave_dev,
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c
index 1b74bc16040274f21432aba798898e688e716eb5..29a7bfa2584dc95b05fbcc7de61a5c912739b435 100644
--- a/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_handlers.c
@@ -159,6 +159,9 @@ static int rmnet_map_egress_handler(struct sk_buff *skb,
  static void
  rmnet_bridge_handler(struct sk_buff *skb, struct net_device *bridge_dev)
  {
+       if (skb_mac_header_was_set(skb))
+               skb_push(skb, skb->mac_len);
+
         if (bridge_dev) {
                 skb->dev = bridge_dev;
                 dev_queue_xmit(skb);
@@ -184,7 +187,7 @@ rx_handler_result_t rmnet_rx_handler(struct sk_buff **pskb)
                 return RX_HANDLER_PASS;
  
         dev = skb->dev;
-       port = rmnet_get_port(dev);
+       port = rmnet_get_port_rcu(dev);
  
         switch (port->rmnet_mode) {
         case RMNET_EPMODE_VND:
@@ -217,7 +220,7 @@ void rmnet_egress_handler(struct sk_buff *skb)
         skb->dev = priv->real_dev;
         mux_id = priv->mux_id;
  
-       port = rmnet_get_port(skb->dev);
+       port = rmnet_get_port_rcu(skb->dev);
         if (!port)
                 goto drop;
  
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c
index 509dfc895a33ee2d0d5395b18d69bd998468dc1a..26ad40f19c64caae7476c58fb24cf7d3efe4cd13 100644
--- a/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c
@@ -266,14 +266,6 @@ int rmnet_vnd_dellink(u8 id, struct rmnet_port *port,
         return 0;
  }
  
-u8 rmnet_vnd_get_mux(struct net_device *rmnet_dev)
-{
-       struct rmnet_priv *priv;
-
-       priv = netdev_priv(rmnet_dev);
-       return priv->mux_id;
-}
-
  int rmnet_vnd_do_flow_control(struct net_device *rmnet_dev, int enable)
  {
         netdev_dbg(rmnet_dev, "Setting VND TX queue state to %d\n", enable);
diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.h b/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.h
index 54cbaf3c3bc4309347b5966548a3f4c49d5644f2..14d77c709d4adf63e93ba3220d2818873b2d6bab 100644
--- a/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.h
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.h
@@ -16,6 +16,5 @@ int rmnet_vnd_dellink(u8 id, struct rmnet_port *port,
                       struct rmnet_endpoint *ep);
  void rmnet_vnd_rx_fixup(struct sk_buff *skb, struct net_device *dev);
  void rmnet_vnd_tx_fixup(struct sk_buff *skb, struct net_device *dev);
-u8 rmnet_vnd_get_mux(struct net_device *rmnet_dev);
  void rmnet_vnd_setup(struct net_device *dev);
  #endif /* _RMNET_VND_H_ */
diff --git a/drivers/net/ethernet/sfc/ptp.c b/drivers/net/ethernet/sfc/ptp.c
index af15a737c675611919a65d5394447900dbac3f26..59b4f16896a81e561ce6df586c255b6a8639bf4e 100644
--- a/drivers/net/ethernet/sfc/ptp.c
+++ b/drivers/net/ethernet/sfc/ptp.c
@@ -560,13 +560,45 @@ efx_ptp_mac_nic_to_ktime_correction(struct efx_nic *efx,
                                     u32 nic_major, u32 nic_minor,
                                     s32 correction)
  {
+       u32 sync_timestamp;
         ktime_t kt = { 0 };
+       s16 delta;
  
         if (!(nic_major & 0x80000000)) {
                 WARN_ON_ONCE(nic_major >> 16);
-               /* Use the top bits from the latest sync event. */
-               nic_major &= 0xffff;
-               nic_major |= (last_sync_timestamp_major(efx) & 0xffff0000);
+
+               /* Medford provides 48 bits of timestamp, so we must get the top
+                * 16 bits from the timesync event state.
+                *
+                * We only have the lower 16 bits of the time now, but we do
+                * have a full resolution timestamp at some point in past. As
+                * long as the difference between the (real) now and the sync
+                * is less than 2^15, then we can reconstruct the difference
+                * between those two numbers using only the lower 16 bits of
+                * each.
+                *
+                * Put another way
+                *
+                * a - b = ((a mod k) - b) mod k
+                *
+                * when -k/2 < (a-b) < k/2. In our case k is 2^16. We know
+                * (a mod k) and b, so can calculate the delta, a - b.
+                *
+                */
+               sync_timestamp = last_sync_timestamp_major(efx);
+
+               /* Because delta is s16 this does an implicit mask down to
+                * 16 bits which is what we need, assuming
+                * MEDFORD_TX_SECS_EVENT_BITS is 16. delta is signed so that
+                * we can deal with the (unlikely) case of sync timestamps
+                * arriving from the future.
+                */
+               delta = nic_major - sync_timestamp;
+
+               /* Recover the fully specified time now, by applying the offset
+                * to the (fully specified) sync time.
+                */
+               nic_major = sync_timestamp + delta;
  
                 kt = ptp->nic_to_kernel_time(nic_major, nic_minor,
                                              correction);
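The comment in the sfc/ptp.c hunk above states the identity a - b = ((a mod k) - b) mod k for -k/2 < (a - b) < k/2, with k = 2^16. A small worked sketch of that arithmetic follows, assuming fixed-width user-space types. The helper names reconstruct_major and naive_major are hypothetical and not part of the driver; naive_major reproduces the removed mask-and-OR lines for contrast. As in the driver, the narrowing to a signed 16-bit value relies on two's-complement wraparound, which common compilers provide.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: rebuild a full 32-bit seconds value from its low
 * 16 bits and a recent full-resolution sync timestamp. */
static uint32_t reconstruct_major(uint32_t nic_low16, uint32_t sync_full)
{
	int16_t delta;

	/* Only the low 16 bits are meaningful, as the WARN_ON_ONCE checks. */
	assert((nic_low16 >> 16) == 0);

	/* Truncating the difference to 16 bits and reading it as signed
	 * gives a delta in [-32768, 32767], so a sync timestamp slightly
	 * in the future is handled as well. */
	delta = (int16_t)(uint16_t)(nic_low16 - sync_full);

	/* Apply the signed offset to the fully specified sync time. */
	return sync_full + (uint32_t)(int32_t)delta;
}

/* The removed approach: splice the top 16 bits of the sync timestamp onto
 * the low 16 bits. This goes wrong when the low half has wrapped since
 * the last sync event. */
static uint32_t naive_major(uint32_t nic_low16, uint32_t sync_full)
{
	return (nic_low16 & 0xffff) | (sync_full & 0xffff0000);
}
```

For example, with a sync value of 0x0001FFF0 and a true time of 0x00020005, only 0x0005 is available from the NIC. The signed-delta version recovers 0x00020005, while the mask-and-OR version yields 0x00010005, roughly 65536 seconds in the past.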
diff --git a/drivers/net/ethernet/socionext/sni_ave.c b/drivers/net/ethernet/socionext/sni_ave.c
index b7032422393f69bd8c3ecd97bd8bee83258fdfff..67ddf782d98a5ab1c685b4fe8f7b4ae9acac7bf9 100644
--- a/drivers/net/ethernet/socionext/sni_ave.c
+++ b/drivers/net/ethernet/socionext/sni_ave.c
@@ -1810,6 +1810,9 @@ static int ave_pro4_get_pinmode(struct ave_private *priv,
                 break;
         case PHY_INTERFACE_MODE_MII:
         case PHY_INTERFACE_MODE_RGMII:
+       case PHY_INTERFACE_MODE_RGMII_ID:
+       case PHY_INTERFACE_MODE_RGMII_RXID:
+       case PHY_INTERFACE_MODE_RGMII_TXID:
                 priv->pinmode_val = 0;
                 break;
         default:
@@ -1854,6 +1857,9 @@ static int ave_ld20_get_pinmode(struct ave_private *priv,
                 priv->pinmode_val = SG_ETPINMODE_RMII(0);
                 break;
         case PHY_INTERFACE_MODE_RGMII:
+       case PHY_INTERFACE_MODE_RGMII_ID:
+       case PHY_INTERFACE_MODE_RGMII_RXID:
+       case PHY_INTERFACE_MODE_RGMII_TXID:
                 priv->pinmode_val = 0;
                 break;
         default:
@@ -1876,6 +1882,9 @@ static int ave_pxs3_get_pinmode(struct ave_private *priv,
                 priv->pinmode_val = SG_ETPINMODE_RMII(arg);
                 break;
         case PHY_INTERFACE_MODE_RGMII:
+       case PHY_INTERFACE_MODE_RGMII_ID:
+       case PHY_INTERFACE_MODE_RGMII_RXID:
+       case PHY_INTERFACE_MODE_RGMII_TXID:
                 priv->pinmode_val = 0;
                 break;
         default:
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 5836b21edd7ed7b54604acb60f0d95501c1169d3..7da18c9afa01d2843d431f47430d58a6df171e05 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -4405,6 +4405,8 @@ static void stmmac_init_fs(struct net_device *dev)
  {
         struct stmmac_priv *priv = netdev_priv(dev);
  
+       rtnl_lock();
+
         /* Create per netdev entries */
         priv->dbgfs_dir = debugfs_create_dir(dev->name, stmmac_fs_dir);
  
@@ -4416,14 +4418,13 @@ static void stmmac_init_fs(struct net_device *dev)
         debugfs_create_file("dma_cap", 0444, priv->dbgfs_dir, dev,
                             &stmmac_dma_cap_fops);
  
-       register_netdevice_notifier(&stmmac_notifier);
+       rtnl_unlock();
  }
  
  static void stmmac_exit_fs(struct net_device *dev)
  {
         struct stmmac_priv *priv = netdev_priv(dev);
  
-       unregister_netdevice_notifier(&stmmac_notifier);
         debugfs_remove_recursive(priv->dbgfs_dir);
  }
  #endif /* CONFIG_DEBUG_FS */
@@ -4940,14 +4941,14 @@ int stmmac_dvr_remove(struct device *dev)
  
         netdev_info(priv->dev, "%s: removing driver", __func__);
  
-#ifdef CONFIG_DEBUG_FS
-       stmmac_exit_fs(ndev);
-#endif
         stmmac_stop_all_dma(priv);
  
         stmmac_mac_set(priv, priv->ioaddr, false);
         netif_carrier_off(ndev);
         unregister_netdev(ndev);
+#ifdef CONFIG_DEBUG_FS
+       stmmac_exit_fs(ndev);
+#endif
         phylink_destroy(priv->phylink);
         if (priv->plat->stmmac_rst)
                 reset_control_assert(priv->plat->stmmac_rst);
@@ -5166,6 +5167,7 @@ static int __init stmmac_init(void)
         /* Create debugfs main directory if it doesn't exist yet */
         if (!stmmac_fs_dir)
                 stmmac_fs_dir = debugfs_create_dir(STMMAC_RESOURCE_NAME, NULL);
+       register_netdevice_notifier(&stmmac_notifier);
  #endif
  
         return 0;
@@ -5174,6 +5176,7 @@ static int __init stmmac_init(void)
  static void __exit stmmac_exit(void)
  {
  #ifdef CONFIG_DEBUG_FS
+       unregister_netdevice_notifier(&stmmac_notifier);
         debugfs_remove_recursive(stmmac_fs_dir);
  #endif
  }
diff --git a/drivers/net/ethernet/sun/sunvnet_common.c b/drivers/net/ethernet/sun/sunvnet_common.c
index c23ce838ff631280e981c9e0ab9d0bfce16c3ea9..8dc6c9ff22e1f289b5422ab1199c0867a6d3445b 100644
--- a/drivers/net/ethernet/sun/sunvnet_common.c
+++ b/drivers/net/ethernet/sun/sunvnet_common.c
@@ -1350,27 +1350,12 @@ sunvnet_start_xmit_common(struct sk_buff *skb, struct net_device *dev,
                 if (vio_version_after_eq(&port->vio, 1, 3))
                         localmtu -= VLAN_HLEN;
  
-               if (skb->protocol == htons(ETH_P_IP)) {
-                       struct flowi4 fl4;
-                       struct rtable *rt = NULL;
-
-                       memset(&fl4, 0, sizeof(fl4));
-                       fl4.flowi4_oif = dev->ifindex;
-                       fl4.flowi4_tos = RT_TOS(ip_hdr(skb)->tos);
-                       fl4.daddr = ip_hdr(skb)->daddr;
-                       fl4.saddr = ip_hdr(skb)->saddr;
-
-                       rt = ip_route_output_key(dev_net(dev), &fl4);
-                       if (!IS_ERR(rt)) {
-                               skb_dst_set(skb, &rt->dst);
-                               icmp_send(skb, ICMP_DEST_UNREACH,
-                                         ICMP_FRAG_NEEDED,
-                                         htonl(localmtu));
-                       }
-               }
+               if (skb->protocol == htons(ETH_P_IP))
+                       icmp_ndo_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
+                                     htonl(localmtu));
  #if IS_ENABLED(CONFIG_IPV6)
                 else if (skb->protocol == htons(ETH_P_IPV6))
-                       icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, localmtu);
+                       icmpv6_ndo_send(skb, ICMPV6_PKT_TOOBIG, 0, localmtu);
  #endif
                 goto out_dropped;
         }
diff --git a/drivers/net/ethernet/xilinx/ll_temac.h b/drivers/net/ethernet/xilinx/ll_temac.h
index 276292bca334d853aace75bbefcfc95477e0491d..53fb8141f1a673e7e2be3a25d0873d8d26451eab 100644
--- a/drivers/net/ethernet/xilinx/ll_temac.h
+++ b/drivers/net/ethernet/xilinx/ll_temac.h
@@ -375,10 +375,14 @@ struct temac_local {
         int tx_bd_next;
         int tx_bd_tail;
         int rx_bd_ci;
+       int rx_bd_tail;
  
         /* DMA channel control setup */
         u32 tx_chnl_ctrl;
         u32 rx_chnl_ctrl;
+       u8 coalesce_count_rx;
+
+       struct delayed_work restart_work;
  };
  
  /* Wrappers for temac_ior()/temac_iow() function pointers above */
diff --git a/drivers/net/ethernet/xilinx/ll_temac_main.c b/drivers/net/ethernet/xilinx/ll_temac_main.c
index 6f11f52c9a9ed31a41722e5737ab546b64400517..9461acec6f70f8a0e9d7de88444dd573cc0c8551 100644
--- a/drivers/net/ethernet/xilinx/ll_temac_main.c
+++ b/drivers/net/ethernet/xilinx/ll_temac_main.c
@@ -51,6 +51,7 @@
  #include <linux/ip.h>
  #include <linux/slab.h>
  #include <linux/interrupt.h>
+#include <linux/workqueue.h>
  #include <linux/dma-mapping.h>
  #include <linux/processor.h>
  #include <linux/platform_data/xilinx-ll-temac.h>
@@ -367,6 +368,8 @@ static int temac_dma_bd_init(struct net_device *ndev)
                 skb_dma_addr = dma_map_single(ndev->dev.parent, skb->data,
                                               XTE_MAX_JUMBO_FRAME_SIZE,
                                               DMA_FROM_DEVICE);
+               if (dma_mapping_error(ndev->dev.parent, skb_dma_addr))
+                       goto out;
                 lp->rx_bd_v[i].phys = cpu_to_be32(skb_dma_addr);
                 lp->rx_bd_v[i].len = cpu_to_be32(XTE_MAX_JUMBO_FRAME_SIZE);
                 lp->rx_bd_v[i].app0 = cpu_to_be32(STS_CTRL_APP0_IRQONEND);
@@ -387,12 +390,13 @@ static int temac_dma_bd_init(struct net_device *ndev)
         lp->tx_bd_next = 0;
         lp->tx_bd_tail = 0;
         lp->rx_bd_ci = 0;
+       lp->rx_bd_tail = RX_BD_NUM - 1;
  
         /* Enable RX DMA transfers */
         wmb();
         lp->dma_out(lp, RX_CURDESC_PTR,  lp->rx_bd_p);
         lp->dma_out(lp, RX_TAILDESC_PTR,
-                      lp->rx_bd_p + (sizeof(*lp->rx_bd_v) * (RX_BD_NUM - 1)));
+                      lp->rx_bd_p + (sizeof(*lp->rx_bd_v) * lp->rx_bd_tail));
  
         /* Prepare for TX DMA transfer */
         lp->dma_out(lp, TX_CURDESC_PTR, lp->tx_bd_p);
@@ -788,6 +792,9 @@ static void temac_start_xmit_done(struct net_device *ndev)
                 stat = be32_to_cpu(cur_p->app0);
         }
  
+       /* Matches barrier in temac_start_xmit */
+       smp_mb();
+
         netif_wake_queue(ndev);
  }
  
@@ -830,9 +837,19 @@ temac_start_xmit(struct sk_buff *skb, struct net_device *ndev)
         cur_p = &lp->tx_bd_v[lp->tx_bd_tail];
  
         if (temac_check_tx_bd_space(lp, num_frag + 1)) {
-               if (!netif_queue_stopped(ndev))
-                       netif_stop_queue(ndev);
-               return NETDEV_TX_BUSY;
+               if (netif_queue_stopped(ndev))
+                       return NETDEV_TX_BUSY;
+
+               netif_stop_queue(ndev);
+
+               /* Matches barrier in temac_start_xmit_done */
+               smp_mb();
+
+               /* Space might have just been freed - check again */
+               if (temac_check_tx_bd_space(lp, num_frag))
+                       return NETDEV_TX_BUSY;
+
+               netif_wake_queue(ndev);
         }
  
         cur_p->app0 = 0;
@@ -850,12 +867,16 @@ temac_start_xmit(struct sk_buff *skb, struct net_device *ndev)
         skb_dma_addr = dma_map_single(ndev->dev.parent, skb->data,
                                       skb_headlen(skb), DMA_TO_DEVICE);
         cur_p->len = cpu_to_be32(skb_headlen(skb));
+       if (WARN_ON_ONCE(dma_mapping_error(ndev->dev.parent, skb_dma_addr))) {
+               dev_kfree_skb_any(skb);
+               ndev->stats.tx_dropped++;
+               return NETDEV_TX_OK;
+       }
         cur_p->phys = cpu_to_be32(skb_dma_addr);
         ptr_to_txbd((void *)skb, cur_p);
  
         for (ii = 0; ii < num_frag; ii++) {
-               lp->tx_bd_tail++;
-               if (lp->tx_bd_tail >= TX_BD_NUM)
+               if (++lp->tx_bd_tail >= TX_BD_NUM)
                         lp->tx_bd_tail = 0;
  
                 cur_p = &lp->tx_bd_v[lp->tx_bd_tail];
@@ -863,6 +884,27 @@ temac_start_xmit(struct sk_buff *skb, struct net_device *ndev)
                                               skb_frag_address(frag),
                                               skb_frag_size(frag),
                                               DMA_TO_DEVICE);
+               if (dma_mapping_error(ndev->dev.parent, skb_dma_addr)) {
+                       if (--lp->tx_bd_tail < 0)
+                               lp->tx_bd_tail = TX_BD_NUM - 1;
+                       cur_p = &lp->tx_bd_v[lp->tx_bd_tail];
+                       while (--ii >= 0) {
+                               --frag;
+                               dma_unmap_single(ndev->dev.parent,
+                                                be32_to_cpu(cur_p->phys),
+                                                skb_frag_size(frag),
+                                                DMA_TO_DEVICE);
+                               if (--lp->tx_bd_tail < 0)
+                                       lp->tx_bd_tail = TX_BD_NUM - 1;
+                               cur_p = &lp->tx_bd_v[lp->tx_bd_tail];
+                       }
+                       dma_unmap_single(ndev->dev.parent,
+                                        be32_to_cpu(cur_p->phys),
+                                        skb_headlen(skb), DMA_TO_DEVICE);
+                       dev_kfree_skb_any(skb);
+                       ndev->stats.tx_dropped++;
+                       return NETDEV_TX_OK;
+               }
                 cur_p->phys = cpu_to_be32(skb_dma_addr);
                 cur_p->len = cpu_to_be32(skb_frag_size(frag));
                 cur_p->app0 = 0;
@@ -884,31 +926,56 @@ temac_start_xmit(struct sk_buff *skb, struct net_device *ndev)
         return NETDEV_TX_OK;
  }
  
+static int ll_temac_recv_buffers_available(struct temac_local *lp)
+{
+       int available;
+
+       if (!lp->rx_skb[lp->rx_bd_ci])
+               return 0;
+       available = 1 + lp->rx_bd_tail - lp->rx_bd_ci;
+       if (available <= 0)
+               available += RX_BD_NUM;
+       return available;
+}
  
  static void ll_temac_recv(struct net_device *ndev)
  {
         struct temac_local *lp = netdev_priv(ndev);
-       struct sk_buff *skb, *new_skb;
-       unsigned int bdstat;
-       struct cdmac_bd *cur_p;
-       dma_addr_t tail_p, skb_dma_addr;
-       int length;
         unsigned long flags;
+       int rx_bd;
+       bool update_tail = false;
  
         spin_lock_irqsave(&lp->rx_lock, flags);
  
-       tail_p = lp->rx_bd_p + sizeof(*lp->rx_bd_v) * lp->rx_bd_ci;
-       cur_p = &lp->rx_bd_v[lp->rx_bd_ci];
-
-       bdstat = be32_to_cpu(cur_p->app0);
-       while ((bdstat & STS_CTRL_APP0_CMPLT)) {
+       /* Process all received buffers, passing them on network
+        * stack.  After this, the buffer descriptors will be in an
+        * un-allocated stage, where no skb is allocated for it, and
+        * they are therefore not available for TEMAC/DMA.
+        */
+       do {
+               struct cdmac_bd *bd = &lp->rx_bd_v[lp->rx_bd_ci];
+               struct sk_buff *skb = lp->rx_skb[lp->rx_bd_ci];
+               unsigned int bdstat = be32_to_cpu(bd->app0);
+               int length;
+
+               /* While this should not normally happen, we can end
+                * here when GFP_ATOMIC allocations fail, and we
+                * therefore have un-allocated buffers.
+                */
+               if (!skb)
+                       break;
  
-               skb = lp->rx_skb[lp->rx_bd_ci];
-               length = be32_to_cpu(cur_p->app4) & 0x3FFF;
+               /* Loop over all completed buffer descriptors */
+               if (!(bdstat & STS_CTRL_APP0_CMPLT))
+                       break;
  
-               dma_unmap_single(ndev->dev.parent, be32_to_cpu(cur_p->phys),
+               dma_unmap_single(ndev->dev.parent, be32_to_cpu(bd->phys),
                                  XTE_MAX_JUMBO_FRAME_SIZE, DMA_FROM_DEVICE);
+               /* The buffer is not valid for DMA anymore */
+               bd->phys = 0;
+               bd->len = 0;
  
+               length = be32_to_cpu(bd->app4) & 0x3FFF;
                 skb_put(skb, length);
                 skb->protocol = eth_type_trans(skb, ndev);
                 skb_checksum_none_assert(skb);
@@ -923,43 +990,102 @@ static void ll_temac_recv(struct net_device *ndev)
                          * (back) for proper IP checksum byte order
                          * (be16).
                          */
-                       skb->csum = htons(be32_to_cpu(cur_p->app3) & 0xFFFF);
+                       skb->csum = htons(be32_to_cpu(bd->app3) & 0xFFFF);
                         skb->ip_summed = CHECKSUM_COMPLETE;
                 }
  
                 if (!skb_defer_rx_timestamp(skb))
                         netif_rx(skb);
+               /* The skb buffer is now owned by network stack above */
+               lp->rx_skb[lp->rx_bd_ci] = NULL;
  
                 ndev->stats.rx_packets++;
                 ndev->stats.rx_bytes += length;
  
-               new_skb = netdev_alloc_skb_ip_align(ndev,
-                                               XTE_MAX_JUMBO_FRAME_SIZE);
-               if (!new_skb) {
-                       spin_unlock_irqrestore(&lp->rx_lock, flags);
-                       return;
+               rx_bd = lp->rx_bd_ci;
+               if (++lp->rx_bd_ci >= RX_BD_NUM)
+                       lp->rx_bd_ci = 0;
+       } while (rx_bd != lp->rx_bd_tail);
+
+       /* DMA operations will halt when the last buffer descriptor is
+        * processed (ie. the one pointed to by RX_TAILDESC_PTR).
+        * When that happens, no more interrupt events will be
+        * generated.  No IRQ_COAL or IRQ_DLY, and not even an
+        * IRQ_ERR.  To avoid stalling, we schedule a delayed work
+        * when there is a potential risk of that happening.  The work
+        * will call this function, and thus re-schedule itself until
+        * enough buffers are available again.
+        */
+       if (ll_temac_recv_buffers_available(lp) < lp->coalesce_count_rx)
+               schedule_delayed_work(&lp->restart_work, HZ / 1000);
+
+       /* Allocate new buffers for those buffer descriptors that were
+        * passed to network stack.  Note that GFP_ATOMIC allocations
+        * can fail (e.g. when a larger burst of GFP_ATOMIC
+        * allocations occurs), so while we try to allocate all
+        * buffers in the same interrupt where they were processed, we
+        * continue with what we could get in case of allocation
+        * failure.  Allocation of remaining buffers will be retried
+        * in following calls.
+        */
+       while (1) {
+               struct sk_buff *skb;
+               struct cdmac_bd *bd;
+               dma_addr_t skb_dma_addr;
+
+               rx_bd = lp->rx_bd_tail + 1;
+               if (rx_bd >= RX_BD_NUM)
+                       rx_bd = 0;
+               bd = &lp->rx_bd_v[rx_bd];
+
+               if (bd->phys)
+                       break;  /* All skb's allocated */
+
+               skb = netdev_alloc_skb_ip_align(ndev, XTE_MAX_JUMBO_FRAME_SIZE);
+               if (!skb) {
+                       dev_warn(&ndev->dev, "skb alloc failed\n");
+                       break;
                 }
  
-               cur_p->app0 = cpu_to_be32(STS_CTRL_APP0_IRQONEND);
-               skb_dma_addr = dma_map_single(ndev->dev.parent, new_skb->data,
+               skb_dma_addr = dma_map_single(ndev->dev.parent, skb->data,
                                               XTE_MAX_JUMBO_FRAME_SIZE,
                                               DMA_FROM_DEVICE);
-               cur_p->phys = cpu_to_be32(skb_dma_addr);
-               cur_p->len = cpu_to_be32(XTE_MAX_JUMBO_FRAME_SIZE);
-               lp->rx_skb[lp->rx_bd_ci] = new_skb;
+               if (WARN_ON_ONCE(dma_mapping_error(ndev->dev.parent,
+                                                  skb_dma_addr))) {
+                       dev_kfree_skb_any(skb);
+                       break;
+               }
  
-               lp->rx_bd_ci++;
-               if (lp->rx_bd_ci >= RX_BD_NUM)
-                       lp->rx_bd_ci = 0;
+               bd->phys = cpu_to_be32(skb_dma_addr);
+               bd->len = cpu_to_be32(XTE_MAX_JUMBO_FRAME_SIZE);
+               bd->app0 = cpu_to_be32(STS_CTRL_APP0_IRQONEND);
+               lp->rx_skb[rx_bd] = skb;
  
-               cur_p = &lp->rx_bd_v[lp->rx_bd_ci];
-               bdstat = be32_to_cpu(cur_p->app0);
+               lp->rx_bd_tail = rx_bd;
+               update_tail = true;
+       }
+
+       /* Move tail pointer when buffers have been allocated */
+       if (update_tail) {
+               lp->dma_out(lp, RX_TAILDESC_PTR,
+                       lp->rx_bd_p + sizeof(*lp->rx_bd_v) * lp->rx_bd_tail);
         }
-       lp->dma_out(lp, RX_TAILDESC_PTR, tail_p);
  
         spin_unlock_irqrestore(&lp->rx_lock, flags);
  }
  
+/* Function scheduled to ensure a restart in case of DMA halt
+ * condition caused by running out of buffer descriptors.
+ */
+static void ll_temac_restart_work_func(struct work_struct *work)
+{
+       struct temac_local *lp = container_of(work, struct temac_local,
+                                             restart_work.work);
+       struct net_device *ndev = lp->ndev;
+
+       ll_temac_recv(ndev);
+}
+
  static irqreturn_t ll_temac_tx_irq(int irq, void *_ndev)
  {
         struct net_device *ndev = _ndev;
@@ -1052,6 +1178,8 @@ static int temac_stop(struct net_device *ndev)
  
         dev_dbg(&ndev->dev, "temac_close()\n");
  
+       cancel_delayed_work_sync(&lp->restart_work);
+
         free_irq(lp->tx_irq, ndev);
         free_irq(lp->rx_irq, ndev);
  
@@ -1173,6 +1301,7 @@ static int temac_probe(struct platform_device *pdev)
         lp->dev = &pdev->dev;
         lp->options = XTE_OPTION_DEFAULTS;
         spin_lock_init(&lp->rx_lock);
+       INIT_DELAYED_WORK(&lp->restart_work, ll_temac_restart_work_func);
  
         /* Setup mutex for synchronization of indirect register access */
         if (pdata) {
@@ -1279,6 +1408,7 @@ static int temac_probe(struct platform_device *pdev)
                  */
                 lp->tx_chnl_ctrl = 0x10220000;
                 lp->rx_chnl_ctrl = 0xff070000;
+               lp->coalesce_count_rx = 0x07;
  
                 /* Finished with the DMA node; drop the reference */
                 of_node_put(dma_np);
@@ -1310,11 +1440,14 @@ static int temac_probe(struct platform_device *pdev)
                                 (pdata->tx_irq_count << 16);
                 else
                         lp->tx_chnl_ctrl = 0x10220000;
-               if (pdata->rx_irq_timeout || pdata->rx_irq_count)
+               if (pdata->rx_irq_timeout || pdata->rx_irq_count) {
                         lp->rx_chnl_ctrl = (pdata->rx_irq_timeout << 24) |
                                 (pdata->rx_irq_count << 16);
-               else
+                       lp->coalesce_count_rx = pdata->rx_irq_count;
+               } else {
                         lp->rx_chnl_ctrl = 0xff070000;
+                       lp->coalesce_count_rx = 0x07;
+               }
         }
  
         /* Error handle returned DMA RX and TX interrupts */
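The buffer accounting added in ll_temac_recv_buffers_available() above can be modelled outside the kernel. The following is a minimal sketch, not the driver's code: the function and parameter names are hypothetical, and the ring size is passed in rather than taken from RX_BD_NUM. It shows the inclusive, wrap-around distance from the consumer index to the tail index.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-alone model of the count done by ll_temac_recv_buffers_available().
 * All names are hypothetical; the driver reads these values from
 * struct temac_local and uses RX_BD_NUM as the ring size.
 */
static int rx_buffers_available(int rx_bd_tail, int rx_bd_ci,
                                int ring_size, bool ci_has_skb)
{
        int available;

        assert(ring_size > 0);
        assert(rx_bd_tail >= 0 && rx_bd_tail < ring_size);
        assert(rx_bd_ci >= 0 && rx_bd_ci < ring_size);

        /* No skb at the consumer index: no buffer is ready for DMA */
        if (!ci_has_skb)
                return 0;

        /* Inclusive distance from consumer index to tail, modulo ring size */
        available = 1 + rx_bd_tail - rx_bd_ci;
        if (available <= 0)
                available += ring_size;
        return available;
}
```

In the patch, a result below lp->coalesce_count_rx is what triggers the delayed restart work, since the DMA engine may then halt before raising another interrupt.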
diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c
index af07ea760b359ba99a550e733f8327a6144b5957..672cd2caf2fbec6f1020f4dc30adfb8d96a8f917 100644
--- a/drivers/net/gtp.c
+++ b/drivers/net/gtp.c
@@ -546,8 +546,8 @@ static int gtp_build_skb_ip4(struct sk_buff *skb, struct net_device *dev,
             mtu < ntohs(iph->tot_len)) {
                 netdev_dbg(dev, "packet too big, fragmentation needed\n");
                 memset(IPCB(skb), 0, sizeof(*IPCB(skb)));
-               icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
-                         htonl(mtu));
+               icmp_ndo_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
+                             htonl(mtu));
                 goto err_rt;
         }
  
diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index ae3f3084c2ed29bd19522f27f51283275310beaa..1b320bcf150a4ddc91204e21b3bac4a1140cad5e 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -99,7 +99,7 @@ static struct netvsc_device *alloc_net_device(void)
  
         init_waitqueue_head(&net_device->wait_drain);
         net_device->destroy = false;
-       net_device->tx_disable = false;
+       net_device->tx_disable = true;
  
         net_device->max_pkt = RNDIS_MAX_PKT_DEFAULT;
         net_device->pkt_align = RNDIS_PKT_ALIGN_DEFAULT;
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index 65e12cb07f453fb0472d5bb527748d6ee8a18f92..2c0a24c606fc7dc49cbf247d22e8082f2cad2013 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -1068,6 +1068,7 @@ static int netvsc_attach(struct net_device *ndev,
         }
  
         /* In any case device is now ready */
+       nvdev->tx_disable = false;
         netif_device_attach(ndev);
  
         /* Note: enable and attach happen when sub-channels setup */
@@ -2476,6 +2477,8 @@ static int netvsc_probe(struct hv_device *dev,
         else
                 net->max_mtu = ETH_DATA_LEN;
  
+       nvdev->tx_disable = false;
+
         ret = register_netdevice(net);
         if (ret != 0) {
                 pr_err("Unable to register netdev.\n");
diff --git a/drivers/net/phy/broadcom.c b/drivers/net/phy/broadcom.c
index 7d68b28bb8938de99fb2f979cd33df9975e82903..a62229a8b1a41eca2de85cf2e3574ea31953eb95 100644
--- a/drivers/net/phy/broadcom.c
+++ b/drivers/net/phy/broadcom.c
@@ -410,7 +410,7 @@ static int bcm5481_config_aneg(struct phy_device *phydev)
         struct device_node *np = phydev->mdio.dev.of_node;
         int ret;
  
-       /* Aneg firsly. */
+       /* Aneg firstly. */
         ret = genphy_config_aneg(phydev);
  
         /* Then we can set up the delay. */
@@ -463,7 +463,7 @@ static int bcm54616s_config_aneg(struct phy_device *phydev)
  {
         int ret;
  
-       /* Aneg firsly. */
+       /* Aneg firstly. */
         if (phydev->dev_flags & PHY_BCM_FLAGS_MODE_1000BX)
                 ret = genphy_c37_config_aneg(phydev);
         else
diff --git a/drivers/net/phy/marvell.c b/drivers/net/phy/marvell.c
index 28e33ece4ce17aa132a214d2be32761bf3827cb4..9a8badafea8ac95e123d14694437f60bc44eb453 100644
--- a/drivers/net/phy/marvell.c
+++ b/drivers/net/phy/marvell.c
@@ -1306,6 +1306,9 @@ static int marvell_read_status_page_an(struct phy_device *phydev,
                 }
         }
  
+       if (!(status & MII_M1011_PHY_STATUS_RESOLVED))
+               return 0;
+
         if (status & MII_M1011_PHY_STATUS_FULLDUPLEX)
                 phydev->duplex = DUPLEX_FULL;
         else
@@ -1365,6 +1368,8 @@ static int marvell_read_status_page(struct phy_device *phydev, int page)
         linkmode_zero(phydev->lp_advertising);
         phydev->pause = 0;
         phydev->asym_pause = 0;
+       phydev->speed = SPEED_UNKNOWN;
+       phydev->duplex = DUPLEX_UNKNOWN;
  
         if (phydev->autoneg == AUTONEG_ENABLE)
                 err = marvell_read_status_page_an(phydev, fiber, status);
diff --git a/drivers/net/phy/mdio-bcm-iproc.c b/drivers/net/phy/mdio-bcm-iproc.c
index 7e9975d2506691ee77f319a6fb0741596a8e1bde..f1ded03f0229b1460ba992166434f3290ec23c37 100644
--- a/drivers/net/phy/mdio-bcm-iproc.c
+++ b/drivers/net/phy/mdio-bcm-iproc.c
@@ -178,6 +178,23 @@ static int iproc_mdio_remove(struct platform_device *pdev)
         return 0;
  }
  
+#ifdef CONFIG_PM_SLEEP
+int iproc_mdio_resume(struct device *dev)
+{
+       struct platform_device *pdev = to_platform_device(dev);
+       struct iproc_mdio_priv *priv = platform_get_drvdata(pdev);
+
+       /* restore the mii clock configuration */
+       iproc_mdio_config_clk(priv->base);
+
+       return 0;
+}
+
+static const struct dev_pm_ops iproc_mdio_pm_ops = {
+       .resume = iproc_mdio_resume
+};
+#endif /* CONFIG_PM_SLEEP */
+
  static const struct of_device_id iproc_mdio_of_match[] = {
         { .compatible = "brcm,iproc-mdio", },
         { /* sentinel */ },
@@ -188,6 +205,9 @@ static struct platform_driver iproc_mdio_driver = {
         .driver = {
                 .name = "iproc-mdio",
                 .of_match_table = iproc_mdio_of_match,
+#ifdef CONFIG_PM_SLEEP
+               .pm = &iproc_mdio_pm_ops,
+#endif
         },
         .probe = iproc_mdio_probe,
         .remove = iproc_mdio_remove,
diff --git a/drivers/net/phy/mscc.c b/drivers/net/phy/mscc.c
index 937ac7da278944c3d9c39cd5b16a879ea116b2b6..f686f40f6bdcceb1807275f02dc57abd84c666a4 100644
--- a/drivers/net/phy/mscc.c
+++ b/drivers/net/phy/mscc.c
@@ -345,11 +345,11 @@ enum macsec_bank {
                                 BIT(VSC8531_FORCE_LED_OFF) | \
                                 BIT(VSC8531_FORCE_LED_ON))
  
-#define MSCC_VSC8584_REVB_INT8051_FW           "mscc_vsc8584_revb_int8051_fb48.bin"
+#define MSCC_VSC8584_REVB_INT8051_FW           "microchip/mscc_vsc8584_revb_int8051_fb48.bin"
  #define MSCC_VSC8584_REVB_INT8051_FW_START_ADDR        0xe800
  #define MSCC_VSC8584_REVB_INT8051_FW_CRC       0xfb48
  
-#define MSCC_VSC8574_REVB_INT8051_FW           "mscc_vsc8574_revb_int8051_29e8.bin"
+#define MSCC_VSC8574_REVB_INT8051_FW           "microchip/mscc_vsc8574_revb_int8051_29e8.bin"
  #define MSCC_VSC8574_REVB_INT8051_FW_START_ADDR        0x4000
  #define MSCC_VSC8574_REVB_INT8051_FW_CRC       0x29e8
  
diff --git a/drivers/net/phy/phy-c45.c b/drivers/net/phy/phy-c45.c
index a1caeee1223617dab21b488858b14b5d0ef2aa2a..dd2e23fb67c068674144f5c1afc97ca682bb6b8f 100644
--- a/drivers/net/phy/phy-c45.c
+++ b/drivers/net/phy/phy-c45.c
@@ -167,7 +167,7 @@ EXPORT_SYMBOL_GPL(genphy_c45_restart_aneg);
   */
  int genphy_c45_check_and_restart_aneg(struct phy_device *phydev, bool restart)
  {
-       int ret = 0;
+       int ret;
  
         if (!restart) {
                 /* Configure and restart aneg if it wasn't set before */
@@ -180,9 +180,9 @@ int genphy_c45_check_and_restart_aneg(struct phy_device *phydev, bool restart)
         }
  
         if (restart)
-               ret = genphy_c45_restart_aneg(phydev);
+               return genphy_c45_restart_aneg(phydev);
  
-       return ret;
+       return 0;
  }
  EXPORT_SYMBOL_GPL(genphy_c45_check_and_restart_aneg);
  
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 6a5056e0ae77578cc44006eaec2fef72cf646717..c8b0c34030d32cdf7cac3acfdc50903da5eecf0f 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -247,7 +247,7 @@ static bool mdio_bus_phy_may_suspend(struct phy_device *phydev)
          * MDIO bus driver and clock gated at this point.
          */
         if (!netdev)
-               return !phydev->suspended;
+               goto out;
  
         if (netdev->wol_enabled)
                 return false;
@@ -267,7 +267,8 @@ static bool mdio_bus_phy_may_suspend(struct phy_device *phydev)
         if (device_may_wakeup(&netdev->dev))
                 return false;
  
-       return true;
+out:
+       return !phydev->suspended;
  }
  
  static int mdio_bus_phy_suspend(struct device *dev)
@@ -1792,7 +1793,7 @@ EXPORT_SYMBOL(genphy_restart_aneg);
   */
  int genphy_check_and_restart_aneg(struct phy_device *phydev, bool restart)
  {
-       int ret = 0;
+       int ret;
  
         if (!restart) {
                 /* Advertisement hasn't changed, but maybe aneg was never on to
@@ -1807,9 +1808,9 @@ int genphy_check_and_restart_aneg(struct phy_device *phydev, bool restart)
         }
  
         if (restart)
-               ret = genphy_restart_aneg(phydev);
+               return genphy_restart_aneg(phydev);
  
-       return ret;
+       return 0;
  }
  EXPORT_SYMBOL(genphy_check_and_restart_aneg);
  
diff --git a/drivers/net/slip/slip.c b/drivers/net/slip/slip.c
index 6f4d7ba8b1094eac73b0ea503bee81d9d976178c..babb01888b786c2e77c9d7b4844f44d31b9c7bce 100644
--- a/drivers/net/slip/slip.c
+++ b/drivers/net/slip/slip.c
@@ -863,7 +863,10 @@ err_free_chan:
         tty->disc_data = NULL;
         clear_bit(SLF_INUSE, &sl->flags);
         sl_free_netdev(sl->dev);
+       /* do not call free_netdev before rtnl_unlock */
+       rtnl_unlock();
         free_netdev(sl->dev);
+       return err;
  
  err_exit:
         rtnl_unlock();
diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c
index 9485c8d1de8a37c78b210dd2a9b41aea5c1c2eb7..5754bb6ca0eeccc129b16b3b9ff433d067d00e34 100644
--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -61,7 +61,6 @@ enum qmi_wwan_flags {
  
  enum qmi_wwan_quirks {
         QMI_WWAN_QUIRK_DTR = 1 << 0,    /* needs "set DTR" request */
-       QMI_WWAN_QUIRK_QUECTEL_DYNCFG = 1 << 1, /* check num. endpoints */
  };
  
  struct qmimux_hdr {
@@ -338,6 +337,9 @@ static void qmi_wwan_netdev_setup(struct net_device *net)
                 netdev_dbg(net, "mode: raw IP\n");
         } else if (!net->header_ops) { /* don't bother if already set */
                 ether_setup(net);
+               /* Restoring min/max mtu values set originally by usbnet */
+               net->min_mtu = 0;
+               net->max_mtu = ETH_MAX_MTU;
                 clear_bit(EVENT_NO_IP_ALIGN, &dev->flags);
                 netdev_dbg(net, "mode: Ethernet\n");
         }
@@ -916,16 +918,6 @@ static const struct driver_info    qmi_wwan_info_quirk_dtr = {
         .data           = QMI_WWAN_QUIRK_DTR,
  };
  
-static const struct driver_info        qmi_wwan_info_quirk_quectel_dyncfg = {
-       .description    = "WWAN/QMI device",
-       .flags          = FLAG_WWAN | FLAG_SEND_ZLP,
-       .bind           = qmi_wwan_bind,
-       .unbind         = qmi_wwan_unbind,
-       .manage_power   = qmi_wwan_manage_power,
-       .rx_fixup       = qmi_wwan_rx_fixup,
-       .data           = QMI_WWAN_QUIRK_DTR | QMI_WWAN_QUIRK_QUECTEL_DYNCFG,
-};
-
  #define HUAWEI_VENDOR_ID       0x12D1
  
  /* map QMI/wwan function by a fixed interface number */
@@ -946,14 +938,18 @@ static const struct driver_info   qmi_wwan_info_quirk_quectel_dyncfg = {
  #define QMI_GOBI_DEVICE(vend, prod) \
         QMI_FIXED_INTF(vend, prod, 0)
  
-/* Quectel does not use fixed interface numbers on at least some of their
- * devices. We need to check the number of endpoints to ensure that we bind to
- * the correct interface.
+/* Many devices have QMI and DIAG functions which are distinguishable
+ * from other vendor specific functions by class, subclass and
+ * protocol all being 0xff. The DIAG function has exactly 2 endpoints
+ * and is silently rejected when probed.
+ *
+ * This makes it possible to match dynamically numbered QMI functions
+ * as seen on e.g. many Quectel modems.
   */
-#define QMI_QUIRK_QUECTEL_DYNCFG(vend, prod) \
+#define QMI_MATCH_FF_FF_FF(vend, prod) \
         USB_DEVICE_AND_INTERFACE_INFO(vend, prod, USB_CLASS_VENDOR_SPEC, \
                                       USB_SUBCLASS_VENDOR_SPEC, 0xff), \
-       .driver_info = (unsigned long)&qmi_wwan_info_quirk_quectel_dyncfg
+       .driver_info = (unsigned long)&qmi_wwan_info_quirk_dtr
  
  static const struct usb_device_id products[] = {
         /* 1. CDC ECM like devices match on the control interface */
@@ -1059,10 +1055,10 @@ static const struct usb_device_id products[] = {
                 USB_DEVICE_AND_INTERFACE_INFO(0x03f0, 0x581d, USB_CLASS_VENDOR_SPEC, 1, 7),
                 .driver_info = (unsigned long)&qmi_wwan_info,
         },
-       {QMI_QUIRK_QUECTEL_DYNCFG(0x2c7c, 0x0125)},     /* Quectel EC25, EC20 R2.0  Mini PCIe */
-       {QMI_QUIRK_QUECTEL_DYNCFG(0x2c7c, 0x0306)},     /* Quectel EP06/EG06/EM06 */
-       {QMI_QUIRK_QUECTEL_DYNCFG(0x2c7c, 0x0512)},     /* Quectel EG12/EM12 */
-       {QMI_QUIRK_QUECTEL_DYNCFG(0x2c7c, 0x0800)},     /* Quectel RM500Q-GL */
+       {QMI_MATCH_FF_FF_FF(0x2c7c, 0x0125)},   /* Quectel EC25, EC20 R2.0  Mini PCIe */
+       {QMI_MATCH_FF_FF_FF(0x2c7c, 0x0306)},   /* Quectel EP06/EG06/EM06 */
+       {QMI_MATCH_FF_FF_FF(0x2c7c, 0x0512)},   /* Quectel EG12/EM12 */
+       {QMI_MATCH_FF_FF_FF(0x2c7c, 0x0800)},   /* Quectel RM500Q-GL */
  
         /* 3. Combined interface devices matching on interface number */
         {QMI_FIXED_INTF(0x0408, 0xea42, 4)},    /* Yota / Megafon M100-1 */
@@ -1363,6 +1359,7 @@ static const struct usb_device_id products[] = {
         {QMI_FIXED_INTF(0x413c, 0x81b6, 8)},    /* Dell Wireless 5811e */
         {QMI_FIXED_INTF(0x413c, 0x81b6, 10)},   /* Dell Wireless 5811e */
         {QMI_FIXED_INTF(0x413c, 0x81d7, 0)},    /* Dell Wireless 5821e */
+       {QMI_FIXED_INTF(0x413c, 0x81d7, 1)},    /* Dell Wireless 5821e preproduction config */
         {QMI_FIXED_INTF(0x413c, 0x81e0, 0)},    /* Dell Wireless 5821e with eSIM support*/
         {QMI_FIXED_INTF(0x03f0, 0x4e1d, 8)},    /* HP lt4111 LTE/EV-DO/HSPA+ Gobi 4G Module */
         {QMI_FIXED_INTF(0x03f0, 0x9d1d, 1)},    /* HP lt4120 Snapdragon X5 LTE */
@@ -1454,7 +1451,6 @@ static int qmi_wwan_probe(struct usb_interface *intf,
  {
         struct usb_device_id *id = (struct usb_device_id *)prod;
         struct usb_interface_descriptor *desc = &intf->cur_altsetting->desc;
-       const struct driver_info *info;
  
         /* Workaround to enable dynamic IDs.  This disables usbnet
          * blacklisting functionality.  Which, if required, can be
@@ -1490,12 +1486,8 @@ static int qmi_wwan_probe(struct usb_interface *intf,
          * different. Ignore the current interface if the number of endpoints
          * equals the number for the diag interface (two).
          */
-       info = (void *)id->driver_info;
-
-       if (info->data & QMI_WWAN_QUIRK_QUECTEL_DYNCFG) {
-               if (desc->bNumEndpoints == 2)
-                       return -ENODEV;
-       }
+       if (desc->bNumEndpoints == 2)
+               return -ENODEV;
  
         return usbnet_probe(intf, id);
  }
diff --git a/drivers/net/wireguard/device.c b/drivers/net/wireguard/device.c
index 16b19824b9ad0385404fb6f084407df0a4033518..cdc96968b0f4ba93c5e078f5ec46ace576f9d214 100644
--- a/drivers/net/wireguard/device.c
+++ b/drivers/net/wireguard/device.c
@@ -203,9 +203,9 @@ err_peer:
  err:
         ++dev->stats.tx_errors;
         if (skb->protocol == htons(ETH_P_IP))
-               icmp_send(skb, ICMP_DEST_UNREACH, ICMP_HOST_UNREACH, 0);
+               icmp_ndo_send(skb, ICMP_DEST_UNREACH, ICMP_HOST_UNREACH, 0);
         else if (skb->protocol == htons(ETH_P_IPV6))
-               icmpv6_send(skb, ICMPV6_DEST_UNREACH, ICMPV6_ADDR_UNREACH, 0);
+               icmpv6_ndo_send(skb, ICMPV6_DEST_UNREACH, ICMPV6_ADDR_UNREACH, 0);
         kfree_skb(skb);
         return ret;
  }
@@ -258,6 +258,8 @@ static void wg_setup(struct net_device *dev)
         enum { WG_NETDEV_FEATURES = NETIF_F_HW_CSUM | NETIF_F_RXCSUM |
                                     NETIF_F_SG | NETIF_F_GSO |
                                     NETIF_F_GSO_SOFTWARE | NETIF_F_HIGHDMA };
+       const int overhead = MESSAGE_MINIMUM_LENGTH + sizeof(struct udphdr) +
+                            max(sizeof(struct ipv6hdr), sizeof(struct iphdr));
  
         dev->netdev_ops = &netdev_ops;
         dev->hard_header_len = 0;
@@ -271,9 +273,8 @@ static void wg_setup(struct net_device *dev)
         dev->features |= WG_NETDEV_FEATURES;
         dev->hw_features |= WG_NETDEV_FEATURES;
         dev->hw_enc_features |= WG_NETDEV_FEATURES;
-       dev->mtu = ETH_DATA_LEN - MESSAGE_MINIMUM_LENGTH -
-                  sizeof(struct udphdr) -
-                  max(sizeof(struct ipv6hdr), sizeof(struct iphdr));
+       dev->mtu = ETH_DATA_LEN - overhead;
+       dev->max_mtu = round_down(INT_MAX, MESSAGE_PADDING_MULTIPLE) - overhead;
  
         SET_NETDEV_DEVTYPE(dev, &device_type);
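The MTU arithmetic introduced in wg_setup() above can be checked in isolation. This is a rough sketch under stated assumptions: the numeric constants below (a 32-byte minimum data message, a padding multiple of 16, and the usual UDP/IPv4/IPv6 header sizes) are assumed values standing in for the kernel definitions, and the helper names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Assumed stand-ins for the kernel constants used in wg_setup() */
enum {
        ETH_DATA_LEN_ASSUMED     = 1500,
        MSG_MIN_LEN_ASSUMED      = 32,  /* MESSAGE_MINIMUM_LENGTH */
        PADDING_MULTIPLE_ASSUMED = 16,  /* MESSAGE_PADDING_MULTIPLE */
        UDP_HDR_LEN              = 8,
        IPV4_HDR_LEN             = 20,
        IPV6_HDR_LEN             = 40,
};

/* Per-packet encapsulation overhead: message + UDP + larger IP header */
static int wg_overhead(void)
{
        int ip = IPV6_HDR_LEN > IPV4_HDR_LEN ? IPV6_HDR_LEN : IPV4_HDR_LEN;

        return MSG_MIN_LEN_ASSUMED + UDP_HDR_LEN + ip;
}

/* Default MTU: Ethernet payload minus the overhead */
static int wg_default_mtu(void)
{
        return ETH_DATA_LEN_ASSUMED - wg_overhead();
}

/* Upper bound: INT_MAX rounded down to the padding multiple, minus overhead */
static int wg_max_mtu(void)
{
        int rounded = INT_MAX - INT_MAX % PADDING_MULTIPLE_ASSUMED;

        assert(rounded > wg_overhead());
        return rounded - wg_overhead();
}
```

Under these assumptions the default comes out at 1420 bytes, and adding the overhead back to the maximum yields a multiple of the padding unit.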
  
diff --git a/drivers/net/wireguard/receive.c b/drivers/net/wireguard/receive.c
index 9c6bab9c981f4a41d2bf008e0de739d2bd484c02..4a153894cee259504587f88e90e6ce78bfa68ea3 100644
--- a/drivers/net/wireguard/receive.c
+++ b/drivers/net/wireguard/receive.c
@@ -118,10 +118,13 @@ static void wg_receive_handshake_packet(struct wg_device *wg,
  
         under_load = skb_queue_len(&wg->incoming_handshakes) >=
                      MAX_QUEUED_INCOMING_HANDSHAKES / 8;
-       if (under_load)
+       if (under_load) {
                 last_under_load = ktime_get_coarse_boottime_ns();
-       else if (last_under_load)
+       } else if (last_under_load) {
                 under_load = !wg_birthdate_has_expired(last_under_load, 1);
+               if (!under_load)
+                       last_under_load = 0;
+       }
         mac_state = wg_cookie_validate_packet(&wg->cookie_checker, skb,
                                               under_load);
         if ((under_load && mac_state == VALID_MAC_WITH_COOKIE) ||
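The under_load change above adds a reset of last_under_load once the one-second grace period has passed. A minimal user-space sketch of that state machine follows. The function name and the explicit now_ns parameter are hypothetical (the kernel code calls ktime_get_coarse_boottime_ns() and wg_birthdate_has_expired()), and the expiry test is written here as a plain comparison on the assumption that the helper treats "birthdate + 1 s <= now" as expired.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC_ASSUMED 1000000000ULL

/* Returns whether the receiver should be treated as under load.
 * *last_under_load_ns is 0 when no recent load has been recorded.
 */
static bool update_under_load(uint64_t *last_under_load_ns,
                              bool queue_busy, uint64_t now_ns)
{
        assert(last_under_load_ns);

        if (queue_busy) {
                /* Busy now: remember when, and report load */
                *last_under_load_ns = now_ns;
                return true;
        }

        if (*last_under_load_ns) {
                /* Stay "under load" until one second after the last busy time */
                bool under_load =
                        now_ns < *last_under_load_ns + NSEC_PER_SEC_ASSUMED;

                /* Grace period over: clear the timestamp, as the patch does */
                if (!under_load)
                        *last_under_load_ns = 0;
                return under_load;
        }

        return false;
}
```

Clearing the timestamp means later calls take the cheap final branch instead of re-evaluating an already expired birthdate.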
diff --git a/drivers/net/wireguard/send.c b/drivers/net/wireguard/send.c
index c13260563446079ae5d0fca896f1ddf64fdb4c65..7348c10cbae3db54bfcb31f23c2753185735f876 100644
--- a/drivers/net/wireguard/send.c
+++ b/drivers/net/wireguard/send.c
@@ -143,16 +143,22 @@ static void keep_key_fresh(struct wg_peer *peer)
  
  static unsigned int calculate_skb_padding(struct sk_buff *skb)
  {
+       unsigned int padded_size, last_unit = skb->len;
+
+       if (unlikely(!PACKET_CB(skb)->mtu))
+               return ALIGN(last_unit, MESSAGE_PADDING_MULTIPLE) - last_unit;
+
         /* We do this modulo business with the MTU, just in case the networking
          * layer gives us a packet that's bigger than the MTU. In that case, we
          * wouldn't want the final subtraction to overflow in the case of the
-        * padded_size being clamped.
+        * padded_size being clamped. Fortunately, that's very rarely the case,
+        * so we optimize for that not happening.
          */
-       unsigned int last_unit = skb->len % PACKET_CB(skb)->mtu;
-       unsigned int padded_size = ALIGN(last_unit, MESSAGE_PADDING_MULTIPLE);
+       if (unlikely(last_unit > PACKET_CB(skb)->mtu))
+               last_unit %= PACKET_CB(skb)->mtu;
  
-       if (padded_size > PACKET_CB(skb)->mtu)
-               padded_size = PACKET_CB(skb)->mtu;
+       padded_size = min(PACKET_CB(skb)->mtu,
+                         ALIGN(last_unit, MESSAGE_PADDING_MULTIPLE));
         return padded_size - last_unit;
  }
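The reworked calculate_skb_padding() above can be exercised outside the kernel. The sketch below mirrors its three steps (zero-MTU shortcut, modulo only for oversized packets, clamp to the MTU) with plain integers. The function names are hypothetical, and the padding multiple of 16 is an assumed value for MESSAGE_PADDING_MULTIPLE.

```c
#include <assert.h>

#define PADDING_MULTIPLE_ASSUMED 16u

/* Round x up to the next multiple of a (a must be non-zero) */
static unsigned int align_up(unsigned int x, unsigned int a)
{
        assert(a != 0);
        return (x + a - 1) / a * a;
}

/* Number of padding bytes to append to a packet of length len */
static unsigned int skb_padding(unsigned int len, unsigned int mtu)
{
        unsigned int padded_size, last_unit = len;

        /* No MTU recorded: pad to the multiple without clamping */
        if (mtu == 0)
                return align_up(last_unit, PADDING_MULTIPLE_ASSUMED) - last_unit;

        /* Rare case: packet larger than the MTU, consider only the tail */
        if (last_unit > mtu)
                last_unit %= mtu;

        /* Never pad beyond the MTU */
        padded_size = align_up(last_unit, PADDING_MULTIPLE_ASSUMED);
        if (padded_size > mtu)
                padded_size = mtu;
        return padded_size - last_unit;
}
```

For example, a 100-byte packet with an MTU of 1420 gets 12 bytes of padding, while a 1419-byte packet gets only 1 because the padded size is clamped to the MTU.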
  
diff --git a/drivers/net/wireguard/socket.c b/drivers/net/wireguard/socket.c
index 262f3b5c819d5a3c7c3b2691ddc510298dc4b31a..b0d6541582d312eeed1a5a25123cb27380cd038f 100644
--- a/drivers/net/wireguard/socket.c
+++ b/drivers/net/wireguard/socket.c
@@ -432,7 +432,6 @@ void wg_socket_reinit(struct wg_device *wg, struct sock *new4,
                 wg->incoming_port = ntohs(inet_sk(new4)->inet_sport);
         mutex_unlock(&wg->socket_update_lock);
         synchronize_rcu();
-       synchronize_net();
         sock_free(old4);
         sock_free(old6);
  }
diff --git a/drivers/nfc/pn544/i2c.c b/drivers/nfc/pn544/i2c.c
index 720c89d6066ef5567ebb55b72bca566a0b8336e0..4ac8cb262559ce5401dbd6a28e7f09ce0298ded4 100644
--- a/drivers/nfc/pn544/i2c.c
+++ b/drivers/nfc/pn544/i2c.c
@@ -225,6 +225,7 @@ static void pn544_hci_i2c_platform_init(struct pn544_i2c_phy *phy)
  
  out:
         gpiod_set_value_cansleep(phy->gpiod_en, !phy->en_polarity);
+       usleep_range(10000, 15000);
  }
  
  static void pn544_hci_i2c_enable_mode(struct pn544_i2c_phy *phy, int run_mode)
diff --git a/drivers/nfc/pn544/pn544.c b/drivers/nfc/pn544/pn544.c
index 2b83156efe3fff981404e56330ac377430f0f58c..b788870473e85d7d1c89493fce399ee37993b138 100644
--- a/drivers/nfc/pn544/pn544.c
+++ b/drivers/nfc/pn544/pn544.c
@@ -682,7 +682,7 @@ static int pn544_hci_tm_send(struct nfc_hci_dev *hdev, struct sk_buff *skb)
  static int pn544_hci_check_presence(struct nfc_hci_dev *hdev,
                                    struct nfc_target *target)
  {
-       pr_debug("supported protocol %d\b", target->supported_protocols);
+       pr_debug("supported protocol %d\n", target->supported_protocols);
         if (target->supported_protocols & (NFC_PROTO_ISO14443_MASK |
                                         NFC_PROTO_ISO14443_B_MASK)) {
                 return nfc_hci_send_cmd(hdev, target->hci_reader_gate,
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 5dc32b72e7faab7875640106514ab88d4976ebf9..a4d8c90ee7cc4b2d0f74e9a9a0de49a5e6916667 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -66,8 +66,8 @@ MODULE_PARM_DESC(streams, "turn on support for Streams write directives");
   * nvme_reset_wq - hosts nvme reset works
   * nvme_delete_wq - hosts nvme delete works
   *
- * nvme_wq will host works such are scan, aen handling, fw activation,
- * keep-alive error recovery, periodic reconnects etc. nvme_reset_wq
+ * nvme_wq will host works such as scan, aen handling, fw activation,
+ * keep-alive, periodic reconnects etc. nvme_reset_wq
   * runs reset works which also flush works hosted on nvme_wq for
   * serialization purposes. nvme_delete_wq host controller deletion
   * works which flush reset works for serialization.
@@ -976,7 +976,7 @@ static void nvme_keep_alive_end_io(struct request *rq, blk_status_t status)
                 startka = true;
         spin_unlock_irqrestore(&ctrl->lock, flags);
         if (startka)
-               schedule_delayed_work(&ctrl->ka_work, ctrl->kato * HZ);
+               queue_delayed_work(nvme_wq, &ctrl->ka_work, ctrl->kato * HZ);
  }
  
  static int nvme_keep_alive(struct nvme_ctrl *ctrl)
@@ -1006,7 +1006,7 @@ static void nvme_keep_alive_work(struct work_struct *work)
                 dev_dbg(ctrl->device,
                         "reschedule traffic based keep-alive timer\n");
                 ctrl->comp_seen = false;
-               schedule_delayed_work(&ctrl->ka_work, ctrl->kato * HZ);
+               queue_delayed_work(nvme_wq, &ctrl->ka_work, ctrl->kato * HZ);
                 return;
         }
  
@@ -1023,7 +1023,7 @@ static void nvme_start_keep_alive(struct nvme_ctrl *ctrl)
         if (unlikely(ctrl->kato == 0))
                 return;
  
-       schedule_delayed_work(&ctrl->ka_work, ctrl->kato * HZ);
+       queue_delayed_work(nvme_wq, &ctrl->ka_work, ctrl->kato * HZ);
  }
  
  void nvme_stop_keep_alive(struct nvme_ctrl *ctrl)
@@ -1165,8 +1165,8 @@ static int nvme_identify_ns(struct nvme_ctrl *ctrl,
  static int nvme_features(struct nvme_ctrl *dev, u8 op, unsigned int fid,
                 unsigned int dword11, void *buffer, size_t buflen, u32 *result)
  {
+       union nvme_result res = { 0 };
         struct nvme_command c;
-       union nvme_result res;
         int ret;
  
         memset(&c, 0, sizeof(c));
@@ -3867,7 +3867,7 @@ static void nvme_get_fw_slot_info(struct nvme_ctrl *ctrl)
         if (!log)
                 return;
  
-       if (nvme_get_log(ctrl, NVME_NSID_ALL, 0, NVME_LOG_FW_SLOT, log,
+       if (nvme_get_log(ctrl, NVME_NSID_ALL, NVME_LOG_FW_SLOT, 0, log,
                         sizeof(*log), 0))
                 dev_warn(ctrl->device, "Get FW SLOT INFO log error\n");
         kfree(log);
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c

index 797c18337d9621b24b63227f214222d20cf2ee3e..a11900cf3a365ba4d48531b17226a27524dc4bc9 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -715,6 +715,7 @@ int nvme_mpath_init(struct nvme_ctrl *ctrl, struct nvme_id_ctrl *id)
         }
  
         INIT_WORK(&ctrl->ana_work, nvme_ana_work);
+       kfree(ctrl->ana_log_buf);
         ctrl->ana_log_buf = kmalloc(ctrl->ana_log_size, GFP_KERNEL);
         if (!ctrl->ana_log_buf) {
                 error = -ENOMEM;
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c

index da392b50f73e7d28833008d1bf4c6e015309d199..d3f23d6254e47a4d1a2cfd53323a327aca17c013 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1078,9 +1078,9 @@ static int nvme_poll(struct blk_mq_hw_ctx *hctx)
  
         spin_lock(&nvmeq->cq_poll_lock);
         found = nvme_process_cq(nvmeq, &start, &end, -1);
+       nvme_complete_cqes(nvmeq, start, end);
         spin_unlock(&nvmeq->cq_poll_lock);
  
-       nvme_complete_cqes(nvmeq, start, end);
         return found;
  }
  
@@ -1401,6 +1401,23 @@ static void nvme_disable_admin_queue(struct nvme_dev *dev, bool shutdown)
         nvme_poll_irqdisable(nvmeq, -1);
  }
  
+/*
+ * Called only on a device that has been disabled and after all other threads
+ * that can check this device's completion queues have synced. This is the
+ * last chance for the driver to see a natural completion before
+ * nvme_cancel_request() terminates all incomplete requests.
+ */
+static void nvme_reap_pending_cqes(struct nvme_dev *dev)
+{
+       u16 start, end;
+       int i;
+
+       for (i = dev->ctrl.queue_count - 1; i > 0; i--) {
+               nvme_process_cq(&dev->queues[i], &start, &end, -1);
+               nvme_complete_cqes(&dev->queues[i], start, end);
+       }
+}
+
  static int nvme_cmb_qdepth(struct nvme_dev *dev, int nr_io_queues,
                                 int entry_size)
  {
@@ -2235,11 +2252,6 @@ static bool __nvme_disable_io_queues(struct nvme_dev *dev, u8 opcode)
                 if (timeout == 0)
                         return false;
  
-               /* handle any remaining CQEs */
-               if (opcode == nvme_admin_delete_cq &&
-                   !test_bit(NVMEQ_DELETE_ERROR, &nvmeq->flags))
-                       nvme_poll_irqdisable(nvmeq, -1);
-
                 sent--;
                 if (nr_queues)
                         goto retry;
@@ -2428,6 +2440,7 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
         nvme_suspend_io_queues(dev);
         nvme_suspend_queue(&dev->queues[0]);
         nvme_pci_disable(dev);
+       nvme_reap_pending_cqes(dev);
  
         blk_mq_tagset_busy_iter(&dev->tagset, nvme_cancel_request, &dev->ctrl);
         blk_mq_tagset_busy_iter(&dev->admin_tagset, nvme_cancel_request, &dev->ctrl);
@@ -2734,6 +2747,18 @@ static unsigned long check_vendor_combination_bug(struct pci_dev *pdev)
                     (dmi_match(DMI_BOARD_NAME, "PRIME B350M-A") ||
                      dmi_match(DMI_BOARD_NAME, "PRIME Z370-A")))
                         return NVME_QUIRK_NO_APST;
+       } else if ((pdev->vendor == 0x144d && (pdev->device == 0xa801 ||
+                   pdev->device == 0xa808 || pdev->device == 0xa809)) ||
+                  (pdev->vendor == 0x1e0f && pdev->device == 0x0001)) {
+               /*
+                * Forcing to use host managed nvme power settings for
+                * lowest idle power with quick resume latency on
+                * Samsung and Toshiba SSDs based on suspend behavior
+                * on Coffee Lake board for LENOVO C640
+                */
+               if ((dmi_match(DMI_BOARD_VENDOR, "LENOVO")) &&
+                    dmi_match(DMI_BOARD_NAME, "LNVNB161216"))
+                       return NVME_QUIRK_SIMPLE_SUSPEND;
         }
  
         return 0;
@@ -3096,7 +3121,8 @@ static const struct pci_device_id nvme_id_table[] = {
                 .driver_data = NVME_QUIRK_NO_DEEPEST_PS |
                                 NVME_QUIRK_IGNORE_DEV_SUBNQN, },
         { PCI_DEVICE_CLASS(PCI_CLASS_STORAGE_EXPRESS, 0xffffff) },
-       { PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2001) },
+       { PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2001),
+               .driver_data = NVME_QUIRK_SINGLE_VECTOR },
         { PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2003) },
         { PCI_DEVICE(PCI_VENDOR_ID_APPLE, 0x2005),
                 .driver_data = NVME_QUIRK_SINGLE_VECTOR |
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c

index 2a47c6c5007e1280a320f9776afe10005e23b98a..3e85c5cacefd25f742def056fc6d485443a5345f 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -1088,7 +1088,7 @@ static void nvme_rdma_error_recovery(struct nvme_rdma_ctrl *ctrl)
         if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RESETTING))
                 return;
  
-       queue_work(nvme_wq, &ctrl->err_work);
+       queue_work(nvme_reset_wq, &ctrl->err_work);
  }
  
  static void nvme_rdma_wr_error(struct ib_cq *cq, struct ib_wc *wc,
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c

index 6d43b23a0fc8bc15d2870da5fa850802ac9b8356..49d4373b84eb392531d7ce6f60099a41c042907d 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -422,7 +422,7 @@ static void nvme_tcp_error_recovery(struct nvme_ctrl *ctrl)
         if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_RESETTING))
                 return;
  
-       queue_work(nvme_wq, &to_tcp_ctrl(ctrl)->err_work);
+       queue_work(nvme_reset_wq, &to_tcp_ctrl(ctrl)->err_work);
  }
  
  static int nvme_tcp_process_nvme_cqe(struct nvme_tcp_queue *queue,
@@ -1054,7 +1054,12 @@ static void nvme_tcp_io_work(struct work_struct *w)
                 } else if (unlikely(result < 0)) {
                         dev_err(queue->ctrl->ctrl.device,
                                 "failed to send request %d\n", result);
-                       if (result != -EPIPE)
+
+                       /*
+                        * Fail the request unless peer closed the connection,
+                        * in which case error recovery flow will complete all.
+                        */
+                       if ((result != -EPIPE) && (result != -ECONNRESET))
                                 nvme_tcp_fail_request(queue->request);
                         nvme_tcp_done_send_req(queue);
                         return;
diff --git a/drivers/pci/controller/pcie-brcmstb.c b/drivers/pci/controller/pcie-brcmstb.c

index d20aabc26273c4c2ceda98338efb6e439aebb9e5..3a10e678c7f474f631caf7c195bd7889b6f71516 100644
--- a/drivers/pci/controller/pcie-brcmstb.c
+++ b/drivers/pci/controller/pcie-brcmstb.c
@@ -670,7 +670,7 @@ static inline int brcm_pcie_get_rc_bar2_size_and_offset(struct brcm_pcie *pcie,
          *   outbound memory @ 3GB). So instead it will  start at the 1x
          *   multiple of its size
          */
-       if (!*rc_bar2_size || *rc_bar2_offset % *rc_bar2_size ||
+       if (!*rc_bar2_size || (*rc_bar2_offset & (*rc_bar2_size - 1)) ||
             (*rc_bar2_offset < SZ_4G && *rc_bar2_offset > SZ_2G)) {
                 dev_err(dev, "Invalid rc_bar2_offset/size: size 0x%llx, off 0x%llx\n",
                         *rc_bar2_size, *rc_bar2_offset);
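Reviewer note on the hunk above: the new mask test gives the same answer as the old modulo test only when the size is a power of two. The following is a minimal userspace sketch of that equivalence, not driver code. The helper names (is_pow2, is_aligned_mask, is_aligned_mod, masks_agree) are hypothetical and do not appear in pcie-brcmstb.c, and the sketch assumes the size passed in is a nonzero power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when size is a nonzero power of two. */
static bool is_pow2(uint64_t size)
{
	return size != 0 && (size & (size - 1)) == 0;
}

/* Alignment test using a mask, in the style of the added line. */
static bool is_aligned_mask(uint64_t offset, uint64_t size)
{
	return (offset & (size - 1)) == 0;
}

/* Alignment test using the modulo operator, in the style of the removed line. */
static bool is_aligned_mod(uint64_t offset, uint64_t size)
{
	return offset % size == 0;
}

/* Compare both tests over a handful of offsets for one power-of-two size. */
static bool masks_agree(uint64_t size)
{
	uint64_t off;

	assert(is_pow2(size));
	for (off = 0; off < 4 * size; off += size / 4 + 1)
		if (is_aligned_mask(off, size) != is_aligned_mod(off, size))
			return false;
	return true;
}
```

The mask form avoids a 64-bit division, but it would silently give wrong answers for a size that is not a power of two, so a caller would want to establish that property first.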
diff --git a/drivers/perf/arm_smmuv3_pmu.c b/drivers/perf/arm_smmuv3_pmu.c

index d704eccc548f62d2a1317ecb0e972893eb49c1e0..f01a57e5a5f3502320345d582bfaf2c887f64c14 100644
--- a/drivers/perf/arm_smmuv3_pmu.c
+++ b/drivers/perf/arm_smmuv3_pmu.c
@@ -771,7 +771,7 @@ static int smmu_pmu_probe(struct platform_device *pdev)
                 smmu_pmu->reloc_base = smmu_pmu->reg_base;
         }
  
-       irq = platform_get_irq(pdev, 0);
+       irq = platform_get_irq_optional(pdev, 0);
         if (irq > 0)
                 smmu_pmu->irq = irq;
  
diff --git a/drivers/platform/chrome/wilco_ec/properties.c b/drivers/platform/chrome/wilco_ec/properties.c

index e69682c95ea2d7f3a0d678b3c7bd3384aa905a7b..62f27610dd33a18cb8e67922489e6811c284eae7 100644
--- a/drivers/platform/chrome/wilco_ec/properties.c
+++ b/drivers/platform/chrome/wilco_ec/properties.c
@@ -5,7 +5,7 @@
  
  #include <linux/platform_data/wilco-ec.h>
  #include <linux/string.h>
-#include <linux/unaligned/le_memmove.h>
+#include <asm/unaligned.h>
  
  /* Operation code; what the EC should do with the property */
  enum ec_property_op {
diff --git a/drivers/s390/cio/blacklist.c b/drivers/s390/cio/blacklist.c

index da642e811f7f06853ed77defd599afa0eab98a37..4dd2eb6348569965dc4df362e8a6a6f689976f3a 100644
--- a/drivers/s390/cio/blacklist.c
+++ b/drivers/s390/cio/blacklist.c
@@ -303,8 +303,10 @@ static void *
  cio_ignore_proc_seq_next(struct seq_file *s, void *it, loff_t *offset)
  {
         struct ccwdev_iter *iter;
+       loff_t p = *offset;
  
-       if (*offset >= (__MAX_SUBCHANNEL + 1) * (__MAX_SSID + 1))
+       (*offset)++;
+       if (p >= (__MAX_SUBCHANNEL + 1) * (__MAX_SSID + 1))
                 return NULL;
         iter = it;
         if (iter->devno == __MAX_SUBCHANNEL) {
@@ -314,7 +316,6 @@ cio_ignore_proc_seq_next(struct seq_file *s, void *it, loff_t *offset)
                         return NULL;
         } else
                 iter->devno++;
-       (*offset)++;
         return iter;
  }
  
diff --git a/drivers/s390/cio/chp.c b/drivers/s390/cio/chp.c

index 51038ec309c12ef3f3eae351d332ed7a13694eda..dfcbe54591fbda24eb32a4ead5c7c36ec7fd8162 100644
--- a/drivers/s390/cio/chp.c
+++ b/drivers/s390/cio/chp.c
@@ -135,7 +135,7 @@ static ssize_t chp_measurement_chars_read(struct file *filp,
         struct channel_path *chp;
         struct device *device;
  
-       device = container_of(kobj, struct device, kobj);
+       device = kobj_to_dev(kobj);
         chp = to_channelpath(device);
         if (chp->cmg == -1)
                 return 0;
@@ -184,7 +184,7 @@ static ssize_t chp_measurement_read(struct file *filp, struct kobject *kobj,
         struct device *device;
         unsigned int size;
  
-       device = container_of(kobj, struct device, kobj);
+       device = kobj_to_dev(kobj);
         chp = to_channelpath(device);
         css = to_css(chp->dev.parent);
  
diff --git a/drivers/s390/cio/qdio.h b/drivers/s390/cio/qdio.h

index 4b0798472643105aca0942671e0badc225bb549d..ff74eb5fce5024512e438e99cefa7bf7a5f31a27 100644
--- a/drivers/s390/cio/qdio.h
+++ b/drivers/s390/cio/qdio.h
@@ -182,11 +182,9 @@ enum qdio_queue_irq_states {
  };
  
  struct qdio_input_q {
-       /* input buffer acknowledgement flag */
-       int polling;
         /* first ACK'ed buffer */
         int ack_start;
-       /* how much sbals are acknowledged with qebsm */
+       /* how many SBALs are acknowledged */
         int ack_count;
         /* last time of noticing incoming data */
         u64 timestamp;
diff --git a/drivers/s390/cio/qdio_debug.c b/drivers/s390/cio/qdio_debug.c

index 35410e6eda2eaaf0de6bf16a9d942cf870eec940..9c0370b27426213fbec0da3732ee872323dcc3b5 100644
--- a/drivers/s390/cio/qdio_debug.c
+++ b/drivers/s390/cio/qdio_debug.c
@@ -124,9 +124,8 @@ static int qstat_show(struct seq_file *m, void *v)
         seq_printf(m, "nr_used: %d  ftc: %d\n",
                    atomic_read(&q->nr_buf_used), q->first_to_check);
         if (q->is_input_q) {
-               seq_printf(m, "polling: %d  ack start: %d  ack count: %d\n",
-                          q->u.in.polling, q->u.in.ack_start,
-                          q->u.in.ack_count);
+               seq_printf(m, "ack start: %d  ack count: %d\n",
+                          q->u.in.ack_start, q->u.in.ack_count);
                 seq_printf(m, "DSCI: %x   IRQs disabled: %u\n",
                            *(u8 *)q->irq_ptr->dsci,
                            test_bit(QDIO_QUEUE_IRQS_DISABLED,
diff --git a/drivers/s390/cio/qdio_main.c b/drivers/s390/cio/qdio_main.c

index f8b897b7e78b4204ec21f486e765e2f3ae19db84..3475317c42e5ace277cb41019a4c540fd41d91b1 100644
--- a/drivers/s390/cio/qdio_main.c
+++ b/drivers/s390/cio/qdio_main.c
@@ -393,19 +393,15 @@ int debug_get_buf_state(struct qdio_q *q, unsigned int bufnr,
  
  static inline void qdio_stop_polling(struct qdio_q *q)
  {
-       if (!q->u.in.polling)
+       if (!q->u.in.ack_count)
                 return;
  
-       q->u.in.polling = 0;
         qperf_inc(q, stop_polling);
  
         /* show the card that we are not polling anymore */
-       if (is_qebsm(q)) {
-               set_buf_states(q, q->u.in.ack_start, SLSB_P_INPUT_NOT_INIT,
-                              q->u.in.ack_count);
-               q->u.in.ack_count = 0;
-       } else
-               set_buf_state(q, q->u.in.ack_start, SLSB_P_INPUT_NOT_INIT);
+       set_buf_states(q, q->u.in.ack_start, SLSB_P_INPUT_NOT_INIT,
+                      q->u.in.ack_count);
+       q->u.in.ack_count = 0;
  }
  
  static inline void account_sbals(struct qdio_q *q, unsigned int count)
@@ -451,8 +447,7 @@ static inline void inbound_primed(struct qdio_q *q, unsigned int start,
  
         /* for QEBSM the ACK was already set by EQBS */
         if (is_qebsm(q)) {
-               if (!q->u.in.polling) {
-                       q->u.in.polling = 1;
+               if (!q->u.in.ack_count) {
                         q->u.in.ack_count = count;
                         q->u.in.ack_start = start;
                         return;
@@ -471,12 +466,12 @@ static inline void inbound_primed(struct qdio_q *q, unsigned int start,
          * or by the next inbound run.
          */
         new = add_buf(start, count - 1);
-       if (q->u.in.polling) {
+       if (q->u.in.ack_count) {
                 /* reset the previous ACK but first set the new one */
                 set_buf_state(q, new, SLSB_P_INPUT_ACK);
                 set_buf_state(q, q->u.in.ack_start, SLSB_P_INPUT_NOT_INIT);
         } else {
-               q->u.in.polling = 1;
+               q->u.in.ack_count = 1;
                 set_buf_state(q, new, SLSB_P_INPUT_ACK);
         }
  
@@ -1479,13 +1474,12 @@ static int handle_inbound(struct qdio_q *q, unsigned int callflags,
  
         qperf_inc(q, inbound_call);
  
-       if (!q->u.in.polling)
+       if (!q->u.in.ack_count)
                 goto set;
  
         /* protect against stop polling setting an ACK for an emptied slsb */
         if (count == QDIO_MAX_BUFFERS_PER_Q) {
                 /* overwriting everything, just delete polling status */
-               q->u.in.polling = 0;
                 q->u.in.ack_count = 0;
                 goto set;
         } else if (buf_in_between(q->u.in.ack_start, bufnr, count)) {
@@ -1495,15 +1489,14 @@ static int handle_inbound(struct qdio_q *q, unsigned int callflags,
                         diff = sub_buf(diff, q->u.in.ack_start);
                         q->u.in.ack_count -= diff;
                         if (q->u.in.ack_count <= 0) {
-                               q->u.in.polling = 0;
                                 q->u.in.ack_count = 0;
                                 goto set;
                         }
                         q->u.in.ack_start = add_buf(q->u.in.ack_start, diff);
+               } else {
+                       /* the only ACK will be deleted */
+                       q->u.in.ack_count = 0;
                 }
-               else
-                       /* the only ACK will be deleted, so stop polling */
-                       q->u.in.polling = 0;
         }
  
  set:
diff --git a/drivers/s390/cio/qdio_setup.c b/drivers/s390/cio/qdio_setup.c

index dc430bd86ade93039579f2d0d32e370857dfab40..e115623b86b298267e34d09d4da37366421d321c 100644
--- a/drivers/s390/cio/qdio_setup.c
+++ b/drivers/s390/cio/qdio_setup.c
@@ -8,6 +8,7 @@
  #include <linux/kernel.h>
  #include <linux/slab.h>
  #include <linux/export.h>
+#include <linux/io.h>
  #include <asm/qdio.h>
  
  #include "cio.h"
@@ -205,7 +206,7 @@ static void setup_storage_lists(struct qdio_q *q, struct qdio_irq *irq_ptr,
  
         /* fill in sl */
         for (j = 0; j < QDIO_MAX_BUFFERS_PER_Q; j++)
-               q->sl->element[j].sbal = (unsigned long)q->sbal[j];
+               q->sl->element[j].sbal = virt_to_phys(q->sbal[j]);
  }
  
  static void setup_queues(struct qdio_irq *irq_ptr,
@@ -536,7 +537,7 @@ void qdio_print_subchannel_info(struct qdio_irq *irq_ptr,
  int qdio_enable_async_operation(struct qdio_output_q *outq)
  {
         outq->aobs = kcalloc(QDIO_MAX_BUFFERS_PER_Q, sizeof(struct qaob *),
-                            GFP_ATOMIC);
+                            GFP_KERNEL);
         if (!outq->aobs) {
                 outq->use_cq = 0;
                 return -ENOMEM;
diff --git a/drivers/s390/cio/vfio_ccw_trace.h b/drivers/s390/cio/vfio_ccw_trace.h

index 30162a318a8a12e4860919e7c6a50b040f265379..f5d31887d41341a8c99f8d301b9e25afc21ec029 100644
--- a/drivers/s390/cio/vfio_ccw_trace.h
+++ b/drivers/s390/cio/vfio_ccw_trace.h
@@ -1,5 +1,5 @@
-/* SPDX-License-Identifier: GPL-2.0
- * Tracepoints for vfio_ccw driver
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Tracepoints for vfio_ccw driver
   *
   * Copyright IBM Corp. 2018
   *
diff --git a/drivers/s390/crypto/ap_bus.h b/drivers/s390/crypto/ap_bus.h

index bb35ba4a8d243955545596908170fbdd10620400..4348fdff1c61ec95f1e81526361fee3bd6b550b5 100644
--- a/drivers/s390/crypto/ap_bus.h
+++ b/drivers/s390/crypto/ap_bus.h
@@ -162,7 +162,7 @@ struct ap_card {
         unsigned int functions;         /* AP device function bitfield. */
         int queue_depth;                /* AP queue depth.*/
         int id;                         /* AP card number. */
-       atomic_t total_request_count;   /* # requests ever for this AP device.*/
+       atomic64_t total_request_count; /* # requests ever for this AP device.*/
  };
  
  #define to_ap_card(x) container_of((x), struct ap_card, ap_dev.device)
@@ -179,7 +179,7 @@ struct ap_queue {
         enum ap_state state;            /* State of the AP device. */
         int pendingq_count;             /* # requests on pendingq list. */
         int requestq_count;             /* # requests on requestq list. */
-       int total_request_count;        /* # requests ever for this AP device.*/
+       u64 total_request_count;        /* # requests ever for this AP device.*/
         int request_timeout;            /* Request timeout in jiffies. */
         struct timer_list timeout;      /* Timer for request timeouts. */
         struct list_head pendingq;      /* List of message sent to AP queue. */
diff --git a/drivers/s390/crypto/ap_card.c b/drivers/s390/crypto/ap_card.c

index 63b4cc6cd7e596f7b51681391eb17ddffb6d0512..e85bfca1ed163575989cae3e8120b235369ba0b4 100644
--- a/drivers/s390/crypto/ap_card.c
+++ b/drivers/s390/crypto/ap_card.c
@@ -63,13 +63,13 @@ static ssize_t request_count_show(struct device *dev,
                                   char *buf)
  {
         struct ap_card *ac = to_ap_card(dev);
-       unsigned int req_cnt;
+       u64 req_cnt;
  
         req_cnt = 0;
         spin_lock_bh(&ap_list_lock);
-       req_cnt = atomic_read(&ac->total_request_count);
+       req_cnt = atomic64_read(&ac->total_request_count);
         spin_unlock_bh(&ap_list_lock);
-       return snprintf(buf, PAGE_SIZE, "%d\n", req_cnt);
+       return snprintf(buf, PAGE_SIZE, "%llu\n", req_cnt);
  }
  
  static ssize_t request_count_store(struct device *dev,
@@ -83,7 +83,7 @@ static ssize_t request_count_store(struct device *dev,
         for_each_ap_queue(aq, ac)
                 aq->total_request_count = 0;
         spin_unlock_bh(&ap_list_lock);
-       atomic_set(&ac->total_request_count, 0);
+       atomic64_set(&ac->total_request_count, 0);
  
         return count;
  }
diff --git a/drivers/s390/crypto/ap_queue.c b/drivers/s390/crypto/ap_queue.c

index 37c3bdc3642dc6756dbcf10a6a34f2d9974fb930..a317ab48493203e8082b0acc1a57fdd8fe62524b 100644
--- a/drivers/s390/crypto/ap_queue.c
+++ b/drivers/s390/crypto/ap_queue.c
@@ -479,12 +479,12 @@ static ssize_t request_count_show(struct device *dev,
                                   char *buf)
  {
         struct ap_queue *aq = to_ap_queue(dev);
-       unsigned int req_cnt;
+       u64 req_cnt;
  
         spin_lock_bh(&aq->lock);
         req_cnt = aq->total_request_count;
         spin_unlock_bh(&aq->lock);
-       return snprintf(buf, PAGE_SIZE, "%d\n", req_cnt);
+       return snprintf(buf, PAGE_SIZE, "%llu\n", req_cnt);
  }
  
  static ssize_t request_count_store(struct device *dev,
@@ -676,7 +676,7 @@ void ap_queue_message(struct ap_queue *aq, struct ap_message *ap_msg)
         list_add_tail(&ap_msg->list, &aq->requestq);
         aq->requestq_count++;
         aq->total_request_count++;
-       atomic_inc(&aq->card->total_request_count);
+       atomic64_inc(&aq->card->total_request_count);
         /* Send/receive as many request from the queue as possible. */
         ap_wait(ap_sm_event_loop(aq, AP_EVENT_POLL));
         spin_unlock_bh(&aq->lock);
diff --git a/drivers/s390/crypto/pkey_api.c b/drivers/s390/crypto/pkey_api.c

index 71dae64ba99491c66bd5897c662d1d5a3910b7c9..2f33c5fcf676d1241139067d1336f78289e78702 100644
--- a/drivers/s390/crypto/pkey_api.c
+++ b/drivers/s390/crypto/pkey_api.c
@@ -994,7 +994,7 @@ static long pkey_unlocked_ioctl(struct file *filp, unsigned int cmd,
                         return -EFAULT;
                 rc = cca_sec2protkey(ksp.cardnr, ksp.domain,
                                      ksp.seckey.seckey, ksp.protkey.protkey,
-                                    NULL, &ksp.protkey.type);
+                                    &ksp.protkey.len, &ksp.protkey.type);
                 DEBUG_DBG("%s cca_sec2protkey()=%d\n", __func__, rc);
                 if (rc)
                         break;
diff --git a/drivers/s390/crypto/zcrypt_api.c b/drivers/s390/crypto/zcrypt_api.c

index a42257d6c79e8eb60bc94de510229e207535e0b7..56a405dce8bcf32c468443d5249ce466285d999c 100644
--- a/drivers/s390/crypto/zcrypt_api.c
+++ b/drivers/s390/crypto/zcrypt_api.c
@@ -606,8 +606,8 @@ static inline bool zcrypt_card_compare(struct zcrypt_card *zc,
         weight += atomic_read(&zc->load);
         pref_weight += atomic_read(&pref_zc->load);
         if (weight == pref_weight)
-               return atomic_read(&zc->card->total_request_count) >
-                       atomic_read(&pref_zc->card->total_request_count);
+               return atomic64_read(&zc->card->total_request_count) >
+                       atomic64_read(&pref_zc->card->total_request_count);
         return weight > pref_weight;
  }
  
@@ -1226,11 +1226,12 @@ static void zcrypt_qdepth_mask(char qdepth[], size_t max_adapters)
         spin_unlock(&zcrypt_list_lock);
  }
  
-static void zcrypt_perdev_reqcnt(int reqcnt[], size_t max_adapters)
+static void zcrypt_perdev_reqcnt(u32 reqcnt[], size_t max_adapters)
  {
         struct zcrypt_card *zc;
         struct zcrypt_queue *zq;
         int card;
+       u64 cnt;
  
         memset(reqcnt, 0, sizeof(int) * max_adapters);
         spin_lock(&zcrypt_list_lock);
@@ -1242,8 +1243,9 @@ static void zcrypt_perdev_reqcnt(int reqcnt[], size_t max_adapters)
                             || card >= max_adapters)
                                 continue;
                         spin_lock(&zq->queue->lock);
-                       reqcnt[card] = zq->queue->total_request_count;
+                       cnt = zq->queue->total_request_count;
                         spin_unlock(&zq->queue->lock);
+                       reqcnt[card] = (cnt < UINT_MAX) ? (u32) cnt : UINT_MAX;
                 }
         }
         local_bh_enable();
@@ -1421,9 +1423,9 @@ static long zcrypt_unlocked_ioctl(struct file *filp, unsigned int cmd,
                 return 0;
         }
         case ZCRYPT_PERDEV_REQCNT: {
-               int *reqcnt;
+               u32 *reqcnt;
  
-               reqcnt = kcalloc(AP_DEVICES, sizeof(int), GFP_KERNEL);
+               reqcnt = kcalloc(AP_DEVICES, sizeof(u32), GFP_KERNEL);
                 if (!reqcnt)
                         return -ENOMEM;
                 zcrypt_perdev_reqcnt(reqcnt, AP_DEVICES);
@@ -1480,7 +1482,7 @@ static long zcrypt_unlocked_ioctl(struct file *filp, unsigned int cmd,
         }
         case Z90STAT_PERDEV_REQCNT: {
                 /* the old ioctl supports only 64 adapters */
-               int reqcnt[MAX_ZDEV_CARDIDS];
+               u32 reqcnt[MAX_ZDEV_CARDIDS];
  
                 zcrypt_perdev_reqcnt(reqcnt, MAX_ZDEV_CARDIDS);
                 if (copy_to_user((int __user *) arg, reqcnt, sizeof(reqcnt)))
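Reviewer note on the zcrypt_api.c hunks above: the per-queue counter is now 64 bits wide while the ioctl array stays 32 bits wide, so the patch clamps each value instead of letting it wrap. The following is a minimal userspace sketch of that saturating narrowing, not kernel code. The names clamp_u64_to_u32 and fill_reqcnt are hypothetical, and UINT32_MAX stands in for the kernel's UINT_MAX on the assumption that unsigned int is 32 bits wide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Narrow a 64-bit counter to 32 bits, saturating instead of wrapping. */
static uint32_t clamp_u64_to_u32(uint64_t cnt)
{
	return (cnt < UINT32_MAX) ? (uint32_t)cnt : UINT32_MAX;
}

/* Hypothetical helper: copy n wide counters into a 32-bit report array. */
static void fill_reqcnt(uint32_t *out, const uint64_t *in, size_t n)
{
	size_t i;

	assert(out != NULL && in != NULL);
	for (i = 0; i < n; i++)
		out[i] = clamp_u64_to_u32(in[i]);
}
```

With a plain cast, a counter of 2^32 would be reported as 0. With the clamp, it stays pinned at the maximum 32-bit value, which at least signals "very large" to existing userspace.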
diff --git a/drivers/s390/crypto/zcrypt_ep11misc.c b/drivers/s390/crypto/zcrypt_ep11misc.c

index d4caf46ff9df155963bf7d01a431212e70fd53d1..2afe2153b34e32e9bc9b25b8a8026d0a0db01be2 100644
--- a/drivers/s390/crypto/zcrypt_ep11misc.c
+++ b/drivers/s390/crypto/zcrypt_ep11misc.c
@@ -887,7 +887,7 @@ static int ep11_unwrapkey(u16 card, u16 domain,
         /* empty pin tag */
         *p++ = 0x04;
         *p++ = 0;
-       /* encrytped key value tag and bytes */
+       /* encrypted key value tag and bytes */
         p += asn1tag_write(p, 0x04, enckey, enckeysize);
  
         /* reply cprb and payload */
@@ -1095,7 +1095,7 @@ int ep11_clr2keyblob(u16 card, u16 domain, u32 keybitsize, u32 keygenflags,
  
         /* Step 1: generate AES 256 bit random kek key */
         rc = ep11_genaeskey(card, domain, 256,
-                           0x00006c00, /* EN/DECRYTP, WRAP/UNWRAP */
+                           0x00006c00, /* EN/DECRYPT, WRAP/UNWRAP */
                             kek, &keklen);
         if (rc) {
                 DEBUG_ERR(
diff --git a/drivers/s390/net/qeth_core_main.c b/drivers/s390/net/qeth_core_main.c

index 9639938581f58e3b0c4151260ce34046b49eeced..8ca85c8a01a15cd000fd1b8b2970ce28207256c3 100644
--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -1128,9 +1128,10 @@ static void qeth_clear_output_buffer(struct qeth_qdio_out_q *queue,
         qeth_tx_complete_buf(buf, error, budget);
  
         for (i = 0; i < queue->max_elements; ++i) {
-               if (buf->buffer->element[i].addr && buf->is_header[i])
-                       kmem_cache_free(qeth_core_header_cache,
-                               buf->buffer->element[i].addr);
+               void *data = phys_to_virt(buf->buffer->element[i].addr);
+
+               if (data && buf->is_header[i])
+                       kmem_cache_free(qeth_core_header_cache, data);
                 buf->is_header[i] = 0;
         }
  
@@ -2641,7 +2642,8 @@ static int qeth_init_input_buffer(struct qeth_card *card,
         buf->pool_entry = pool_entry;
         for (i = 0; i < QETH_MAX_BUFFER_ELEMENTS(card); ++i) {
                 buf->buffer->element[i].length = PAGE_SIZE;
-               buf->buffer->element[i].addr =  pool_entry->elements[i];
+               buf->buffer->element[i].addr =
+                       virt_to_phys(pool_entry->elements[i]);
                 if (i == QETH_MAX_BUFFER_ELEMENTS(card) - 1)
                         buf->buffer->element[i].eflags = SBAL_EFLAGS_LAST_ENTRY;
                 else
@@ -3459,9 +3461,8 @@ static void qeth_qdio_cq_handler(struct qeth_card *card, unsigned int qdio_err,
  
                 while ((e < QDIO_MAX_ELEMENTS_PER_BUFFER) &&
                        buffer->element[e].addr) {
-                       unsigned long phys_aob_addr;
+                       unsigned long phys_aob_addr = buffer->element[e].addr;
  
-                       phys_aob_addr = (unsigned long) buffer->element[e].addr;
                         qeth_qdio_handle_aob(card, phys_aob_addr);
                         ++e;
                 }
@@ -3750,7 +3751,7 @@ static unsigned int __qeth_fill_buffer(struct sk_buff *skb,
                 elem_length = min_t(unsigned int, length,
                                     PAGE_SIZE - offset_in_page(data));
  
-               buffer->element[element].addr = data;
+               buffer->element[element].addr = virt_to_phys(data);
                 buffer->element[element].length = elem_length;
                 length -= elem_length;
                 if (is_first_elem) {
@@ -3780,7 +3781,7 @@ static unsigned int __qeth_fill_buffer(struct sk_buff *skb,
                         elem_length = min_t(unsigned int, length,
                                             PAGE_SIZE - offset_in_page(data));
  
-                       buffer->element[element].addr = data;
+                       buffer->element[element].addr = virt_to_phys(data);
                         buffer->element[element].length = elem_length;
                         buffer->element[element].eflags =
                                 SBAL_EFLAGS_MIDDLE_FRAG;
@@ -3820,7 +3821,7 @@ static unsigned int qeth_fill_buffer(struct qeth_qdio_out_buffer *buf,
                 int element = buf->next_element_to_fill;
                 is_first_elem = false;
  
-               buffer->element[element].addr = hdr;
+               buffer->element[element].addr = virt_to_phys(hdr);
                 buffer->element[element].length = hd_len;
                 buffer->element[element].eflags = SBAL_EFLAGS_FIRST_FRAG;
                 /* remember to free cache-allocated qeth_hdr: */
@@ -4746,10 +4747,10 @@ static void qeth_qdio_establish_cq(struct qeth_card *card,
         if (card->options.cq == QETH_CQ_ENABLED) {
                 int offset = QDIO_MAX_BUFFERS_PER_Q *
                              (card->qdio.no_in_queues - 1);
-               for (i = 0; i < QDIO_MAX_BUFFERS_PER_Q; ++i) {
-                       in_sbal_ptrs[offset + i] = (struct qdio_buffer *)
-                               virt_to_phys(card->qdio.c_q->bufs[i].buffer);
-               }
+
+               for (i = 0; i < QDIO_MAX_BUFFERS_PER_Q; i++)
+                       in_sbal_ptrs[offset + i] =
+                               card->qdio.c_q->bufs[i].buffer;
  
                 queue_start_poll[card->qdio.no_in_queues - 1] = NULL;
         }
@@ -4783,10 +4784,9 @@ static int qeth_qdio_establish(struct qeth_card *card)
                 rc = -ENOMEM;
                 goto out_free_qib_param;
         }
-       for (i = 0; i < QDIO_MAX_BUFFERS_PER_Q; ++i) {
-               in_sbal_ptrs[i] = (struct qdio_buffer *)
-                       virt_to_phys(card->qdio.in_q->bufs[i].buffer);
-       }
+
+       for (i = 0; i < QDIO_MAX_BUFFERS_PER_Q; i++)
+               in_sbal_ptrs[i] = card->qdio.in_q->bufs[i].buffer;
  
         queue_start_poll = kcalloc(card->qdio.no_in_queues, sizeof(void *),
                                    GFP_KERNEL);
@@ -4807,11 +4807,11 @@ static int qeth_qdio_establish(struct qeth_card *card)
                 rc = -ENOMEM;
                 goto out_free_queue_start_poll;
         }
+
         for (i = 0, k = 0; i < card->qdio.no_out_queues; ++i)
-               for (j = 0; j < QDIO_MAX_BUFFERS_PER_Q; ++j, ++k) {
-                       out_sbal_ptrs[k] = (struct qdio_buffer *)virt_to_phys(
-                               card->qdio.out_qs[i]->bufs[j]->buffer);
-               }
+               for (j = 0; j < QDIO_MAX_BUFFERS_PER_Q; j++, k++)
+                       out_sbal_ptrs[k] =
+                               card->qdio.out_qs[i]->bufs[j]->buffer;
  
         memset(&init_data, 0, sizeof(struct qdio_initialize));
         init_data.cdev                   = CARD_DDEV(card);
@@ -5289,7 +5289,7 @@ next_packet:
                 offset = 0;
         }
  
-       hdr = element->addr + offset;
+       hdr = phys_to_virt(element->addr) + offset;
         offset += sizeof(*hdr);
         skb = NULL;
  
@@ -5344,7 +5344,7 @@ next_packet:
         }
  
         use_rx_sg = (card->options.cq == QETH_CQ_ENABLED) ||
-                   ((skb_len >= card->options.rx_sg_cb) &&
+                   (skb_len > card->options.rx_sg_cb &&
                      !atomic_read(&card->force_alloc_skb) &&
                      !IS_OSN(card));
  
@@ -5388,7 +5388,7 @@ use_skb:
  walk_packet:
         while (skb_len) {
                 int data_len = min(skb_len, (int)(element->length - offset));
-               char *data = element->addr + offset;
+               char *data = phys_to_virt(element->addr) + offset;
  
                 skb_len -= data_len;
                 offset += data_len;
@@ -5447,7 +5447,6 @@ static int qeth_extract_skbs(struct qeth_card *card, int budget,
  {
         int work_done = 0;
  
-       WARN_ON_ONCE(!budget);
         *done = false;
  
         while (budget) {
diff --git a/drivers/s390/net/qeth_l2_main.c b/drivers/s390/net/qeth_l2_main.c
index 692bd26234018f2a260fa16d3085616390a6fab9..9972d96820f3ffbf97513419a19397f75631e8bf 100644
--- a/drivers/s390/net/qeth_l2_main.c
+++ b/drivers/s390/net/qeth_l2_main.c
@@ -1707,15 +1707,14 @@ int qeth_l2_vnicc_set_state(struct qeth_card *card, u32 vnicc, bool state)
  
         QETH_CARD_TEXT(card, 2, "vniccsch");
  
-       /* do not change anything if BridgePort is enabled */
-       if (qeth_bridgeport_is_in_use(card))
-               return -EBUSY;
-
         /* check if characteristic and enable/disable are supported */
         if (!(card->options.vnicc.sup_chars & vnicc) ||
             !(card->options.vnicc.set_char_sup & vnicc))
                 return -EOPNOTSUPP;
  
+       if (qeth_bridgeport_is_in_use(card))
+               return -EBUSY;
+
         /* set enable/disable command and store wanted characteristic */
         if (state) {
                 cmd = IPA_VNICC_ENABLE;
@@ -1761,14 +1760,13 @@ int qeth_l2_vnicc_get_state(struct qeth_card *card, u32 vnicc, bool *state)
  
         QETH_CARD_TEXT(card, 2, "vniccgch");
  
-       /* do not get anything if BridgePort is enabled */
-       if (qeth_bridgeport_is_in_use(card))
-               return -EBUSY;
-
         /* check if characteristic is supported */
         if (!(card->options.vnicc.sup_chars & vnicc))
                 return -EOPNOTSUPP;
  
+       if (qeth_bridgeport_is_in_use(card))
+               return -EBUSY;
+
         /* if card is ready, query current VNICC state */
         if (qeth_card_hw_is_reachable(card))
                 rc = qeth_l2_vnicc_query_chars(card);
@@ -1786,15 +1784,14 @@ int qeth_l2_vnicc_set_timeout(struct qeth_card *card, u32 timeout)
  
         QETH_CARD_TEXT(card, 2, "vniccsto");
  
-       /* do not change anything if BridgePort is enabled */
-       if (qeth_bridgeport_is_in_use(card))
-               return -EBUSY;
-
         /* check if characteristic and set_timeout are supported */
         if (!(card->options.vnicc.sup_chars & QETH_VNICC_LEARNING) ||
             !(card->options.vnicc.getset_timeout_sup & QETH_VNICC_LEARNING))
                 return -EOPNOTSUPP;
  
+       if (qeth_bridgeport_is_in_use(card))
+               return -EBUSY;
+
         /* do we need to do anything? */
         if (card->options.vnicc.learning_timeout == timeout)
                 return rc;
@@ -1823,14 +1820,14 @@ int qeth_l2_vnicc_get_timeout(struct qeth_card *card, u32 *timeout)
  
         QETH_CARD_TEXT(card, 2, "vniccgto");
  
-       /* do not get anything if BridgePort is enabled */
-       if (qeth_bridgeport_is_in_use(card))
-               return -EBUSY;
-
         /* check if characteristic and get_timeout are supported */
         if (!(card->options.vnicc.sup_chars & QETH_VNICC_LEARNING) ||
             !(card->options.vnicc.getset_timeout_sup & QETH_VNICC_LEARNING))
                 return -EOPNOTSUPP;
+
+       if (qeth_bridgeport_is_in_use(card))
+               return -EBUSY;
+
         /* if card is ready, get timeout. Otherwise, just return stored value */
         *timeout = card->options.vnicc.learning_timeout;
         if (qeth_card_hw_is_reachable(card))
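
Note on the four qeth_l2_main.c hunks above: each moves the BridgePort check below the capability check, so a request for an unsupported characteristic now gets -EOPNOTSUPP even while BridgePort is in use. A minimal user-space sketch of that ordering follows. The struct and function names are hypothetical stand-ins, not the driver's own types.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the card state the driver consults. */
struct card_sketch {
	uint32_t sup_chars;      /* bitmask of supported characteristics */
	bool bridgeport_in_use;  /* true while BridgePort is active */
};

/*
 * Same ordering as the patched functions: capability first, busy second.
 * Returns 0 when the request may proceed.
 */
static int vnicc_precheck(const struct card_sketch *card, uint32_t vnicc)
{
	if (!(card->sup_chars & vnicc))
		return -EOPNOTSUPP;
	if (card->bridgeport_in_use)
		return -EBUSY;
	return 0;
}
```

With this ordering a caller can tell "never supported" apart from "supported, but blocked right now" without first disabling BridgePort.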
diff --git a/drivers/s390/scsi/zfcp_fsf.c b/drivers/s390/scsi/zfcp_fsf.c
index 223a805f0b0bf4bcef3da55d73c86f9705c72463..cae9b7ff79b086e565d63ee8c74122a9dec8a377 100644
--- a/drivers/s390/scsi/zfcp_fsf.c
+++ b/drivers/s390/scsi/zfcp_fsf.c
@@ -2510,7 +2510,7 @@ void zfcp_fsf_reqid_check(struct zfcp_qdio *qdio, int sbal_idx)
         for (idx = 0; idx < QDIO_MAX_ELEMENTS_PER_BUFFER; idx++) {
  
                 sbale = &sbal->element[idx];
-               req_id = (unsigned long) sbale->addr;
+               req_id = sbale->addr;
                 fsf_req = zfcp_reqlist_find_rm(adapter->req_list, req_id);
  
                 if (!fsf_req) {
diff --git a/drivers/s390/scsi/zfcp_fsf.h b/drivers/s390/scsi/zfcp_fsf.h
index 2b1e4da1944f572acc88a29726245e9934d00d72..4bfb79f20588b24f00c81e9d094b75a507bb7068 100644
--- a/drivers/s390/scsi/zfcp_fsf.h
+++ b/drivers/s390/scsi/zfcp_fsf.h
@@ -410,7 +410,7 @@ struct fsf_qtcb_bottom_port {
         u8 cb_util;
         u8 a_util;
         u8 res2;
-       u16 temperature;
+       s16 temperature;
         u16 vcc;
         u16 tx_bias;
         u16 tx_power;
diff --git a/drivers/s390/scsi/zfcp_qdio.c b/drivers/s390/scsi/zfcp_qdio.c
index 661436a92f8e69903200a568304276b3f3b1b12f..f0d6296e673b46ae27c24ae5dd873f466c58e2b4 100644
--- a/drivers/s390/scsi/zfcp_qdio.c
+++ b/drivers/s390/scsi/zfcp_qdio.c
@@ -98,7 +98,7 @@ static void zfcp_qdio_int_resp(struct ccw_device *cdev, unsigned int qdio_err,
                         memset(pl, 0,
                                ZFCP_QDIO_MAX_SBALS_PER_REQ * sizeof(void *));
                         sbale = qdio->res_q[idx]->element;
-                       req_id = (u64) sbale->addr;
+                       req_id = sbale->addr;
                         scount = min(sbale->scount + 1,
                                      ZFCP_QDIO_MAX_SBALS_PER_REQ + 1);
                                      /* incl. signaling SBAL */
@@ -199,7 +199,7 @@ int zfcp_qdio_sbals_from_sg(struct zfcp_qdio *qdio, struct zfcp_qdio_req *q_req,
                                              q_req->sbal_number);
                         return -EINVAL;
                 }
-               sbale->addr = sg_virt(sg);
+               sbale->addr = sg_phys(sg);
                 sbale->length = sg->length;
         }
         return 0;
@@ -418,7 +418,7 @@ int zfcp_qdio_open(struct zfcp_qdio *qdio)
                 sbale->length = 0;
                 sbale->eflags = SBAL_EFLAGS_LAST_ENTRY;
                 sbale->sflags = 0;
-               sbale->addr = NULL;
+               sbale->addr = 0;
         }
  
         if (do_QDIO(cdev, QDIO_FLAG_SYNC_INPUT, 0, 0, QDIO_MAX_BUFFERS_PER_Q))
diff --git a/drivers/s390/scsi/zfcp_qdio.h b/drivers/s390/scsi/zfcp_qdio.h
index 2a816a37b3c018399157926ef8ffbcfe94d52a7a..6b43d6b254bef2e34757a4bafe92bc3d0a19129a 100644
--- a/drivers/s390/scsi/zfcp_qdio.h
+++ b/drivers/s390/scsi/zfcp_qdio.h
@@ -122,14 +122,14 @@ void zfcp_qdio_req_init(struct zfcp_qdio *qdio, struct zfcp_qdio_req *q_req,
                                         % QDIO_MAX_BUFFERS_PER_Q;
  
         sbale = zfcp_qdio_sbale_req(qdio, q_req);
-       sbale->addr = (void *) req_id;
+       sbale->addr = req_id;
         sbale->eflags = 0;
         sbale->sflags = SBAL_SFLAGS0_COMMAND | sbtype;
  
         if (unlikely(!data))
                 return;
         sbale++;
-       sbale->addr = data;
+       sbale->addr = virt_to_phys(data);
         sbale->length = len;
  }
  
@@ -152,7 +152,7 @@ void zfcp_qdio_fill_next(struct zfcp_qdio *qdio, struct zfcp_qdio_req *q_req,
         BUG_ON(q_req->sbale_curr == qdio->max_sbale_per_sbal - 1);
         q_req->sbale_curr++;
         sbale = zfcp_qdio_sbale_curr(qdio, q_req);
-       sbale->addr = data;
+       sbale->addr = virt_to_phys(data);
         sbale->length = len;
  }
  
diff --git a/drivers/s390/scsi/zfcp_sysfs.c b/drivers/s390/scsi/zfcp_sysfs.c
index 494b9fe9cc944c55011f16ccf04bea12d92f40a9..a711a0d151002d7b16212d15130a1362b1ff7a13 100644
--- a/drivers/s390/scsi/zfcp_sysfs.c
+++ b/drivers/s390/scsi/zfcp_sysfs.c
@@ -800,7 +800,7 @@ static ZFCP_DEV_ATTR(adapter_diag, b2b_credit, 0400,
         static ZFCP_DEV_ATTR(adapter_diag_sfp, _name, 0400,                    \
                              zfcp_sysfs_adapter_diag_sfp_##_name##_show, NULL)
  
-ZFCP_DEFINE_DIAG_SFP_ATTR(temperature, temperature, 5, "%hu");
+ZFCP_DEFINE_DIAG_SFP_ATTR(temperature, temperature, 6, "%hd");
  ZFCP_DEFINE_DIAG_SFP_ATTR(vcc, vcc, 5, "%hu");
  ZFCP_DEFINE_DIAG_SFP_ATTR(tx_bias, tx_bias, 5, "%hu");
  ZFCP_DEFINE_DIAG_SFP_ATTR(tx_power, tx_power, 5, "%hu");
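
Note on the temperature change above: zfcp_fsf.h retypes the field from u16 to s16, and zfcp_sysfs.c switches the format from "%hu" to "%hd" while raising the third macro argument from 5 to 6. That argument appears to be the print width, which is an assumption here rather than something the hunk states. The user-space sketch below shows why a signed value needs one more character. The helper names and the raw values are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Format a raw 16-bit reading the old way: unsigned, "%hu". */
static int format_unsigned(char *buf, size_t len, uint16_t raw)
{
	return snprintf(buf, len, "%hu", raw);
}

/* Format the same bits the new way: signed, "%hd". */
static int format_signed(char *buf, size_t len, uint16_t raw)
{
	int16_t val;

	/* Reinterpret the bits without relying on an out-of-range cast. */
	memcpy(&val, &raw, sizeof(val));
	return snprintf(buf, len, "%hd", val);
}
```

For the bit pattern 0xFFF6 the unsigned form prints "65526" and the signed form prints "-10". The longest unsigned output, "65535", is 5 characters, while the longest signed output, "-32768", is 6.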
diff --git a/drivers/scsi/libfc/fc_disc.c b/drivers/scsi/libfc/fc_disc.c
index 9c5f7c9178c66351a5c74e0b1e8c12c359ae8776..2b865c6423e293b6e3b4a04cac83a5a4923d0e72 100644
--- a/drivers/scsi/libfc/fc_disc.c
+++ b/drivers/scsi/libfc/fc_disc.c
@@ -628,6 +628,8 @@ redisc:
         }
  out:
         kref_put(&rdata->kref, fc_rport_destroy);
+       if (!IS_ERR(fp))
+               fc_frame_free(fp);
  }
  
  /**
diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
index f3b36fd0a0ebd6f5d09690f12b20f4360d12ea47..b2ad96564484000ca4ca1c10bb9f984601a8ace9 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
@@ -623,7 +623,8 @@ retry_alloc:
  
         fusion->io_request_frames =
                         dma_pool_alloc(fusion->io_request_frames_pool,
-                               GFP_KERNEL, &fusion->io_request_frames_phys);
+                               GFP_KERNEL | __GFP_NOWARN,
+                               &fusion->io_request_frames_phys);
         if (!fusion->io_request_frames) {
                 if (instance->max_fw_cmds >= (MEGASAS_REDUCE_QD_COUNT * 2)) {
                         instance->max_fw_cmds -= MEGASAS_REDUCE_QD_COUNT;
@@ -661,7 +662,7 @@ retry_alloc:
  
                 fusion->io_request_frames =
                         dma_pool_alloc(fusion->io_request_frames_pool,
-                                      GFP_KERNEL,
+                                      GFP_KERNEL | __GFP_NOWARN,
                                        &fusion->io_request_frames_phys);
  
                 if (!fusion->io_request_frames) {
diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index e4282bce583475a046a679093d4ddf6cc706b105..f45c22b097269b555255731065d301f6a47b7315 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -161,6 +161,7 @@ int sd_zbc_report_zones(struct gendisk *disk, sector_t sector,
                         unsigned int nr_zones, report_zones_cb cb, void *data)
  {
         struct scsi_disk *sdkp = scsi_disk(disk);
+       sector_t capacity = logical_to_sectors(sdkp->device, sdkp->capacity);
         unsigned int nr, i;
         unsigned char *buf;
         size_t offset, buflen = 0;
@@ -171,11 +172,15 @@ int sd_zbc_report_zones(struct gendisk *disk, sector_t sector,
                 /* Not a zoned device */
                 return -EOPNOTSUPP;
  
+       if (!capacity)
+               /* Device gone or invalid */
+               return -ENODEV;
+
         buf = sd_zbc_alloc_report_buffer(sdkp, nr_zones, &buflen);
         if (!buf)
                 return -ENOMEM;
  
-       while (zone_idx < nr_zones && sector < get_capacity(disk)) {
+       while (zone_idx < nr_zones && sector < capacity) {
                 ret = sd_zbc_do_report_zones(sdkp, buf, buflen,
                                 sectors_to_logical(sdkp->device, sector), true);
                 if (ret)
diff --git a/drivers/scsi/sr.c b/drivers/scsi/sr.c
index 0fbb8fe6e521b9edf427435b4d6ef2a41008054b..e4240e4ae8bbdcae4f850778deae56968c937c05 100644
--- a/drivers/scsi/sr.c
+++ b/drivers/scsi/sr.c
@@ -688,7 +688,7 @@ static const struct block_device_operations sr_bdops =
         .release        = sr_block_release,
         .ioctl          = sr_block_ioctl,
  #ifdef CONFIG_COMPAT
-       .ioctl          = sr_block_compat_ioctl,
+       .compat_ioctl   = sr_block_compat_ioctl,
  #endif
         .check_events   = sr_block_check_events,
         .revalidate_disk = sr_block_revalidate_disk,
diff --git a/drivers/soc/tegra/fuse/fuse-tegra30.c b/drivers/soc/tegra/fuse/fuse-tegra30.c
index f68f4e1c215d123959e80a4e5db9c2bdfcd23939..e6037f900fb70ab31884be5f0cc48cd2c6e4aed3 100644
--- a/drivers/soc/tegra/fuse/fuse-tegra30.c
+++ b/drivers/soc/tegra/fuse/fuse-tegra30.c
@@ -36,7 +36,8 @@
      defined(CONFIG_ARCH_TEGRA_124_SOC) || \
      defined(CONFIG_ARCH_TEGRA_132_SOC) || \
      defined(CONFIG_ARCH_TEGRA_210_SOC) || \
-    defined(CONFIG_ARCH_TEGRA_186_SOC)
+    defined(CONFIG_ARCH_TEGRA_186_SOC) || \
+    defined(CONFIG_ARCH_TEGRA_194_SOC)
  static u32 tegra30_fuse_read_early(struct tegra_fuse *fuse, unsigned int offset)
  {
         if (WARN_ON(!fuse->base))
diff --git a/drivers/spmi/spmi-pmic-arb.c b/drivers/spmi/spmi-pmic-arb.c
index 97acc2ba2912fa33396803d9554bc38d9268ab70..de844b412110702c1e17af0501be843c6fa522ed 100644
--- a/drivers/spmi/spmi-pmic-arb.c
+++ b/drivers/spmi/spmi-pmic-arb.c
@@ -731,6 +731,7 @@ static int qpnpint_irq_domain_translate(struct irq_domain *d,
         return 0;
  }
  
+static struct lock_class_key qpnpint_irq_lock_class, qpnpint_irq_request_class;
  
  static void qpnpint_irq_domain_map(struct spmi_pmic_arb *pmic_arb,
                                    struct irq_domain *domain, unsigned int virq,
@@ -746,6 +747,9 @@ static void qpnpint_irq_domain_map(struct spmi_pmic_arb *pmic_arb,
         else
                 handler = handle_level_irq;
  
+
+       irq_set_lockdep_class(virq, &qpnpint_irq_lock_class,
+                             &qpnpint_irq_request_class);
         irq_domain_set_info(domain, virq, hwirq, &pmic_arb_irqchip, pmic_arb,
                             handler, NULL, NULL);
  }
diff --git a/drivers/staging/android/Kconfig b/drivers/staging/android/Kconfig
index d6d605d5cbde919b372b533910813051b7f91e35..8d8fd5c29349a836cd3444dac91c2ffa02e10dec 100644
--- a/drivers/staging/android/Kconfig
+++ b/drivers/staging/android/Kconfig
@@ -14,14 +14,6 @@ config ASHMEM
           It is, in theory, a good memory allocator for low-memory devices,
           because it can discard shared memory units when under memory pressure.
  
-config ANDROID_VSOC
-       tristate "Android Virtual SoC support"
-       depends on PCI_MSI
-       help
-         This option adds support for the Virtual SoC driver needed to boot
-         a 'cuttlefish' Android image inside QEmu. The driver interacts with
-         a QEmu ivshmem device. If built as a module, it will be called vsoc.
-
  source "drivers/staging/android/ion/Kconfig"
  
  endif # if ANDROID
diff --git a/drivers/staging/android/Makefile b/drivers/staging/android/Makefile
index 14bd9c6ce10d3868051306e64213bf57d9dffaac..3b66cd0b0ec56d3a9d1da56b280333d3a4cdb078 100644
--- a/drivers/staging/android/Makefile
+++ b/drivers/staging/android/Makefile
@@ -4,4 +4,3 @@ ccflags-y += -I$(src)                   # needed for trace events
  obj-y                                  += ion/
  
  obj-$(CONFIG_ASHMEM)                   += ashmem.o
-obj-$(CONFIG_ANDROID_VSOC)             += vsoc.o
diff --git a/drivers/staging/android/TODO b/drivers/staging/android/TODO
index 767dd98fd92d55f91398c9e5a824782034634650..80eccfaf6db53c40f8beadee75e5f778e7532d44 100644
--- a/drivers/staging/android/TODO
+++ b/drivers/staging/android/TODO
@@ -9,14 +9,5 @@ ion/
   - Split /dev/ion up into multiple nodes (e.g. /dev/ion/heap0)
   - Better test framework (integration with VGEM was suggested)
  
-vsoc.c, uapi/vsoc_shm.h
- - The current driver uses the same wait queue for all of the futexes in a
-   region. This will cause false wakeups in regions with a large number of
-   waiting threads. We should eventually use multiple queues and select the
-   queue based on the region.
- - Add debugfs support for examining the permissions of regions.
- - Remove VSOC_WAIT_FOR_INCOMING_INTERRUPT ioctl. This functionality has been
-   superseded by the futex and is there for legacy reasons.
-
  Please send patches to Greg Kroah-Hartman <greg@kroah.com> and Cc:
  Arve Hjønnevåg <arve@android.com> and Riley Andrews <riandrews@android.com>
diff --git a/drivers/staging/android/ashmem.c b/drivers/staging/android/ashmem.c
index 5891d0744a760b2fb8e690ac6fea2e91d96927a3..8044510d8ec673667a7ba9c90426e961a28e89c3 100644
--- a/drivers/staging/android/ashmem.c
+++ b/drivers/staging/android/ashmem.c
@@ -351,8 +351,23 @@ static inline vm_flags_t calc_vm_may_flags(unsigned long prot)
                _calc_vm_trans(prot, PROT_EXEC,  VM_MAYEXEC);
  }
  
+static int ashmem_vmfile_mmap(struct file *file, struct vm_area_struct *vma)
+{
+       /* do not allow to mmap ashmem backing shmem file directly */
+       return -EPERM;
+}
+
+static unsigned long
+ashmem_vmfile_get_unmapped_area(struct file *file, unsigned long addr,
+                               unsigned long len, unsigned long pgoff,
+                               unsigned long flags)
+{
+       return current->mm->get_unmapped_area(file, addr, len, pgoff, flags);
+}
+
  static int ashmem_mmap(struct file *file, struct vm_area_struct *vma)
  {
+       static struct file_operations vmfile_fops;
         struct ashmem_area *asma = file->private_data;
         int ret = 0;
  
@@ -393,6 +408,19 @@ static int ashmem_mmap(struct file *file, struct vm_area_struct *vma)
                 }
                 vmfile->f_mode |= FMODE_LSEEK;
                 asma->file = vmfile;
+               /*
+                * override mmap operation of the vmfile so that it can't be
+                * remapped which would lead to creation of a new vma with no
+                * asma permission checks. Have to override get_unmapped_area
+                * as well to prevent VM_BUG_ON check for f_ops modification.
+                */
+               if (!vmfile_fops.mmap) {
+                       vmfile_fops = *vmfile->f_op;
+                       vmfile_fops.mmap = ashmem_vmfile_mmap;
+                       vmfile_fops.get_unmapped_area =
+                                       ashmem_vmfile_get_unmapped_area;
+               }
+               vmfile->f_op = &vmfile_fops;
         }
         get_file(asma->file);
  
diff --git a/drivers/staging/android/uapi/vsoc_shm.h b/drivers/staging/android/uapi/vsoc_shm.h
deleted file mode 100644
index 6291fb2..0000000
--- a/drivers/staging/android/uapi/vsoc_shm.h
+++ /dev/null
@@ -1,295 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-/*
- * Copyright (C) 2017 Google, Inc.
- *
- */
-
-#ifndef _UAPI_LINUX_VSOC_SHM_H
-#define _UAPI_LINUX_VSOC_SHM_H
-
-#include <linux/types.h>
-
-/**
- * A permission is a token that permits a receiver to read and/or write an area
- * of memory within a Vsoc region.
- *
- * An fd_scoped permission grants both read and write access, and can be
- * attached to a file description (see open(2)).
- * Ownership of the area can then be shared by passing a file descriptor
- * among processes.
- *
- * begin_offset and end_offset define the area of memory that is controlled by
- * the permission. owner_offset points to a word, also in shared memory, that
- * controls ownership of the area.
- *
- * ownership of the region expires when the associated file description is
- * released.
- *
- * At most one permission can be attached to each file description.
- *
- * This is useful when implementing HALs like gralloc that scope and pass
- * ownership of shared resources via file descriptors.
- *
- * The caller is responsibe for doing any fencing.
- *
- * The calling process will normally identify a currently free area of
- * memory. It will construct a proposed fd_scoped_permission_arg structure:
- *
- *   begin_offset and end_offset describe the area being claimed
- *
- *   owner_offset points to the location in shared memory that indicates the
- *   owner of the area.
- *
- *   owned_value is the value that will be stored in owner_offset iff the
- *   permission can be granted. It must be different than VSOC_REGION_FREE.
- *
- * Two fd_scoped_permission structures are compatible if they vary only by
- * their owned_value fields.
- *
- * The driver ensures that, for any group of simultaneous callers proposing
- * compatible fd_scoped_permissions, it will accept exactly one of the
- * propopsals. The other callers will get a failure with errno of EAGAIN.
- *
- * A process receiving a file descriptor can identify the region being
- * granted using the VSOC_GET_FD_SCOPED_PERMISSION ioctl.
- */
-struct fd_scoped_permission {
-       __u32 begin_offset;
-       __u32 end_offset;
-       __u32 owner_offset;
-       __u32 owned_value;
-};
-
-/*
- * This value represents a free area of memory. The driver expects to see this
- * value at owner_offset when creating a permission otherwise it will not do it,
- * and will write this value back once the permission is no longer needed.
- */
-#define VSOC_REGION_FREE ((__u32)0)
-
-/**
- * ioctl argument for VSOC_CREATE_FD_SCOPE_PERMISSION
- */
-struct fd_scoped_permission_arg {
-       struct fd_scoped_permission perm;
-       __s32 managed_region_fd;
-};
-
-#define VSOC_NODE_FREE ((__u32)0)
-
-/*
- * Describes a signal table in shared memory. Each non-zero entry in the
- * table indicates that the receiver should signal the futex at the given
- * offset. Offsets are relative to the region, not the shared memory window.
- *
- * interrupt_signalled_offset is used to reliably signal interrupts across the
- * vmm boundary. There are two roles: transmitter and receiver. For example,
- * in the host_to_guest_signal_table the host is the transmitter and the
- * guest is the receiver. The protocol is as follows:
- *
- * 1. The transmitter should convert the offset of the futex to an offset
- *    in the signal table [0, (1 << num_nodes_lg2))
- *    The transmitter can choose any appropriate hashing algorithm, including
- *    hash = futex_offset & ((1 << num_nodes_lg2) - 1)
- *
- * 3. The transmitter should atomically compare and swap futex_offset with 0
- *    at hash. There are 3 possible outcomes
- *      a. The swap fails because the futex_offset is already in the table.
- *         The transmitter should stop.
- *      b. Some other offset is in the table. This is a hash collision. The
- *         transmitter should move to another table slot and try again. One
- *         possible algorithm:
- *         hash = (hash + 1) & ((1 << num_nodes_lg2) - 1)
- *      c. The swap worked. Continue below.
- *
- * 3. The transmitter atomically swaps 1 with the value at the
- *    interrupt_signalled_offset. There are two outcomes:
- *      a. The prior value was 1. In this case an interrupt has already been
- *         posted. The transmitter is done.
- *      b. The prior value was 0, indicating that the receiver may be sleeping.
- *         The transmitter will issue an interrupt.
- *
- * 4. On waking the receiver immediately exchanges a 0 with the
- *    interrupt_signalled_offset. If it receives a 0 then this a spurious
- *    interrupt. That may occasionally happen in the current protocol, but
- *    should be rare.
- *
- * 5. The receiver scans the signal table by atomicaly exchanging 0 at each
- *    location. If a non-zero offset is returned from the exchange the
- *    receiver wakes all sleepers at the given offset:
- *      futex((int*)(region_base + old_value), FUTEX_WAKE, MAX_INT);
- *
- * 6. The receiver thread then does a conditional wait, waking immediately
- *    if the value at interrupt_signalled_offset is non-zero. This catches cases
- *    here additional  signals were posted while the table was being scanned.
- *    On the guest the wait is handled via the VSOC_WAIT_FOR_INCOMING_INTERRUPT
- *    ioctl.
- */
-struct vsoc_signal_table_layout {
-       /* log_2(Number of signal table entries) */
-       __u32 num_nodes_lg2;
-       /*
-        * Offset to the first signal table entry relative to the start of the
-        * region
-        */
-       __u32 futex_uaddr_table_offset;
-       /*
-        * Offset to an atomic_t / atomic uint32_t. A non-zero value indicates
-        * that one or more offsets are currently posted in the table.
-        * semi-unique access to an entry in the table
-        */
-       __u32 interrupt_signalled_offset;
-};
-
-#define VSOC_REGION_WHOLE ((__s32)0)
-#define VSOC_DEVICE_NAME_SZ 16
-
-/**
- * Each HAL would (usually) talk to a single device region
- * Mulitple entities care about these regions:
- * - The ivshmem_server will populate the regions in shared memory
- * - The guest kernel will read the region, create minor device nodes, and
- *   allow interested parties to register for FUTEX_WAKE events in the region
- * - HALs will access via the minor device nodes published by the guest kernel
- * - Host side processes will access the region via the ivshmem_server:
- *   1. Pass name to ivshmem_server at a UNIX socket
- *   2. ivshmemserver will reply with 2 fds:
- *     - host->guest doorbell fd
- *     - guest->host doorbell fd
- *     - fd for the shared memory region
- *     - region offset
- *   3. Start a futex receiver thread on the doorbell fd pointed at the
- *      signal_nodes
- */
-struct vsoc_device_region {
-       __u16 current_version;
-       __u16 min_compatible_version;
-       __u32 region_begin_offset;
-       __u32 region_end_offset;
-       __u32 offset_of_region_data;
-       struct vsoc_signal_table_layout guest_to_host_signal_table;
-       struct vsoc_signal_table_layout host_to_guest_signal_table;
-       /* Name of the device. Must always be terminated with a '\0', so
-        * the longest supported device name is 15 characters.
-        */
-       char device_name[VSOC_DEVICE_NAME_SZ];
-       /* There are two ways that permissions to access regions are handled:
-        *   - When subdivided_by is VSOC_REGION_WHOLE, any process that can
-        *     open the device node for the region gains complete access to it.
-        *   - When subdivided is set processes that open the region cannot
-        *     access it. Access to a sub-region must be established by invoking
-        *     the VSOC_CREATE_FD_SCOPE_PERMISSION ioctl on the region
-        *     referenced in subdivided_by, providing a fileinstance
-        *     (represented by a fd) opened on this region.
-        */
-       __u32 managed_by;
-};
-
-/*
- * The vsoc layout descriptor.
- * The first 4K should be reserved for the shm header and region descriptors.
- * The regions should be page aligned.
- */
-
-struct vsoc_shm_layout_descriptor {
-       __u16 major_version;
-       __u16 minor_version;
-
-       /* size of the shm. This may be redundant but nice to have */
-       __u32 size;
-
-       /* number of shared memory regions */
-       __u32 region_count;
-
-       /* The offset to the start of region descriptors */
-       __u32 vsoc_region_desc_offset;
-};
-
-/*
- * This specifies the current version that should be stored in
- * vsoc_shm_layout_descriptor.major_version and
- * vsoc_shm_layout_descriptor.minor_version.
- * It should be updated only if the vsoc_device_region and
- * vsoc_shm_layout_descriptor structures have changed.
- * Versioning within each region is transferred
- * via the min_compatible_version and current_version fields in
- * vsoc_device_region. The driver does not consult these fields: they are left
- * for the HALs and host processes and will change independently of the layout
- * version.
- */
-#define CURRENT_VSOC_LAYOUT_MAJOR_VERSION 2
-#define CURRENT_VSOC_LAYOUT_MINOR_VERSION 0
-
-#define VSOC_CREATE_FD_SCOPED_PERMISSION \
-       _IOW(0xF5, 0, struct fd_scoped_permission)
-#define VSOC_GET_FD_SCOPED_PERMISSION _IOR(0xF5, 1, struct fd_scoped_permission)
-
-/*
- * This is used to signal the host to scan the guest_to_host_signal_table
- * for new futexes to wake. This sends an interrupt if one is not already
- * in flight.
- */
-#define VSOC_MAYBE_SEND_INTERRUPT_TO_HOST _IO(0xF5, 2)
-
-/*
- * When this returns the guest will scan host_to_guest_signal_table to
- * check for new futexes to wake.
- */
-/* TODO(ghartman): Consider moving this to the bottom half */
-#define VSOC_WAIT_FOR_INCOMING_INTERRUPT _IO(0xF5, 3)
-
-/*
- * Guest HALs will use this to retrieve the region description after
- * opening their device node.
- */
-#define VSOC_DESCRIBE_REGION _IOR(0xF5, 4, struct vsoc_device_region)
-
-/*
- * Wake any threads that may be waiting for a host interrupt on this region.
- * This is mostly used during shutdown.
- */
-#define VSOC_SELF_INTERRUPT _IO(0xF5, 5)
-
-/*
- * This is used to signal the host to scan the guest_to_host_signal_table
- * for new futexes to wake. This sends an interrupt unconditionally.
- */
-#define VSOC_SEND_INTERRUPT_TO_HOST _IO(0xF5, 6)
-
-enum wait_types {
-       VSOC_WAIT_UNDEFINED = 0,
-       VSOC_WAIT_IF_EQUAL = 1,
-       VSOC_WAIT_IF_EQUAL_TIMEOUT = 2
-};
-
-/*
- * Wait for a condition to be true
- *
- * Note, this is sized and aligned so the 32 bit and 64 bit layouts are
- * identical.
- */
-struct vsoc_cond_wait {
-       /* Input: Offset of the 32 bit word to check */
-       __u32 offset;
-       /* Input: Value that will be compared with the offset */
-       __u32 value;
-       /* Monotonic time to wake at in seconds */
-       __u64 wake_time_sec;
-       /* Input: Monotonic time to wait in nanoseconds */
-       __u32 wake_time_nsec;
-       /* Input: Type of wait */
-       __u32 wait_type;
-       /* Output: Number of times the thread woke before returning. */
-       __u32 wakes;
-       /* Ensure that we're 8-byte aligned and 8 byte length for 32/64 bit
-        * compatibility.
-        */
-       __u32 reserved_1;
-};
-
-#define VSOC_COND_WAIT _IOWR(0xF5, 7, struct vsoc_cond_wait)
-
-/* Wake any local threads waiting at the offset given in arg */
-#define VSOC_COND_WAKE _IO(0xF5, 8)
-
-#endif /* _UAPI_LINUX_VSOC_SHM_H */
diff --git a/drivers/staging/android/vsoc.c b/drivers/staging/android/vsoc.c
deleted file mode 100644
index 1240bb0..0000000
--- a/drivers/staging/android/vsoc.c
+++ /dev/null
@@ -1,1149 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * drivers/staging/android/vsoc.c
- *
- * Android Virtual System on a Chip (VSoC) driver
- *
- * Copyright (C) 2017 Google, Inc.
- *
- * Author: ghartman@google.com
- *
- * Based on drivers/char/kvm_ivshmem.c - driver for KVM Inter-VM shared memory
- *         Copyright 2009 Cam Macdonell <cam@cs.ualberta.ca>
- *
- * Based on cirrusfb.c and 8139cp.c:
- *   Copyright 1999-2001 Jeff Garzik
- *   Copyright 2001-2004 Jeff Garzik
- */
-
-#include <linux/dma-mapping.h>
-#include <linux/freezer.h>
-#include <linux/futex.h>
-#include <linux/init.h>
-#include <linux/kernel.h>
-#include <linux/module.h>
-#include <linux/mutex.h>
-#include <linux/pci.h>
-#include <linux/proc_fs.h>
-#include <linux/sched.h>
-#include <linux/syscalls.h>
-#include <linux/uaccess.h>
-#include <linux/interrupt.h>
-#include <linux/cdev.h>
-#include <linux/file.h>
-#include "uapi/vsoc_shm.h"
-
-#define VSOC_DEV_NAME "vsoc"
-
-/*
- * Description of the ivshmem-doorbell PCI device used by QEMU. These
- * constants follow docs/specs/ivshmem-spec.txt, which can be found in
- * the QEMU repository. This was last reconciled with the version that
- * came out with QEMU 2.8.
- */
-
-/*
- * These constants are the KVM Inter-VM shared memory device
- * register offsets.
- */
-enum {
-       INTR_MASK = 0x00,       /* Interrupt Mask */
-       INTR_STATUS = 0x04,     /* Interrupt Status */
-       IV_POSITION = 0x08,     /* VM ID */
-       DOORBELL = 0x0c,        /* Doorbell */
-};
-
-static const int REGISTER_BAR;  /* Equal to 0 */
-static const int MAX_REGISTER_BAR_LEN = 0x100;
-/*
- * The MSI-x BAR is not used directly.
- *
- * static const int MSI_X_BAR = 1;
- */
-static const int SHARED_MEMORY_BAR = 2;
-
-struct vsoc_region_data {
-       char name[VSOC_DEVICE_NAME_SZ + 1];
-       wait_queue_head_t interrupt_wait_queue;
-       /* TODO(b/73664181): Use multiple futex wait queues */
-       wait_queue_head_t futex_wait_queue;
-       /* Flag indicating that an interrupt has been signalled by the host. */
-       atomic_t *incoming_signalled;
-       /* Flag indicating the guest has signalled the host. */
-       atomic_t *outgoing_signalled;
-       bool irq_requested;
-       bool device_created;
-};
-
-struct vsoc_device {
-       /* Kernel virtual address of REGISTER_BAR. */
-       void __iomem *regs;
-       /* Physical address of SHARED_MEMORY_BAR. */
-       phys_addr_t shm_phys_start;
-       /* Kernel virtual address of SHARED_MEMORY_BAR. */
-       void __iomem *kernel_mapped_shm;
-       /* Size of the entire shared memory window in bytes. */
-       size_t shm_size;
-       /*
-        * Pointer to the virtual address of the shared memory layout structure.
-        * This is probably identical to kernel_mapped_shm, but saving this
-        * here saves a lot of annoying casts.
-        */
-       struct vsoc_shm_layout_descriptor *layout;
-       /*
-        * Points to a table of region descriptors in the kernel's virtual
-        * address space. Calculated from
-        * vsoc_shm_layout_descriptor.vsoc_region_desc_offset
-        */
-       struct vsoc_device_region *regions;
-       /* Head of a list of permissions that have been granted. */
-       struct list_head permissions;
-       struct pci_dev *dev;
-       /* Per-region (and therefore per-interrupt) information. */
-       struct vsoc_region_data *regions_data;
-       /*
-        * Table of msi-x entries. This has to be separated from struct
-        * vsoc_region_data because the kernel deals with them as an array.
-        */
-       struct msix_entry *msix_entries;
-       /* Mutex that protects the permission list */
-       struct mutex mtx;
-       /* Major number assigned by the kernel */
-       int major;
-       /* Character device assigned by the kernel */
-       struct cdev cdev;
-       /* Device class assigned by the kernel */
-       struct class *class;
-       /*
-        * Flags that indicate what we've initialized. These are used to do an
-        * orderly cleanup of the device.
-        */
-       bool enabled_device;
-       bool requested_regions;
-       bool cdev_added;
-       bool class_added;
-       bool msix_enabled;
-};
-
-static struct vsoc_device vsoc_dev;
-
-/*
- * TODO(ghartman): Add a /sys filesystem entry that summarizes the permissions.
- */
-
-struct fd_scoped_permission_node {
-       struct fd_scoped_permission permission;
-       struct list_head list;
-};
-
-struct vsoc_private_data {
-       struct fd_scoped_permission_node *fd_scoped_permission_node;
-};
-
-static long vsoc_ioctl(struct file *, unsigned int, unsigned long);
-static int vsoc_mmap(struct file *, struct vm_area_struct *);
-static int vsoc_open(struct inode *, struct file *);
-static int vsoc_release(struct inode *, struct file *);
-static ssize_t vsoc_read(struct file *, char __user *, size_t, loff_t *);
-static ssize_t vsoc_write(struct file *, const char __user *, size_t, loff_t *);
-static loff_t vsoc_lseek(struct file *filp, loff_t offset, int origin);
-static int
-do_create_fd_scoped_permission(struct vsoc_device_region *region_p,
-                              struct fd_scoped_permission_node *np,
-                              struct fd_scoped_permission_arg __user *arg);
-static void
-do_destroy_fd_scoped_permission(struct vsoc_device_region *owner_region_p,
-                               struct fd_scoped_permission *perm);
-static long do_vsoc_describe_region(struct file *,
-                                   struct vsoc_device_region __user *);
-static ssize_t vsoc_get_area(struct file *filp, __u32 *perm_off);
-
-/**
- * Validate arguments on entry points to the driver.
- */
-inline int vsoc_validate_inode(struct inode *inode)
-{
-       if (iminor(inode) >= vsoc_dev.layout->region_count) {
-               dev_err(&vsoc_dev.dev->dev,
-                       "describe_region: invalid region %d\n", iminor(inode));
-               return -ENODEV;
-       }
-       return 0;
-}
-
-inline int vsoc_validate_filep(struct file *filp)
-{
-       int ret = vsoc_validate_inode(file_inode(filp));
-
-       if (ret)
-               return ret;
-       if (!filp->private_data) {
-               dev_err(&vsoc_dev.dev->dev,
-                       "No private data on fd, region %d\n",
-                       iminor(file_inode(filp)));
-               return -EBADFD;
-       }
-       return 0;
-}
-
-/* Converts from shared memory offset to virtual address */
-static inline void *shm_off_to_virtual_addr(__u32 offset)
-{
-       return (void __force *)vsoc_dev.kernel_mapped_shm + offset;
-}
-
-/* Converts from shared memory offset to physical address */
-static inline phys_addr_t shm_off_to_phys_addr(__u32 offset)
-{
-       return vsoc_dev.shm_phys_start + offset;
-}
-
-/**
- * Convenience functions to obtain the region from the inode or file.
- * Dangerous to call before validating the inode/file.
- */
-static
-inline struct vsoc_device_region *vsoc_region_from_inode(struct inode *inode)
-{
-       return &vsoc_dev.regions[iminor(inode)];
-}
-
-static
-inline struct vsoc_device_region *vsoc_region_from_filep(struct file *inode)
-{
-       return vsoc_region_from_inode(file_inode(inode));
-}
-
-static inline uint32_t vsoc_device_region_size(struct vsoc_device_region *r)
-{
-       return r->region_end_offset - r->region_begin_offset;
-}
-
-static const struct file_operations vsoc_ops = {
-       .owner = THIS_MODULE,
-       .open = vsoc_open,
-       .mmap = vsoc_mmap,
-       .read = vsoc_read,
-       .unlocked_ioctl = vsoc_ioctl,
-       .compat_ioctl = vsoc_ioctl,
-       .write = vsoc_write,
-       .llseek = vsoc_lseek,
-       .release = vsoc_release,
-};
-
-static struct pci_device_id vsoc_id_table[] = {
-       {0x1af4, 0x1110, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0},
-       {0},
-};
-
-MODULE_DEVICE_TABLE(pci, vsoc_id_table);
-
-static void vsoc_remove_device(struct pci_dev *pdev);
-static int vsoc_probe_device(struct pci_dev *pdev,
-                            const struct pci_device_id *ent);
-
-static struct pci_driver vsoc_pci_driver = {
-       .name = "vsoc",
-       .id_table = vsoc_id_table,
-       .probe = vsoc_probe_device,
-       .remove = vsoc_remove_device,
-};
-
-static int
-do_create_fd_scoped_permission(struct vsoc_device_region *region_p,
-                              struct fd_scoped_permission_node *np,
-                              struct fd_scoped_permission_arg __user *arg)
-{
-       struct file *managed_filp;
-       s32 managed_fd;
-       atomic_t *owner_ptr = NULL;
-       struct vsoc_device_region *managed_region_p;
-
-       if (copy_from_user(&np->permission,
-                          &arg->perm, sizeof(np->permission)) ||
-           copy_from_user(&managed_fd,
-                          &arg->managed_region_fd, sizeof(managed_fd))) {
-               return -EFAULT;
-       }
-       managed_filp = fdget(managed_fd).file;
-       /* Check that it's a valid fd. */
-       if (!managed_filp || vsoc_validate_filep(managed_filp))
-               return -EPERM;
-       /* EEXIST if the given fd already has a permission. */
-       if (((struct vsoc_private_data *)managed_filp->private_data)->
-           fd_scoped_permission_node)
-               return -EEXIST;
-       managed_region_p = vsoc_region_from_filep(managed_filp);
-       /* Check that the provided region is managed by this one */
-       if (&vsoc_dev.regions[managed_region_p->managed_by] != region_p)
-               return -EPERM;
-       /* The area must be well formed and have non-zero size */
-       if (np->permission.begin_offset >= np->permission.end_offset)
-               return -EINVAL;
-       /* The area must fit in the memory window */
-       if (np->permission.end_offset >
-           vsoc_device_region_size(managed_region_p))
-               return -ERANGE;
-       /* The area must be in the region data section */
-       if (np->permission.begin_offset <
-           managed_region_p->offset_of_region_data)
-               return -ERANGE;
-       /* The area must be page aligned */
-       if (!PAGE_ALIGNED(np->permission.begin_offset) ||
-           !PAGE_ALIGNED(np->permission.end_offset))
-               return -EINVAL;
-       /* Owner offset must be naturally aligned in the window */
-       if (np->permission.owner_offset &
-           (sizeof(np->permission.owner_offset) - 1))
-               return -EINVAL;
-       /* The owner flag must reside in the owner memory */
-       if (np->permission.owner_offset + sizeof(np->permission.owner_offset) >
-           vsoc_device_region_size(region_p))
-               return -ERANGE;
-       /* The owner flag must reside in the data section */
-       if (np->permission.owner_offset < region_p->offset_of_region_data)
-               return -EINVAL;
-       /* The owner value must change to claim the memory */
-       if (np->permission.owned_value == VSOC_REGION_FREE)
-               return -EINVAL;
-       owner_ptr =
-           (atomic_t *)shm_off_to_virtual_addr(region_p->region_begin_offset +
-                                               np->permission.owner_offset);
-       /* We've already verified that this is in the shared memory window, so
-        * it should be safe to write to this address.
-        */
-       if (atomic_cmpxchg(owner_ptr,
-                          VSOC_REGION_FREE,
-                          np->permission.owned_value) != VSOC_REGION_FREE) {
-               return -EBUSY;
-       }
-       ((struct vsoc_private_data *)managed_filp->private_data)->
-           fd_scoped_permission_node = np;
-       /* The file offset needs to be adjusted if the calling
-        * process did any read/write operations on the fd
-        * before creating the permission.
-        */
-       if (managed_filp->f_pos) {
-               if (managed_filp->f_pos > np->permission.end_offset) {
-                       /* If the offset is beyond the permission end, set it
-                        * to the end.
-                        */
-                       managed_filp->f_pos = np->permission.end_offset;
-               } else {
-                       /* If the offset is within the permission interval,
-                        * make it relative to the start of the interval;
-                        * otherwise reset it to zero.
-                        */
-                       if (managed_filp->f_pos < np->permission.begin_offset) {
-                               managed_filp->f_pos = 0;
-                       } else {
-                               managed_filp->f_pos -=
-                                   np->permission.begin_offset;
-                       }
-               }
-       }
-       return 0;
-}
-
-static void
-do_destroy_fd_scoped_permission_node(struct vsoc_device_region *owner_region_p,
-                                    struct fd_scoped_permission_node *node)
-{
-       if (node) {
-               do_destroy_fd_scoped_permission(owner_region_p,
-                                               &node->permission);
-               mutex_lock(&vsoc_dev.mtx);
-               list_del(&node->list);
-               mutex_unlock(&vsoc_dev.mtx);
-               kfree(node);
-       }
-}
-
-static void
-do_destroy_fd_scoped_permission(struct vsoc_device_region *owner_region_p,
-                               struct fd_scoped_permission *perm)
-{
-       atomic_t *owner_ptr = NULL;
-       int prev = 0;
-
-       if (!perm)
-               return;
-       owner_ptr = (atomic_t *)shm_off_to_virtual_addr
-               (owner_region_p->region_begin_offset + perm->owner_offset);
-       prev = atomic_xchg(owner_ptr, VSOC_REGION_FREE);
-       if (prev != perm->owned_value)
-               dev_err(&vsoc_dev.dev->dev,
-                       "%x-%x: owner (%s) %x: expected to be %x was %x",
-                       perm->begin_offset, perm->end_offset,
-                       owner_region_p->device_name, perm->owner_offset,
-                       perm->owned_value, prev);
-}
-
-static long do_vsoc_describe_region(struct file *filp,
-                                   struct vsoc_device_region __user *dest)
-{
-       struct vsoc_device_region *region_p;
-       int retval = vsoc_validate_filep(filp);
-
-       if (retval)
-               return retval;
-       region_p = vsoc_region_from_filep(filp);
-       if (copy_to_user(dest, region_p, sizeof(*region_p)))
-               return -EFAULT;
-       return 0;
-}
-
-/**
- * Implements the inner logic of cond_wait. Copies to and from userspace are
- * done in the helper function below.
- */
-static int handle_vsoc_cond_wait(struct file *filp, struct vsoc_cond_wait *arg)
-{
-       DEFINE_WAIT(wait);
-       u32 region_number = iminor(file_inode(filp));
-       struct vsoc_region_data *data = vsoc_dev.regions_data + region_number;
-       struct hrtimer_sleeper timeout, *to = NULL;
-       int ret = 0;
-       struct vsoc_device_region *region_p = vsoc_region_from_filep(filp);
-       atomic_t *address = NULL;
-       ktime_t wake_time;
-
-       /* Ensure that the offset is aligned */
-       if (arg->offset & (sizeof(uint32_t) - 1))
-               return -EADDRNOTAVAIL;
-       /* Ensure that the offset is within shared memory */
-       if (((uint64_t)arg->offset) + region_p->region_begin_offset +
-           sizeof(uint32_t) > region_p->region_end_offset)
-               return -E2BIG;
-       address = shm_off_to_virtual_addr(region_p->region_begin_offset +
-                                         arg->offset);
-
-       /* Ensure that the type of wait is valid */
-       switch (arg->wait_type) {
-       case VSOC_WAIT_IF_EQUAL:
-               break;
-       case VSOC_WAIT_IF_EQUAL_TIMEOUT:
-               to = &timeout;
-               break;
-       default:
-               return -EINVAL;
-       }
-
-       if (to) {
-               /* Copy the user-supplied timespec into the kernel structure.
-                * We do things this way to flatten differences between 32 bit
-                * and 64 bit timespecs.
-                */
-               if (arg->wake_time_nsec >= NSEC_PER_SEC)
-                       return -EINVAL;
-               wake_time = ktime_set(arg->wake_time_sec, arg->wake_time_nsec);
-
-               hrtimer_init_sleeper_on_stack(to, CLOCK_MONOTONIC,
-                                             HRTIMER_MODE_ABS);
-               hrtimer_set_expires_range_ns(&to->timer, wake_time,
-                                            current->timer_slack_ns);
-       }
-
-       while (1) {
-               prepare_to_wait(&data->futex_wait_queue, &wait,
-                               TASK_INTERRUPTIBLE);
-               /*
-                * Check the sentinel value after prepare_to_wait. If the value
-                * changes after this check the writer will call signal,
-                * changing the task state from INTERRUPTIBLE to RUNNING. That
-                * will ensure that schedule() will eventually schedule this
-                * task.
-                */
-               if (atomic_read(address) != arg->value) {
-                       ret = 0;
-                       break;
-               }
-               if (to) {
-                       hrtimer_sleeper_start_expires(to, HRTIMER_MODE_ABS);
-                       if (likely(to->task))
-                               freezable_schedule();
-                       hrtimer_cancel(&to->timer);
-                       if (!to->task) {
-                               ret = -ETIMEDOUT;
-                               break;
-                       }
-               } else {
-                       freezable_schedule();
-               }
-               /* Count the number of times that we woke up. This is useful
-                * for unit testing.
-                */
-               ++arg->wakes;
-               if (signal_pending(current)) {
-                       ret = -EINTR;
-                       break;
-               }
-       }
-       finish_wait(&data->futex_wait_queue, &wait);
-       if (to)
-               destroy_hrtimer_on_stack(&to->timer);
-       return ret;
-}
-
-/**
- * Handles the details of copying from/to userspace to ensure that the copies
- * happen on all of the return paths of cond_wait.
- */
-static int do_vsoc_cond_wait(struct file *filp,
-                            struct vsoc_cond_wait __user *untrusted_in)
-{
-       struct vsoc_cond_wait arg;
-       int rval = 0;
-
-       if (copy_from_user(&arg, untrusted_in, sizeof(arg)))
-               return -EFAULT;
-       /* wakes is an out parameter. Initialize it to something sensible. */
-       arg.wakes = 0;
-       rval = handle_vsoc_cond_wait(filp, &arg);
-       if (copy_to_user(untrusted_in, &arg, sizeof(arg)))
-               return -EFAULT;
-       return rval;
-}
-
-static int do_vsoc_cond_wake(struct file *filp, uint32_t offset)
-{
-       struct vsoc_device_region *region_p = vsoc_region_from_filep(filp);
-       u32 region_number = iminor(file_inode(filp));
-       struct vsoc_region_data *data = vsoc_dev.regions_data + region_number;
-       /* Ensure that the offset is aligned */
-       if (offset & (sizeof(uint32_t) - 1))
-               return -EADDRNOTAVAIL;
-       /* Ensure that the offset is within shared memory */
-       if (((uint64_t)offset) + region_p->region_begin_offset +
-           sizeof(uint32_t) > region_p->region_end_offset)
-               return -E2BIG;
-       /*
-        * TODO(b/73664181): Use multiple futex wait queues.
-        * We need to wake every sleeper when the condition changes. Typically
-        * only a single thread will be waiting on the condition, but there
-        * are exceptions. The worst case is about 10 threads.
-        */
-       wake_up_interruptible_all(&data->futex_wait_queue);
-       return 0;
-}
-
-static long vsoc_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
-{
-       int rv = 0;
-       struct vsoc_device_region *region_p;
-       u32 reg_num;
-       struct vsoc_region_data *reg_data;
-       int retval = vsoc_validate_filep(filp);
-
-       if (retval)
-               return retval;
-       region_p = vsoc_region_from_filep(filp);
-       reg_num = iminor(file_inode(filp));
-       reg_data = vsoc_dev.regions_data + reg_num;
-       switch (cmd) {
-       case VSOC_CREATE_FD_SCOPED_PERMISSION:
-               {
-                       struct fd_scoped_permission_node *node = NULL;
-
-                       node = kzalloc(sizeof(*node), GFP_KERNEL);
-                       /* We can't allocate memory for the permission */
-                       if (!node)
-                               return -ENOMEM;
-                       INIT_LIST_HEAD(&node->list);
-                       rv = do_create_fd_scoped_permission
-                               (region_p,
-                                node,
-                                (struct fd_scoped_permission_arg __user *)arg);
-                       if (!rv) {
-                               mutex_lock(&vsoc_dev.mtx);
-                               list_add(&node->list, &vsoc_dev.permissions);
-                               mutex_unlock(&vsoc_dev.mtx);
-                       } else {
-                               kfree(node);
-                               return rv;
-                       }
-               }
-               break;
-
-       case VSOC_GET_FD_SCOPED_PERMISSION:
-               {
-                       struct fd_scoped_permission_node *node =
-                           ((struct vsoc_private_data *)filp->private_data)->
-                           fd_scoped_permission_node;
-                       if (!node)
-                               return -ENOENT;
-                       if (copy_to_user
-                           ((struct fd_scoped_permission __user *)arg,
-                            &node->permission, sizeof(node->permission)))
-                               return -EFAULT;
-               }
-               break;
-
-       case VSOC_MAYBE_SEND_INTERRUPT_TO_HOST:
-               if (!atomic_xchg(reg_data->outgoing_signalled, 1)) {
-                       writel(reg_num, vsoc_dev.regs + DOORBELL);
-                       return 0;
-               } else {
-                       return -EBUSY;
-               }
-               break;
-
-       case VSOC_SEND_INTERRUPT_TO_HOST:
-               writel(reg_num, vsoc_dev.regs + DOORBELL);
-               return 0;
-       case VSOC_WAIT_FOR_INCOMING_INTERRUPT:
-               wait_event_interruptible
-                       (reg_data->interrupt_wait_queue,
-                        (atomic_read(reg_data->incoming_signalled) != 0));
-               break;
-
-       case VSOC_DESCRIBE_REGION:
-               return do_vsoc_describe_region
-                       (filp,
-                        (struct vsoc_device_region __user *)arg);
-
-       case VSOC_SELF_INTERRUPT:
-               atomic_set(reg_data->incoming_signalled, 1);
-               wake_up_interruptible(&reg_data->interrupt_wait_queue);
-               break;
-
-       case VSOC_COND_WAIT:
-               return do_vsoc_cond_wait(filp,
-                                        (struct vsoc_cond_wait __user *)arg);
-       case VSOC_COND_WAKE:
-               return do_vsoc_cond_wake(filp, arg);
-
-       default:
-               return -EINVAL;
-       }
-       return 0;
-}
-
-static ssize_t vsoc_read(struct file *filp, char __user *buffer, size_t len,
-                        loff_t *poffset)
-{
-       __u32 area_off;
-       const void *area_p;
-       ssize_t area_len;
-       int retval = vsoc_validate_filep(filp);
-
-       if (retval)
-               return retval;
-       area_len = vsoc_get_area(filp, &area_off);
-       area_p = shm_off_to_virtual_addr(area_off);
-       area_p += *poffset;
-       area_len -= *poffset;
-       if (area_len <= 0)
-               return 0;
-       if (area_len < len)
-               len = area_len;
-       if (copy_to_user(buffer, area_p, len))
-               return -EFAULT;
-       *poffset += len;
-       return len;
-}
-
-static loff_t vsoc_lseek(struct file *filp, loff_t offset, int origin)
-{
-       ssize_t area_len = 0;
-       int retval = vsoc_validate_filep(filp);
-
-       if (retval)
-               return retval;
-       area_len = vsoc_get_area(filp, NULL);
-       switch (origin) {
-       case SEEK_SET:
-               break;
-
-       case SEEK_CUR:
-               if (offset > 0 && offset + filp->f_pos < 0)
-                       return -EOVERFLOW;
-               offset += filp->f_pos;
-               break;
-
-       case SEEK_END:
-               if (offset > 0 && offset + area_len < 0)
-                       return -EOVERFLOW;
-               offset += area_len;
-               break;
-
-       case SEEK_DATA:
-               if (offset >= area_len)
-                       return -EINVAL;
-               if (offset < 0)
-                       offset = 0;
-               break;
-
-       case SEEK_HOLE:
-               /* Next hole is always the end of the region, unless offset is
-                * beyond that
-                */
-               if (offset < area_len)
-                       offset = area_len;
-               break;
-
-       default:
-               return -EINVAL;
-       }
-
-       if (offset < 0 || offset > area_len)
-               return -EINVAL;
-       filp->f_pos = offset;
-
-       return offset;
-}
-
-static ssize_t vsoc_write(struct file *filp, const char __user *buffer,
-                         size_t len, loff_t *poffset)
-{
-       __u32 area_off;
-       void *area_p;
-       ssize_t area_len;
-       int retval = vsoc_validate_filep(filp);
-
-       if (retval)
-               return retval;
-       area_len = vsoc_get_area(filp, &area_off);
-       area_p = shm_off_to_virtual_addr(area_off);
-       area_p += *poffset;
-       area_len -= *poffset;
-       if (area_len <= 0)
-               return 0;
-       if (area_len < len)
-               len = area_len;
-       if (copy_from_user(area_p, buffer, len))
-               return -EFAULT;
-       *poffset += len;
-       return len;
-}
-
-static irqreturn_t vsoc_interrupt(int irq, void *region_data_v)
-{
-       struct vsoc_region_data *region_data =
-           (struct vsoc_region_data *)region_data_v;
-       int reg_num = region_data - vsoc_dev.regions_data;
-
-       if (unlikely(!region_data))
-               return IRQ_NONE;
-
-       if (unlikely(reg_num < 0 ||
-                    reg_num >= vsoc_dev.layout->region_count)) {
-               dev_err(&vsoc_dev.dev->dev,
-                       "invalid irq @%p reg_num=0x%04x\n",
-                       region_data, reg_num);
-               return IRQ_NONE;
-       }
-       if (unlikely(vsoc_dev.regions_data + reg_num != region_data)) {
-               dev_err(&vsoc_dev.dev->dev,
-                       "irq not aligned @%p reg_num=0x%04x\n",
-                       region_data, reg_num);
-               return IRQ_NONE;
-       }
-       wake_up_interruptible(&region_data->interrupt_wait_queue);
-       return IRQ_HANDLED;
-}
-
-static int vsoc_probe_device(struct pci_dev *pdev,
-                            const struct pci_device_id *ent)
-{
-       int result;
-       int i;
-       resource_size_t reg_size;
-       dev_t devt;
-
-       vsoc_dev.dev = pdev;
-       result = pci_enable_device(pdev);
-       if (result) {
-               dev_err(&pdev->dev,
-                       "pci_enable_device failed %s: error %d\n",
-                       pci_name(pdev), result);
-               return result;
-       }
-       vsoc_dev.enabled_device = true;
-       result = pci_request_regions(pdev, "vsoc");
-       if (result < 0) {
-               dev_err(&pdev->dev, "pci_request_regions failed\n");
-               vsoc_remove_device(pdev);
-               return -EBUSY;
-       }
-       vsoc_dev.requested_regions = true;
-       /* Set up the control registers in BAR 0 */
-       reg_size = pci_resource_len(pdev, REGISTER_BAR);
-       if (reg_size > MAX_REGISTER_BAR_LEN)
-               vsoc_dev.regs =
-                   pci_iomap(pdev, REGISTER_BAR, MAX_REGISTER_BAR_LEN);
-       else
-               vsoc_dev.regs = pci_iomap(pdev, REGISTER_BAR, reg_size);
-
-       if (!vsoc_dev.regs) {
-               dev_err(&pdev->dev,
-                       "cannot map registers of size %zu\n",
-                      (size_t)reg_size);
-               vsoc_remove_device(pdev);
-               return -EBUSY;
-       }
-
-       /* Map the shared memory in BAR 2 */
-       vsoc_dev.shm_phys_start = pci_resource_start(pdev, SHARED_MEMORY_BAR);
-       vsoc_dev.shm_size = pci_resource_len(pdev, SHARED_MEMORY_BAR);
-
-       dev_info(&pdev->dev, "shared memory @ DMA %pa size=0x%zx\n",
-                &vsoc_dev.shm_phys_start, vsoc_dev.shm_size);
-       vsoc_dev.kernel_mapped_shm = pci_iomap_wc(pdev, SHARED_MEMORY_BAR, 0);
-       if (!vsoc_dev.kernel_mapped_shm) {
-               dev_err(&vsoc_dev.dev->dev, "cannot iomap region\n");
-               vsoc_remove_device(pdev);
-               return -EBUSY;
-       }
-
-       vsoc_dev.layout = (struct vsoc_shm_layout_descriptor __force *)
-                               vsoc_dev.kernel_mapped_shm;
-       dev_info(&pdev->dev, "major_version: %d\n",
-                vsoc_dev.layout->major_version);
-       dev_info(&pdev->dev, "minor_version: %d\n",
-                vsoc_dev.layout->minor_version);
-       dev_info(&pdev->dev, "size: 0x%x\n", vsoc_dev.layout->size);
-       dev_info(&pdev->dev, "regions: %d\n", vsoc_dev.layout->region_count);
-       if (vsoc_dev.layout->major_version !=
-           CURRENT_VSOC_LAYOUT_MAJOR_VERSION) {
-               dev_err(&vsoc_dev.dev->dev,
-                       "driver supports only major_version %d\n",
-                       CURRENT_VSOC_LAYOUT_MAJOR_VERSION);
-               vsoc_remove_device(pdev);
-               return -EBUSY;
-       }
-       result = alloc_chrdev_region(&devt, 0, vsoc_dev.layout->region_count,
-                                    VSOC_DEV_NAME);
-       if (result) {
-               dev_err(&vsoc_dev.dev->dev, "alloc_chrdev_region failed\n");
-               vsoc_remove_device(pdev);
-               return -EBUSY;
-       }
-       vsoc_dev.major = MAJOR(devt);
-       cdev_init(&vsoc_dev.cdev, &vsoc_ops);
-       vsoc_dev.cdev.owner = THIS_MODULE;
-       result = cdev_add(&vsoc_dev.cdev, devt, vsoc_dev.layout->region_count);
-       if (result) {
-               dev_err(&vsoc_dev.dev->dev, "cdev_add error\n");
-               vsoc_remove_device(pdev);
-               return -EBUSY;
-       }
-       vsoc_dev.cdev_added = true;
-       vsoc_dev.class = class_create(THIS_MODULE, VSOC_DEV_NAME);
-       if (IS_ERR(vsoc_dev.class)) {
-               dev_err(&vsoc_dev.dev->dev, "class_create failed\n");
-               vsoc_remove_device(pdev);
-               return PTR_ERR(vsoc_dev.class);
-       }
-       vsoc_dev.class_added = true;
-       vsoc_dev.regions = (struct vsoc_device_region __force *)
-               ((void *)vsoc_dev.layout +
-                vsoc_dev.layout->vsoc_region_desc_offset);
-       vsoc_dev.msix_entries =
-               kcalloc(vsoc_dev.layout->region_count,
-                       sizeof(vsoc_dev.msix_entries[0]), GFP_KERNEL);
-       if (!vsoc_dev.msix_entries) {
-               dev_err(&vsoc_dev.dev->dev,
-                       "unable to allocate msix_entries\n");
-               vsoc_remove_device(pdev);
-               return -ENOSPC;
-       }
-       vsoc_dev.regions_data =
-               kcalloc(vsoc_dev.layout->region_count,
-                       sizeof(vsoc_dev.regions_data[0]), GFP_KERNEL);
-       if (!vsoc_dev.regions_data) {
-               dev_err(&vsoc_dev.dev->dev,
-                       "unable to allocate regions' data\n");
-               vsoc_remove_device(pdev);
-               return -ENOSPC;
-       }
-       for (i = 0; i < vsoc_dev.layout->region_count; ++i)
-               vsoc_dev.msix_entries[i].entry = i;
-
-       result = pci_enable_msix_exact(vsoc_dev.dev, vsoc_dev.msix_entries,
-                                      vsoc_dev.layout->region_count);
-       if (result) {
-               dev_info(&pdev->dev, "pci_enable_msix failed: %d\n", result);
-               vsoc_remove_device(pdev);
-               return -ENOSPC;
-       }
-       /* Check that all regions are well formed */
-       for (i = 0; i < vsoc_dev.layout->region_count; ++i) {
-               const struct vsoc_device_region *region = vsoc_dev.regions + i;
-
-               if (!PAGE_ALIGNED(region->region_begin_offset) ||
-                   !PAGE_ALIGNED(region->region_end_offset)) {
-                       dev_err(&vsoc_dev.dev->dev,
-                               "region %d not aligned (%x:%x)", i,
-                               region->region_begin_offset,
-                               region->region_end_offset);
-                       vsoc_remove_device(pdev);
-                       return -EFAULT;
-               }
-               if (region->region_begin_offset >= region->region_end_offset ||
-                   region->region_end_offset > vsoc_dev.shm_size) {
-                       dev_err(&vsoc_dev.dev->dev,
-                               "region %d offsets are wrong: %x %x %zx",
-                               i, region->region_begin_offset,
-                               region->region_end_offset, vsoc_dev.shm_size);
-                       vsoc_remove_device(pdev);
-                       return -EFAULT;
-               }
-               if (region->managed_by >= vsoc_dev.layout->region_count) {
-                       dev_err(&vsoc_dev.dev->dev,
-                               "region %d has invalid owner: %u",
-                               i, region->managed_by);
-                       vsoc_remove_device(pdev);
-                       return -EFAULT;
-               }
-       }
-       vsoc_dev.msix_enabled = true;
-       for (i = 0; i < vsoc_dev.layout->region_count; ++i) {
-               const struct vsoc_device_region *region = vsoc_dev.regions + i;
-               size_t name_sz = sizeof(vsoc_dev.regions_data[i].name) - 1;
-               const struct vsoc_signal_table_layout *h_to_g_signal_table =
-                       &region->host_to_guest_signal_table;
-               const struct vsoc_signal_table_layout *g_to_h_signal_table =
-                       &region->guest_to_host_signal_table;
-
-               vsoc_dev.regions_data[i].name[name_sz] = '\0';
-               memcpy(vsoc_dev.regions_data[i].name, region->device_name,
-                      name_sz);
-               dev_info(&pdev->dev, "region %d name=%s\n",
-                        i, vsoc_dev.regions_data[i].name);
-               init_waitqueue_head
-                       (&vsoc_dev.regions_data[i].interrupt_wait_queue);
-               init_waitqueue_head(&vsoc_dev.regions_data[i].futex_wait_queue);
-               vsoc_dev.regions_data[i].incoming_signalled =
-                       shm_off_to_virtual_addr(region->region_begin_offset) +
-                       h_to_g_signal_table->interrupt_signalled_offset;
-               vsoc_dev.regions_data[i].outgoing_signalled =
-                       shm_off_to_virtual_addr(region->region_begin_offset) +
-                       g_to_h_signal_table->interrupt_signalled_offset;
-               result = request_irq(vsoc_dev.msix_entries[i].vector,
-                                    vsoc_interrupt, 0,
-                                    vsoc_dev.regions_data[i].name,
-                                    vsoc_dev.regions_data + i);
-               if (result) {
-                       dev_info(&pdev->dev,
-                                "request_irq failed irq=%d vector=%d\n",
-                               i, vsoc_dev.msix_entries[i].vector);
-                       vsoc_remove_device(pdev);
-                       return -ENOSPC;
-               }
-               vsoc_dev.regions_data[i].irq_requested = true;
-               if (!device_create(vsoc_dev.class, NULL,
-                                  MKDEV(vsoc_dev.major, i),
-                                  NULL, vsoc_dev.regions_data[i].name)) {
-                       dev_err(&vsoc_dev.dev->dev, "device_create failed\n");
-                       vsoc_remove_device(pdev);
-                       return -EBUSY;
-               }
-               vsoc_dev.regions_data[i].device_created = true;
-       }
-       return 0;
-}
-
-/*
- * This should undo all of the allocations in the probe function in reverse
- * order.
- *
- * Notes:
- *
- *   The device may have been partially initialized, so double check
- *   that the allocations happened.
- *
- *   This function may be called multiple times, so mark resources as freed
- *   as they are deallocated.
- */
-static void vsoc_remove_device(struct pci_dev *pdev)
-{
-       int i;
-       /*
-        * pdev is the first thing to be set on probe and the last thing
-        * to be cleared here. If it's NULL then there is no cleanup.
-        */
-       if (!pdev || !vsoc_dev.dev)
-               return;
-       dev_info(&pdev->dev, "remove_device\n");
-       if (vsoc_dev.regions_data) {
-               for (i = 0; i < vsoc_dev.layout->region_count; ++i) {
-                       if (vsoc_dev.regions_data[i].device_created) {
-                               device_destroy(vsoc_dev.class,
-                                              MKDEV(vsoc_dev.major, i));
-                               vsoc_dev.regions_data[i].device_created = false;
-                       }
-                       if (vsoc_dev.regions_data[i].irq_requested)
-                               free_irq(vsoc_dev.msix_entries[i].vector, NULL);
-                       vsoc_dev.regions_data[i].irq_requested = false;
-               }
-               kfree(vsoc_dev.regions_data);
-               vsoc_dev.regions_data = NULL;
-       }
-       if (vsoc_dev.msix_enabled) {
-               pci_disable_msix(pdev);
-               vsoc_dev.msix_enabled = false;
-       }
-       kfree(vsoc_dev.msix_entries);
-       vsoc_dev.msix_entries = NULL;
-       vsoc_dev.regions = NULL;
-       if (vsoc_dev.class_added) {
-               class_destroy(vsoc_dev.class);
-               vsoc_dev.class_added = false;
-       }
-       if (vsoc_dev.cdev_added) {
-               cdev_del(&vsoc_dev.cdev);
-               vsoc_dev.cdev_added = false;
-       }
-       if (vsoc_dev.major && vsoc_dev.layout) {
-               unregister_chrdev_region(MKDEV(vsoc_dev.major, 0),
-                                        vsoc_dev.layout->region_count);
-               vsoc_dev.major = 0;
-       }
-       vsoc_dev.layout = NULL;
-       if (vsoc_dev.kernel_mapped_shm) {
-               pci_iounmap(pdev, vsoc_dev.kernel_mapped_shm);
-               vsoc_dev.kernel_mapped_shm = NULL;
-       }
-       if (vsoc_dev.regs) {
-               pci_iounmap(pdev, vsoc_dev.regs);
-               vsoc_dev.regs = NULL;
-       }
-       if (vsoc_dev.requested_regions) {
-               pci_release_regions(pdev);
-               vsoc_dev.requested_regions = false;
-       }
-       if (vsoc_dev.enabled_device) {
-               pci_disable_device(pdev);
-               vsoc_dev.enabled_device = false;
-       }
-       /* Do this last: it indicates that the device is not initialized. */
-       vsoc_dev.dev = NULL;
-}
-
-static void __exit vsoc_cleanup_module(void)
-{
-       vsoc_remove_device(vsoc_dev.dev);
-       pci_unregister_driver(&vsoc_pci_driver);
-}
-
-static int __init vsoc_init_module(void)
-{
-       int err = -ENOMEM;
-
-       INIT_LIST_HEAD(&vsoc_dev.permissions);
-       mutex_init(&vsoc_dev.mtx);
-
-       err = pci_register_driver(&vsoc_pci_driver);
-       if (err < 0)
-               return err;
-       return 0;
-}
-
-static int vsoc_open(struct inode *inode, struct file *filp)
-{
-       /* Can't use vsoc_validate_filep because filp is still incomplete */
-       int ret = vsoc_validate_inode(inode);
-
-       if (ret)
-               return ret;
-       filp->private_data =
-               kzalloc(sizeof(struct vsoc_private_data), GFP_KERNEL);
-       if (!filp->private_data)
-               return -ENOMEM;
-       return 0;
-}
-
-static int vsoc_release(struct inode *inode, struct file *filp)
-{
-       struct vsoc_private_data *private_data = NULL;
-       struct fd_scoped_permission_node *node = NULL;
-       struct vsoc_device_region *owner_region_p = NULL;
-       int retval = vsoc_validate_filep(filp);
-
-       if (retval)
-               return retval;
-       private_data = (struct vsoc_private_data *)filp->private_data;
-       if (!private_data)
-               return 0;
-
-       node = private_data->fd_scoped_permission_node;
-       if (node) {
-               owner_region_p = vsoc_region_from_inode(inode);
-               if (owner_region_p->managed_by != VSOC_REGION_WHOLE) {
-                       owner_region_p =
-                           &vsoc_dev.regions[owner_region_p->managed_by];
-               }
-               do_destroy_fd_scoped_permission_node(owner_region_p, node);
-               private_data->fd_scoped_permission_node = NULL;
-       }
-       kfree(private_data);
-       filp->private_data = NULL;
-
-       return 0;
-}
-
-/*
- * Returns the device relative offset and length of the area specified by the
- * fd scoped permission. If there is no fd scoped permission set, a default
- * permission covering the entire region is assumed, unless the region is owned
- * by another one, in which case the default is a permission with zero size.
- */
-static ssize_t vsoc_get_area(struct file *filp, __u32 *area_offset)
-{
-       __u32 off = 0;
-       ssize_t length = 0;
-       struct vsoc_device_region *region_p;
-       struct fd_scoped_permission *perm;
-
-       region_p = vsoc_region_from_filep(filp);
-       off = region_p->region_begin_offset;
-       perm = &((struct vsoc_private_data *)filp->private_data)->
-               fd_scoped_permission_node->permission;
-       if (perm) {
-               off += perm->begin_offset;
-               length = perm->end_offset - perm->begin_offset;
-       } else if (region_p->managed_by == VSOC_REGION_WHOLE) {
-               /* No permission set and the regions is not owned by another,
-                * default to full region access.
-                */
-               length = vsoc_device_region_size(region_p);
-       } else {
-               /* return zero length, access is denied. */
-               length = 0;
-       }
-       if (area_offset)
-               *area_offset = off;
-       return length;
-}
-
-static int vsoc_mmap(struct file *filp, struct vm_area_struct *vma)
-{
-       unsigned long len = vma->vm_end - vma->vm_start;
-       __u32 area_off;
-       phys_addr_t mem_off;
-       ssize_t area_len;
-       int retval = vsoc_validate_filep(filp);
-
-       if (retval)
-               return retval;
-       area_len = vsoc_get_area(filp, &area_off);
-       /* Add the requested offset */
-       area_off += (vma->vm_pgoff << PAGE_SHIFT);
-       area_len -= (vma->vm_pgoff << PAGE_SHIFT);
-       if (area_len < len)
-               return -EINVAL;
-       vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
-       mem_off = shm_off_to_phys_addr(area_off);
-       if (io_remap_pfn_range(vma, vma->vm_start, mem_off >> PAGE_SHIFT,
-                              len, vma->vm_page_prot))
-               return -EAGAIN;
-       return 0;
-}
-
-module_init(vsoc_init_module);
-module_exit(vsoc_cleanup_module);
-
-MODULE_LICENSE("GPL");
-MODULE_AUTHOR("Greg Hartman <ghartman@google.com>");
-MODULE_DESCRIPTION("VSoC interpretation of QEmu's ivshmem device");
-MODULE_VERSION("1.0");
diff --git a/drivers/staging/greybus/audio_manager.c b/drivers/staging/greybus/audio_manager.c
index 9b19ea9d3fa144f9c81a8f3dc558cdad2d8b2a87..9a3f7c034ab49e7f141408f35f225798dc395f7c 100644
--- a/drivers/staging/greybus/audio_manager.c
+++ b/drivers/staging/greybus/audio_manager.c
@@ -92,8 +92,8 @@ void gb_audio_manager_remove_all(void)
  
         list_for_each_entry_safe(module, next, &modules_list, list) {
                 list_del(&module->list);
-               kobject_put(&module->kobj);
                 ida_simple_remove(&module_id, module->id);
+               kobject_put(&module->kobj);
         }
  
         is_empty = list_empty(&modules_list);
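Note on the hunk above: it reads `module->id` before calling `kobject_put()`, since the put may drop the last reference and free `module`. A minimal user-space sketch of the same ordering rule follows. All `demo_*` names are hypothetical stand-ins and not part of greybus, and the plain integer counter is only an assumption standing in for the kobject reference count.

```c
#include <assert.h>
#include <stdlib.h>

/* Records the last ID handed back; stands in for ida_simple_remove(). */
static int demo_last_released_id = -1;

/* Hypothetical stand-in for a refcounted module object. */
struct demo_module {
	int refs;
	int id;
};

static void demo_id_release(int id)
{
	demo_last_released_id = id;
}

/* Drop one reference; frees the object when the count reaches zero. */
static void demo_module_put(struct demo_module *m)
{
	assert(m->refs > 0);
	if (--m->refs == 0)
		free(m);
}

/* Release the ID first, then drop the reference. */
static void demo_module_remove(struct demo_module *m)
{
	demo_id_release(m->id);	/* m is still valid here */
	demo_module_put(m);	/* may free m; do not touch it afterwards */
}
```

With the two calls in the opposite order, `m->id` would be read from memory that may already have been freed.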
diff --git a/drivers/staging/rtl8188eu/os_dep/ioctl_linux.c b/drivers/staging/rtl8188eu/os_dep/ioctl_linux.c
index 9b6ea86d1dcfa6d0c16b23b1fb870292f9e68ee9..ba53959e1303b8f39778f1245c97ce0d5c0563bc 100644
--- a/drivers/staging/rtl8188eu/os_dep/ioctl_linux.c
+++ b/drivers/staging/rtl8188eu/os_dep/ioctl_linux.c
@@ -2009,21 +2009,16 @@ static int wpa_supplicant_ioctl(struct net_device *dev, struct iw_point *p)
         struct ieee_param *param;
         uint ret = 0;
  
-       if (p->length < sizeof(struct ieee_param) || !p->pointer) {
-               ret = -EINVAL;
-               goto out;
-       }
+       if (!p->pointer || p->length != sizeof(struct ieee_param))
+               return -EINVAL;
  
         param = (struct ieee_param *)rtw_malloc(p->length);
-       if (!param) {
-               ret = -ENOMEM;
-               goto out;
-       }
+       if (!param)
+               return -ENOMEM;
  
         if (copy_from_user(param, p->pointer, p->length)) {
                 kfree(param);
-               ret = -EFAULT;
-               goto out;
+               return -EFAULT;
         }
  
         switch (param->cmd) {
@@ -2054,9 +2049,6 @@ static int wpa_supplicant_ioctl(struct net_device *dev, struct iw_point *p)
                 ret = -EFAULT;
  
         kfree(param);
-
-out:
-
         return ret;
  }
  
@@ -2791,26 +2783,19 @@ static int rtw_hostapd_ioctl(struct net_device *dev, struct iw_point *p)
         * so, we just check hw_init_completed
         */
  
-       if (!padapter->hw_init_completed) {
-               ret = -EPERM;
-               goto out;
-       }
+       if (!padapter->hw_init_completed)
+               return -EPERM;
  
-       if (!p->pointer) {
-               ret = -EINVAL;
-               goto out;
-       }
+       if (!p->pointer || p->length != sizeof(struct ieee_param))
+               return -EINVAL;
  
         param = (struct ieee_param *)rtw_malloc(p->length);
-       if (!param) {
-               ret = -ENOMEM;
-               goto out;
-       }
+       if (!param)
+               return -ENOMEM;
  
         if (copy_from_user(param, p->pointer, p->length)) {
                 kfree(param);
-               ret = -EFAULT;
-               goto out;
+               return -EFAULT;
         }
  
         switch (param->cmd) {
@@ -2865,7 +2850,6 @@ static int rtw_hostapd_ioctl(struct net_device *dev, struct iw_point *p)
         if (ret == 0 && copy_to_user(p->pointer, param, p->length))
                 ret = -EFAULT;
         kfree(param);
-out:
         return ret;
  }
  #endif
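Note on the hunks above: both ioctl handlers now reject any request whose length is not exactly `sizeof(struct ieee_param)` before allocating and copying. A minimal user-space sketch of that check follows. The `demo_*` types are hypothetical stand-ins for `struct ieee_param` and `struct iw_point`, and `memcpy()` stands in for `copy_from_user()`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the driver's struct ieee_param. */
struct demo_param {
	unsigned int cmd;
	unsigned char sta_addr[6];
	unsigned char payload[32];
};

/* Hypothetical stand-in for struct iw_point: a caller-supplied buffer. */
struct demo_point {
	const void *pointer;
	size_t length;
};

/*
 * Copy a caller-supplied request into a freshly allocated struct.
 * Returns 0 and stores the copy in *out, or a negative errno value.
 */
static int demo_fetch_param(const struct demo_point *p, struct demo_param **out)
{
	struct demo_param *param;

	assert(p != NULL && out != NULL);

	/* Reject NULL buffers and any length other than the exact struct size. */
	if (!p->pointer || p->length != sizeof(*param))
		return -EINVAL;

	param = malloc(p->length);
	if (!param)
		return -ENOMEM;

	memcpy(param, p->pointer, p->length);	/* stands in for copy_from_user() */
	*out = param;
	return 0;
}
```

An exact-size check means later code can read every field of the struct without running past a short allocation, and the caller can no longer request an arbitrarily large one.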
diff --git a/drivers/staging/rtl8723bs/hal/rtl8723bs_xmit.c b/drivers/staging/rtl8723bs/hal/rtl8723bs_xmit.c
index b44e902ed338cef7cd523bdde3b89e7341aec980..b6d56cfb0a190af048c0da4ef2501b02c048e241 100644
--- a/drivers/staging/rtl8723bs/hal/rtl8723bs_xmit.c
+++ b/drivers/staging/rtl8723bs/hal/rtl8723bs_xmit.c
@@ -476,14 +476,13 @@ int rtl8723bs_xmit_thread(void *context)
         s32 ret;
         struct adapter *padapter;
         struct xmit_priv *pxmitpriv;
-       u8 thread_name[20] = "RTWHALXT";
-
+       u8 thread_name[20];
  
         ret = _SUCCESS;
         padapter = context;
         pxmitpriv = &padapter->xmitpriv;
  
-       rtw_sprintf(thread_name, 20, "%s-"ADPT_FMT, thread_name, ADPT_ARG(padapter));
+       rtw_sprintf(thread_name, 20, "RTWHALXT-" ADPT_FMT, ADPT_ARG(padapter));
         thread_enter(thread_name);
  
         DBG_871X("start "FUNC_ADPT_FMT"\n", FUNC_ADPT_ARG(padapter));
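Note on the hunk above: the old call passed `thread_name` both as the destination and as a `%s` argument. The sketch below assumes `rtw_sprintf()` behaves like `snprintf()` and that `ADPT_FMT` expands to a `%s` for the interface name; neither is shown in this diff. For `snprintf()`, overlapping source and destination is undefined behaviour, so the fix moves the prefix into the format string. `demo_thread_name` is a hypothetical helper, not a driver function.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "RTWHALXT-<ifname>" into buf. The prefix lives in the format
 * string, so buf is never passed as one of its own arguments.
 * Returns the snprintf() result: the length the full name would need.
 */
static int demo_thread_name(char *buf, size_t size, const char *ifname)
{
	assert(buf != NULL && size > 0 && ifname != NULL);
	return snprintf(buf, size, "RTWHALXT-%s", ifname);
}
```

The result is truncated to fit, so with a 20-byte buffer at most 19 characters of the name are kept.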
diff --git a/drivers/staging/rtl8723bs/os_dep/ioctl_linux.c b/drivers/staging/rtl8723bs/os_dep/ioctl_linux.c
index db6528a01229d6789152e17cd1c22af672d38eb7..9b9038e7deb1090c7bee57a4c299abaa71dcf517 100644
--- a/drivers/staging/rtl8723bs/os_dep/ioctl_linux.c
+++ b/drivers/staging/rtl8723bs/os_dep/ioctl_linux.c
@@ -3373,21 +3373,16 @@ static int wpa_supplicant_ioctl(struct net_device *dev, struct iw_point *p)
  
         /* down(&ieee->wx_sem); */
  
-       if (p->length < sizeof(struct ieee_param) || !p->pointer) {
-               ret = -EINVAL;
-               goto out;
-       }
+       if (!p->pointer || p->length != sizeof(struct ieee_param))
+               return -EINVAL;
  
         param = rtw_malloc(p->length);
-       if (param == NULL) {
-               ret = -ENOMEM;
-               goto out;
-       }
+       if (param == NULL)
+               return -ENOMEM;
  
         if (copy_from_user(param, p->pointer, p->length)) {
                 kfree(param);
-               ret = -EFAULT;
-               goto out;
+               return -EFAULT;
         }
  
         switch (param->cmd) {
@@ -3421,12 +3416,8 @@ static int wpa_supplicant_ioctl(struct net_device *dev, struct iw_point *p)
  
         kfree(param);
  
-out:
-
         /* up(&ieee->wx_sem); */
-
         return ret;
-
  }
  
  static int rtw_set_encryption(struct net_device *dev, struct ieee_param *param, u32 param_len)
@@ -4200,28 +4191,19 @@ static int rtw_hostapd_ioctl(struct net_device *dev, struct iw_point *p)
         * so, we just check hw_init_completed
         */
  
-       if (!padapter->hw_init_completed) {
-               ret = -EPERM;
-               goto out;
-       }
-
+       if (!padapter->hw_init_completed)
+               return -EPERM;
  
-       /* if (p->length < sizeof(struct ieee_param) || !p->pointer) { */
-       if (!p->pointer) {
-               ret = -EINVAL;
-               goto out;
-       }
+       if (!p->pointer || p->length != sizeof(*param))
+               return -EINVAL;
  
         param = rtw_malloc(p->length);
-       if (param == NULL) {
-               ret = -ENOMEM;
-               goto out;
-       }
+       if (param == NULL)
+               return -ENOMEM;
  
         if (copy_from_user(param, p->pointer, p->length)) {
                 kfree(param);
-               ret = -EFAULT;
-               goto out;
+               return -EFAULT;
         }
  
         /* DBG_871X("%s, cmd =%d\n", __func__, param->cmd); */
@@ -4321,13 +4303,8 @@ static int rtw_hostapd_ioctl(struct net_device *dev, struct iw_point *p)
         if (ret == 0 && copy_to_user(p->pointer, param, p->length))
                 ret = -EFAULT;
  
-
         kfree(param);
-
-out:
-
         return ret;
-
  }
  
  static int rtw_wx_set_priv(struct net_device *dev,
diff --git a/drivers/staging/vt6656/dpc.c b/drivers/staging/vt6656/dpc.c
index 821aae8ca402fa3cac7ed2bca2f93fc3cdd62961..a0b60e7d1086731c6647777f493d9dc43fb6bfb1 100644
--- a/drivers/staging/vt6656/dpc.c
+++ b/drivers/staging/vt6656/dpc.c
@@ -98,7 +98,7 @@ int vnt_rx_data(struct vnt_private *priv, struct vnt_rcb *ptr_rcb,
  
         vnt_rf_rssi_to_dbm(priv, tail->rssi, &rx_dbm);
  
-       priv->bb_pre_ed_rssi = (u8)rx_dbm + 1;
+       priv->bb_pre_ed_rssi = (u8)-rx_dbm + 1;
         priv->current_rssi = priv->bb_pre_ed_rssi;
  
         skb_pull(skb, sizeof(*head));
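Note on the hunk above: the one-character change matters if `rx_dbm` is a negative dBm value, which this sketch assumes (the diff does not show `vnt_rf_rssi_to_dbm()`). Casting a negative value to `u8` wraps modulo 256. Negating first gives the magnitude. The `demo_*` helpers are hypothetical and only reproduce the two expressions.

```c
#include <assert.h>
#include <stdint.h>

/* Old expression: the cast wraps a negative dBm value modulo 256. */
static uint8_t demo_rssi_old(long rx_dbm)
{
	return (uint8_t)((uint8_t)rx_dbm + 1);
}

/* New expression: negate first, so the cast sees a magnitude. */
static uint8_t demo_rssi_new(long rx_dbm)
{
	/* Demo-only assumption: input is a non-positive dBm value. */
	assert(rx_dbm <= 0 && rx_dbm >= -254);
	return (uint8_t)((uint8_t)-rx_dbm + 1);
}
```

For an input of -60 the old form yields 197 and the new form yields 61.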
diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
index b94ed4e30770688891a407d1c9dba88d3110f3b8..09e55ea0bf5d5dc6af5c2010940d1aa0504bfae5 100644
--- a/drivers/target/iscsi/iscsi_target.c
+++ b/drivers/target/iscsi/iscsi_target.c
@@ -1165,9 +1165,7 @@ int iscsit_setup_scsi_cmd(struct iscsi_conn *conn, struct iscsi_cmd *cmd,
                 hdr->cmdsn, be32_to_cpu(hdr->data_length), payload_length,
                 conn->cid);
  
-       if (target_get_sess_cmd(&cmd->se_cmd, true) < 0)
-               return iscsit_add_reject_cmd(cmd,
-                               ISCSI_REASON_WAITING_FOR_LOGOUT, buf);
+       target_get_sess_cmd(&cmd->se_cmd, true);
  
         cmd->sense_reason = transport_lookup_cmd_lun(&cmd->se_cmd,
                                                      scsilun_to_int(&hdr->lun));
@@ -2004,9 +2002,7 @@ iscsit_handle_task_mgt_cmd(struct iscsi_conn *conn, struct iscsi_cmd *cmd,
                               conn->sess->se_sess, 0, DMA_NONE,
                               TCM_SIMPLE_TAG, cmd->sense_buffer + 2);
  
-       if (target_get_sess_cmd(&cmd->se_cmd, true) < 0)
-               return iscsit_add_reject_cmd(cmd,
-                               ISCSI_REASON_WAITING_FOR_LOGOUT, buf);
+       target_get_sess_cmd(&cmd->se_cmd, true);
  
         /*
          * TASK_REASSIGN for ERL=2 / connection stays inside of
@@ -4149,6 +4145,9 @@ int iscsit_close_connection(
         iscsit_stop_nopin_response_timer(conn);
         iscsit_stop_nopin_timer(conn);
  
+       if (conn->conn_transport->iscsit_wait_conn)
+               conn->conn_transport->iscsit_wait_conn(conn);
+
         /*
          * During Connection recovery drop unacknowledged out of order
          * commands for this connection, and prepare the other commands
@@ -4231,11 +4230,6 @@ int iscsit_close_connection(
          * must wait until they have completed.
          */
         iscsit_check_conn_usage_count(conn);
-       target_sess_cmd_list_set_waiting(sess->se_sess);
-       target_wait_for_sess_cmds(sess->se_sess);
-
-       if (conn->conn_transport->iscsit_wait_conn)
-               conn->conn_transport->iscsit_wait_conn(conn);
  
         ahash_request_free(conn->conn_tx_hash);
         if (conn->conn_rx_hash) {
diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
index ea482d4b1f00e8f03b36d8fc280f319ee03f7f4d..0ae9e60fc4d59b0056cdecfd1cbcc8391dd684f0 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -666,6 +666,11 @@ static int transport_cmd_check_stop_to_fabric(struct se_cmd *cmd)
  
         target_remove_from_state_list(cmd);
  
+       /*
+        * Clear struct se_cmd->se_lun before the handoff to FE.
+        */
+       cmd->se_lun = NULL;
+
         spin_lock_irqsave(&cmd->t_state_lock, flags);
         /*
          * Determine if frontend context caller is requesting the stopping of
@@ -693,6 +698,17 @@ static int transport_cmd_check_stop_to_fabric(struct se_cmd *cmd)
         return cmd->se_tfo->check_stop_free(cmd);
  }
  
+static void transport_lun_remove_cmd(struct se_cmd *cmd)
+{
+       struct se_lun *lun = cmd->se_lun;
+
+       if (!lun)
+               return;
+
+       if (cmpxchg(&cmd->lun_ref_active, true, false))
+               percpu_ref_put(&lun->lun_ref);
+}
+
  static void target_complete_failure_work(struct work_struct *work)
  {
         struct se_cmd *cmd = container_of(work, struct se_cmd, work);
@@ -783,6 +799,8 @@ static void target_handle_abort(struct se_cmd *cmd)
  
         WARN_ON_ONCE(kref_read(&cmd->cmd_kref) == 0);
  
+       transport_lun_remove_cmd(cmd);
+
         transport_cmd_check_stop_to_fabric(cmd);
  }
  
@@ -1708,6 +1726,7 @@ static void target_complete_tmr_failure(struct work_struct *work)
         se_cmd->se_tmr_req->response = TMR_LUN_DOES_NOT_EXIST;
         se_cmd->se_tfo->queue_tm_rsp(se_cmd);
  
+       transport_lun_remove_cmd(se_cmd);
         transport_cmd_check_stop_to_fabric(se_cmd);
  }
  
@@ -1898,6 +1917,7 @@ void transport_generic_request_failure(struct se_cmd *cmd,
                 goto queue_full;
  
  check_stop:
+       transport_lun_remove_cmd(cmd);
         transport_cmd_check_stop_to_fabric(cmd);
         return;
  
@@ -2195,6 +2215,7 @@ queue_status:
                 transport_handle_queue_full(cmd, cmd->se_dev, ret, false);
                 return;
         }
+       transport_lun_remove_cmd(cmd);
         transport_cmd_check_stop_to_fabric(cmd);
  }
  
@@ -2289,6 +2310,7 @@ static void target_complete_ok_work(struct work_struct *work)
                 if (ret)
                         goto queue_full;
  
+               transport_lun_remove_cmd(cmd);
                 transport_cmd_check_stop_to_fabric(cmd);
                 return;
         }
@@ -2314,6 +2336,7 @@ static void target_complete_ok_work(struct work_struct *work)
                         if (ret)
                                 goto queue_full;
  
+                       transport_lun_remove_cmd(cmd);
                         transport_cmd_check_stop_to_fabric(cmd);
                         return;
                 }
@@ -2349,6 +2372,7 @@ queue_rsp:
                         if (ret)
                                 goto queue_full;
  
+                       transport_lun_remove_cmd(cmd);
                         transport_cmd_check_stop_to_fabric(cmd);
                         return;
                 }
@@ -2384,6 +2408,7 @@ queue_status:
                 break;
         }
  
+       transport_lun_remove_cmd(cmd);
         transport_cmd_check_stop_to_fabric(cmd);
         return;
  
@@ -2710,6 +2735,9 @@ int transport_generic_free_cmd(struct se_cmd *cmd, int wait_for_tasks)
                  */
                 if (cmd->state_active)
                         target_remove_from_state_list(cmd);
+
+               if (cmd->se_lun)
+                       transport_lun_remove_cmd(cmd);
         }
         if (aborted)
                 cmd->free_compl = &compl;
@@ -2781,9 +2809,6 @@ static void target_release_cmd_kref(struct kref *kref)
         struct completion *abrt_compl = se_cmd->abrt_compl;
         unsigned long flags;
  
-       if (se_cmd->lun_ref_active)
-               percpu_ref_put(&se_cmd->se_lun->lun_ref);
-
         if (se_sess) {
                 spin_lock_irqsave(&se_sess->sess_cmd_lock, flags);
                 list_del_init(&se_cmd->se_cmd_list);
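Note on the hunks above: `transport_lun_remove_cmd()` is now called from many completion paths, and the `cmpxchg()` on `lun_ref_active` makes sure the LUN reference is dropped only once. A minimal user-space sketch of that drop-once idiom follows. It uses C11 atomics instead of the kernel's `cmpxchg()` and `percpu_ref_put()`, and the `demo_*` types are hypothetical stand-ins for `struct se_cmd` and `struct se_lun`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct se_lun. */
struct demo_lun {
	atomic_int refs;		/* stands in for the percpu lun_ref */
};

/* Hypothetical stand-in for struct se_cmd. */
struct demo_cmd {
	struct demo_lun *lun;
	atomic_bool lun_ref_active;	/* true while the command holds a LUN ref */
};

/* Drop the command's LUN reference at most once, however often called. */
static void demo_lun_remove_cmd(struct demo_cmd *cmd)
{
	bool expected = true;

	assert(cmd != NULL);
	if (!cmd->lun)
		return;

	/* Only the caller that flips true -> false performs the put. */
	if (atomic_compare_exchange_strong(&cmd->lun_ref_active, &expected, false))
		atomic_fetch_sub(&cmd->lun->refs, 1);
}
```

A second call sees the flag already cleared and leaves the counter alone, which is why the helper can be called from several paths that may handle the same command.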
diff --git a/drivers/tee/amdtee/Kconfig b/drivers/tee/amdtee/Kconfig
index 4e32b6413b41ffefdef8d4abaf93d3a7f5a93097..191f9715fa9afcb3ee98fb2ee509e64a9a448b94 100644
--- a/drivers/tee/amdtee/Kconfig
+++ b/drivers/tee/amdtee/Kconfig
@@ -3,6 +3,6 @@
  config AMDTEE
         tristate "AMD-TEE"
         default m
-       depends on CRYPTO_DEV_SP_PSP
+       depends on CRYPTO_DEV_SP_PSP && CRYPTO_DEV_CCP_DD
         help
           This implements AMD's Trusted Execution Environment (TEE) driver.
diff --git a/drivers/thunderbolt/switch.c b/drivers/thunderbolt/switch.c
index ad5479f211744ffee22c0062b6013956d385c2be..7d6ecc3425081cce72e79b475ac6a7f39cf53290 100644
--- a/drivers/thunderbolt/switch.c
+++ b/drivers/thunderbolt/switch.c
@@ -348,6 +348,12 @@ out:
         return ret;
  }
  
+static int tb_switch_nvm_no_read(void *priv, unsigned int offset, void *val,
+                                size_t bytes)
+{
+       return -EPERM;
+}
+
  static int tb_switch_nvm_write(void *priv, unsigned int offset, void *val,
                                size_t bytes)
  {
@@ -393,6 +399,7 @@ static struct nvmem_device *register_nvmem(struct tb_switch *sw, int id,
                 config.read_only = true;
         } else {
                 config.name = "nvm_non_active";
+               config.reg_read = tb_switch_nvm_no_read;
                 config.reg_write = tb_switch_nvm_write;
                 config.root_only = true;
         }
diff --git a/drivers/tty/serdev/serdev-ttyport.c b/drivers/tty/serdev/serdev-ttyport.c
index d1cdd2ab8b4c00b9665f2ca4cb62107ee6e78187..d367803e2044fb9e8c4f7837866fd09260f343e9 100644
--- a/drivers/tty/serdev/serdev-ttyport.c
+++ b/drivers/tty/serdev/serdev-ttyport.c
@@ -265,7 +265,6 @@ struct device *serdev_tty_port_register(struct tty_port *port,
                                         struct device *parent,
                                         struct tty_driver *drv, int idx)
  {
-       const struct tty_port_client_operations *old_ops;
         struct serdev_controller *ctrl;
         struct serport *serport;
         int ret;
@@ -284,7 +283,6 @@ struct device *serdev_tty_port_register(struct tty_port *port,
  
         ctrl->ops = &ctrl_ops;
  
-       old_ops = port->client_ops;
         port->client_ops = &client_ops;
         port->client_data = ctrl;
  
@@ -297,7 +295,7 @@ struct device *serdev_tty_port_register(struct tty_port *port,
  
  err_reset_data:
         port->client_data = NULL;
-       port->client_ops = old_ops;
+       port->client_ops = &tty_port_default_client_ops;
         serdev_controller_put(ctrl);
  
         return ERR_PTR(ret);
@@ -312,8 +310,8 @@ int serdev_tty_port_unregister(struct tty_port *port)
                 return -ENODEV;
  
         serdev_controller_remove(ctrl);
-       port->client_ops = NULL;
         port->client_data = NULL;
+       port->client_ops = &tty_port_default_client_ops;
         serdev_controller_put(ctrl);
  
         return 0;
diff --git a/drivers/tty/serial/8250/8250_aspeed_vuart.c b/drivers/tty/serial/8250/8250_aspeed_vuart.c
index d657aa14c3e4b588eccbfc62e899ec31b1145b1a..c33e02cbde9303110a094f8d63523197e22f9d5d 100644
--- a/drivers/tty/serial/8250/8250_aspeed_vuart.c
+++ b/drivers/tty/serial/8250/8250_aspeed_vuart.c
@@ -446,7 +446,6 @@ static int aspeed_vuart_probe(struct platform_device *pdev)
                 port.port.line = rc;
  
         port.port.irq = irq_of_parse_and_map(np, 0);
-       port.port.irqflags = IRQF_SHARED;
         port.port.handle_irq = aspeed_vuart_handle_irq;
         port.port.iotype = UPIO_MEM;
         port.port.type = PORT_16550A;
diff --git a/drivers/tty/serial/8250/8250_core.c b/drivers/tty/serial/8250/8250_core.c
index 0894a22fd70280eee749cb14f2101c51b051f9c8..f2a33c9082a681c3520f102fcc95d755724055e5 100644
--- a/drivers/tty/serial/8250/8250_core.c
+++ b/drivers/tty/serial/8250/8250_core.c
@@ -174,7 +174,7 @@ static int serial_link_irq_chain(struct uart_8250_port *up)
         struct hlist_head *h;
         struct hlist_node *n;
         struct irq_info *i;
-       int ret, irq_flags = up->port.flags & UPF_SHARE_IRQ ? IRQF_SHARED : 0;
+       int ret;
  
         mutex_lock(&hash_mutex);
  
@@ -209,9 +209,8 @@ static int serial_link_irq_chain(struct uart_8250_port *up)
                 INIT_LIST_HEAD(&up->list);
                 i->head = &up->list;
                 spin_unlock_irq(&i->lock);
-               irq_flags |= up->port.irqflags;
                 ret = request_irq(up->port.irq, serial8250_interrupt,
-                                 irq_flags, up->port.name, i);
+                                 up->port.irqflags, up->port.name, i);
                 if (ret < 0)
                         serial_do_unlink(i, up);
         }
diff --git a/drivers/tty/serial/8250/8250_of.c b/drivers/tty/serial/8250/8250_of.c
index 531ad67395e0a5488dd662a0b29f3b7293761c4a..f6687756ec5e1efd797ea7097facfb07a6466c20 100644
--- a/drivers/tty/serial/8250/8250_of.c
+++ b/drivers/tty/serial/8250/8250_of.c
@@ -202,7 +202,6 @@ static int of_platform_serial_setup(struct platform_device *ofdev,
  
         port->type = type;
         port->uartclk = clk;
-       port->irqflags |= IRQF_SHARED;
  
         if (of_property_read_bool(np, "no-loopback-test"))
                 port->flags |= UPF_SKIP_TEST;
diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
index 430e3467aff7bf49c221401a39b2968fb8c0e7eb..0325f2e53b74507eff7b9156193e8e7dbdbcaced 100644
--- a/drivers/tty/serial/8250/8250_port.c
+++ b/drivers/tty/serial/8250/8250_port.c
@@ -2177,6 +2177,10 @@ int serial8250_do_startup(struct uart_port *port)
                 }
         }
  
+       /* Check if we need to have shared IRQs */
+       if (port->irq && (up->port.flags & UPF_SHARE_IRQ))
+               up->port.irqflags |= IRQF_SHARED;
+
         if (port->irq && !(up->port.flags & UPF_NO_THRE_TEST)) {
                 unsigned char iir1;
                 /*
diff --git a/drivers/tty/serial/ar933x_uart.c b/drivers/tty/serial/ar933x_uart.c
index 3bdd56a1021b26d6e74ff98ad6f0e25f25f5de08..ea12f10610b64dd46062735fc4e682213c407167 100644
--- a/drivers/tty/serial/ar933x_uart.c
+++ b/drivers/tty/serial/ar933x_uart.c
@@ -286,6 +286,10 @@ static void ar933x_uart_set_termios(struct uart_port *port,
         ar933x_uart_rmw_set(up, AR933X_UART_CS_REG,
                             AR933X_UART_CS_HOST_INT_EN);
  
+       /* enable RX and TX ready override */
+       ar933x_uart_rmw_set(up, AR933X_UART_CS_REG,
+               AR933X_UART_CS_TX_READY_ORIDE | AR933X_UART_CS_RX_READY_ORIDE);
+
         /* reenable the UART */
         ar933x_uart_rmw(up, AR933X_UART_CS_REG,
                         AR933X_UART_CS_IF_MODE_M << AR933X_UART_CS_IF_MODE_S,
@@ -418,6 +422,10 @@ static int ar933x_uart_startup(struct uart_port *port)
         ar933x_uart_rmw_set(up, AR933X_UART_CS_REG,
                             AR933X_UART_CS_HOST_INT_EN);
  
+       /* enable RX and TX ready override */
+       ar933x_uart_rmw_set(up, AR933X_UART_CS_REG,
+               AR933X_UART_CS_TX_READY_ORIDE | AR933X_UART_CS_RX_READY_ORIDE);
+
         /* Enable RX interrupts */
         up->ier = AR933X_UART_INT_RX_VALID;
         ar933x_uart_write(up, AR933X_UART_INT_EN_REG, up->ier);
diff --git a/drivers/tty/serial/atmel_serial.c b/drivers/tty/serial/atmel_serial.c

index c15c398c88a938bd49494f959a3cb8da5b983620..a39c87a7c2e180923ce54ff5a2b03f5c641e86f5 100644 (file)
--- a/drivers/tty/serial/atmel_serial.c
+++ b/drivers/tty/serial/atmel_serial.c
@@ -570,7 +570,8 @@ static void atmel_stop_tx(struct uart_port *port)
         atmel_uart_writel(port, ATMEL_US_IDR, atmel_port->tx_done_mask);
  
         if (atmel_uart_is_half_duplex(port))
-               atmel_start_rx(port);
+               if (!atomic_read(&atmel_port->tasklet_shutdown))
+                       atmel_start_rx(port);
  
  }
  
diff --git a/drivers/tty/serial/cpm_uart/cpm_uart_core.c b/drivers/tty/serial/cpm_uart/cpm_uart_core.c

index 19d5a4cf29a62e1685d1051a3cb4ada1d6fdcf7f..d4b81b06e0cbf79b9e52413d528023133a1bd053 100644 (file)
--- a/drivers/tty/serial/cpm_uart/cpm_uart_core.c
+++ b/drivers/tty/serial/cpm_uart/cpm_uart_core.c
@@ -1373,6 +1373,7 @@ static struct console cpm_scc_uart_console = {
  
  static int __init cpm_uart_console_init(void)
  {
+       cpm_muram_init();
         register_console(&cpm_scc_uart_console);
         return 0;
  }
diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c

index 0c6c63166250d73890b020e67ac6653d3c7ae114..d337782b36486c86db6084c24975dbe0ffbf8bdc 100644 (file)
--- a/drivers/tty/serial/imx.c
+++ b/drivers/tty/serial/imx.c
@@ -599,7 +599,7 @@ static void imx_uart_dma_tx(struct imx_port *sport)
  
         sport->tx_bytes = uart_circ_chars_pending(xmit);
  
-       if (xmit->tail < xmit->head) {
+       if (xmit->tail < xmit->head || xmit->head == 0) {
                 sport->dma_tx_nents = 1;
                 sg_init_one(sgl, xmit->buf + xmit->tail, sport->tx_bytes);
         } else {
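
Note on the imx.c hunk above: the extra `xmit->head == 0` test covers the case where the pending data runs exactly to the end of the circular buffer, so it is still one contiguous region and needs a single scatterlist entry. The following is a minimal userspace sketch of that decision, not the driver code. The names `struct ring`, `RING_SIZE`, `ring_pending` and `ring_dma_segments` are hypothetical, and the empty-buffer check is an addition for the sketch only.

```c
#include <stddef.h>

#define RING_SIZE 4096u  /* assumed power-of-two ring size */

/* Hypothetical stand-in for a circular transmit buffer: indices only. */
struct ring {
	size_t head;  /* producer writes here */
	size_t tail;  /* consumer (DMA) reads from here */
};

/* Bytes waiting to be sent; valid for power-of-two ring sizes. */
static size_t ring_pending(const struct ring *r)
{
	return (r->head - r->tail) & (RING_SIZE - 1);
}

/* Number of contiguous segments needed to describe the pending data. */
static int ring_dma_segments(const struct ring *r)
{
	if (ring_pending(r) == 0)
		return 0;  /* nothing to send (check added for this sketch) */
	if (r->tail < r->head || r->head == 0)
		return 1;  /* data does not wrap past the end of the buffer */
	return 2;          /* tail..end plus start..head */
}
```

With `head == 0` and `tail == 4000`, the 96 pending bytes sit at the very end of the buffer, so one segment is enough. Without the `head == 0` test, that case would be treated as wrapped and given a second, zero-length segment.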
diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c

index 191abb18fc2a7e9208f69c1896c03106b716c270..0bd1684cabb390dff5ecd854e9553690408fac6b 100644 (file)
--- a/drivers/tty/serial/qcom_geni_serial.c
+++ b/drivers/tty/serial/qcom_geni_serial.c
@@ -129,6 +129,7 @@ static int handle_rx_console(struct uart_port *uport, u32 bytes, bool drop);
  static int handle_rx_uart(struct uart_port *uport, u32 bytes, bool drop);
  static unsigned int qcom_geni_serial_tx_empty(struct uart_port *port);
  static void qcom_geni_serial_stop_rx(struct uart_port *uport);
+static void qcom_geni_serial_handle_rx(struct uart_port *uport, bool drop);
  
  static const unsigned long root_freq[] = {7372800, 14745600, 19200000, 29491200,
                                         32000000, 48000000, 64000000, 80000000,
@@ -599,7 +600,7 @@ static void qcom_geni_serial_stop_rx(struct uart_port *uport)
         u32 irq_en;
         u32 status;
         struct qcom_geni_serial_port *port = to_dev_port(uport, uport);
-       u32 irq_clear = S_CMD_DONE_EN;
+       u32 s_irq_status;
  
         irq_en = readl(uport->membase + SE_GENI_S_IRQ_EN);
         irq_en &= ~(S_RX_FIFO_WATERMARK_EN | S_RX_FIFO_LAST_EN);
@@ -615,10 +616,19 @@ static void qcom_geni_serial_stop_rx(struct uart_port *uport)
                 return;
  
         geni_se_cancel_s_cmd(&port->se);
-       qcom_geni_serial_poll_bit(uport, SE_GENI_S_CMD_CTRL_REG,
-                                       S_GENI_CMD_CANCEL, false);
+       qcom_geni_serial_poll_bit(uport, SE_GENI_S_IRQ_STATUS,
+                                       S_CMD_CANCEL_EN, true);
+       /*
+        * If timeout occurs secondary engine remains active
+        * and Abort sequence is executed.
+        */
+       s_irq_status = readl(uport->membase + SE_GENI_S_IRQ_STATUS);
+       /* Flush the Rx buffer */
+       if (s_irq_status & S_RX_FIFO_LAST_EN)
+               qcom_geni_serial_handle_rx(uport, true);
+       writel(s_irq_status, uport->membase + SE_GENI_S_IRQ_CLEAR);
+
         status = readl(uport->membase + SE_GENI_STATUS);
-       writel(irq_clear, uport->membase + SE_GENI_S_IRQ_CLEAR);
         if (status & S_GENI_CMD_ACTIVE)
                 qcom_geni_serial_abort_rx(uport);
  }
diff --git a/drivers/tty/serial/serial-tegra.c b/drivers/tty/serial/serial-tegra.c

index 33034b852a51fa52238d7d86c0ea7329170e967f..8de8bac9c6c7200fd84530645d79d4d7b6b80eb7 100644 (file)
--- a/drivers/tty/serial/serial-tegra.c
+++ b/drivers/tty/serial/serial-tegra.c
@@ -692,11 +692,22 @@ static void tegra_uart_copy_rx_to_tty(struct tegra_uart_port *tup,
                                    count, DMA_TO_DEVICE);
  }
  
+static void do_handle_rx_pio(struct tegra_uart_port *tup)
+{
+       struct tty_struct *tty = tty_port_tty_get(&tup->uport.state->port);
+       struct tty_port *port = &tup->uport.state->port;
+
+       tegra_uart_handle_rx_pio(tup, port);
+       if (tty) {
+               tty_flip_buffer_push(port);
+               tty_kref_put(tty);
+       }
+}
+
  static void tegra_uart_rx_buffer_push(struct tegra_uart_port *tup,
                                       unsigned int residue)
  {
         struct tty_port *port = &tup->uport.state->port;
-       struct tty_struct *tty = tty_port_tty_get(port);
         unsigned int count;
  
         async_tx_ack(tup->rx_dma_desc);
@@ -705,11 +716,7 @@ static void tegra_uart_rx_buffer_push(struct tegra_uart_port *tup,
         /* If we are here, DMA is stopped */
         tegra_uart_copy_rx_to_tty(tup, port, count);
  
-       tegra_uart_handle_rx_pio(tup, port);
-       if (tty) {
-               tty_flip_buffer_push(port);
-               tty_kref_put(tty);
-       }
+       do_handle_rx_pio(tup);
  }
  
  static void tegra_uart_rx_dma_complete(void *args)
@@ -749,8 +756,10 @@ static void tegra_uart_terminate_rx_dma(struct tegra_uart_port *tup)
  {
         struct dma_tx_state state;
  
-       if (!tup->rx_dma_active)
+       if (!tup->rx_dma_active) {
+               do_handle_rx_pio(tup);
                 return;
+       }
  
         dmaengine_terminate_all(tup->rx_dma_chan);
         dmaengine_tx_status(tup->rx_dma_chan, tup->rx_cookie, &state);
@@ -816,18 +825,6 @@ static void tegra_uart_handle_modem_signal_change(struct uart_port *u)
                 uart_handle_cts_change(&tup->uport, msr & UART_MSR_CTS);
  }
  
-static void do_handle_rx_pio(struct tegra_uart_port *tup)
-{
-       struct tty_struct *tty = tty_port_tty_get(&tup->uport.state->port);
-       struct tty_port *port = &tup->uport.state->port;
-
-       tegra_uart_handle_rx_pio(tup, port);
-       if (tty) {
-               tty_flip_buffer_push(port);
-               tty_kref_put(tty);
-       }
-}
-
  static irqreturn_t tegra_uart_isr(int irq, void *data)
  {
         struct tegra_uart_port *tup = data;
diff --git a/drivers/tty/tty_port.c b/drivers/tty/tty_port.c

index 044c3cbdcfa40664497d13bd00e607584eff99c7..ea80bf872f543c2883fd2fb86e66805f4f44b639 100644 (file)
--- a/drivers/tty/tty_port.c
+++ b/drivers/tty/tty_port.c
@@ -52,10 +52,11 @@ static void tty_port_default_wakeup(struct tty_port *port)
         }
  }
  
-static const struct tty_port_client_operations default_client_ops = {
+const struct tty_port_client_operations tty_port_default_client_ops = {
         .receive_buf = tty_port_default_receive_buf,
         .write_wakeup = tty_port_default_wakeup,
  };
+EXPORT_SYMBOL_GPL(tty_port_default_client_ops);
  
  void tty_port_init(struct tty_port *port)
  {
@@ -68,7 +69,7 @@ void tty_port_init(struct tty_port *port)
         spin_lock_init(&port->lock);
         port->close_delay = (50 * HZ) / 100;
         port->closing_wait = (3000 * HZ) / 100;
-       port->client_ops = &default_client_ops;
+       port->client_ops = &tty_port_default_client_ops;
         kref_init(&port->kref);
  }
  EXPORT_SYMBOL(tty_port_init);
diff --git a/drivers/tty/vt/selection.c b/drivers/tty/vt/selection.c

index 78732feaf65bc237ccf26fb4685198e957e6a1c1..0c50d7410b31e903139133b96260226f4f7234df 100644 (file)
--- a/drivers/tty/vt/selection.c
+++ b/drivers/tty/vt/selection.c
@@ -16,6 +16,7 @@
  #include <linux/tty.h>
  #include <linux/sched.h>
  #include <linux/mm.h>
+#include <linux/mutex.h>
  #include <linux/slab.h>
  #include <linux/types.h>
  
@@ -29,6 +30,8 @@
  #include <linux/console.h>
  #include <linux/tty_flip.h>
  
+#include <linux/sched/signal.h>
+
  /* Don't take this from <ctype.h>: 011-015 on the screen aren't spaces */
  #define isspace(c)     ((c) == ' ')
  
@@ -43,6 +46,7 @@ static volatile int sel_start = -1;   /* cleared by clear_selection */
  static int sel_end;
  static int sel_buffer_lth;
  static char *sel_buffer;
+static DEFINE_MUTEX(sel_lock);
  
  /* clear_selection, highlight and highlight_pointer can be called
     from interrupt (via scrollback/front) */
@@ -184,7 +188,7 @@ int set_selection_kernel(struct tiocl_selection *v, struct tty_struct *tty)
         char *bp, *obp;
         int i, ps, pe, multiplier;
         u32 c;
-       int mode;
+       int mode, ret = 0;
  
         poke_blanked_console();
  
@@ -210,6 +214,7 @@ int set_selection_kernel(struct tiocl_selection *v, struct tty_struct *tty)
         if (ps > pe)    /* make sel_start <= sel_end */
                 swap(ps, pe);
  
+       mutex_lock(&sel_lock);
         if (sel_cons != vc_cons[fg_console].d) {
                 clear_selection();
                 sel_cons = vc_cons[fg_console].d;
@@ -255,9 +260,10 @@ int set_selection_kernel(struct tiocl_selection *v, struct tty_struct *tty)
                         break;
                 case TIOCL_SELPOINTER:
                         highlight_pointer(pe);
-                       return 0;
+                       goto unlock;
                 default:
-                       return -EINVAL;
+                       ret = -EINVAL;
+                       goto unlock;
         }
  
         /* remove the pointer */
@@ -279,7 +285,7 @@ int set_selection_kernel(struct tiocl_selection *v, struct tty_struct *tty)
         else if (new_sel_start == sel_start)
         {
                 if (new_sel_end == sel_end)     /* no action required */
-                       return 0;
+                       goto unlock;
                 else if (new_sel_end > sel_end) /* extend to right */
                         highlight(sel_end + 2, new_sel_end);
                 else                            /* contract from right */
@@ -307,7 +313,8 @@ int set_selection_kernel(struct tiocl_selection *v, struct tty_struct *tty)
         if (!bp) {
                 printk(KERN_WARNING "selection: kmalloc() failed\n");
                 clear_selection();
-               return -ENOMEM;
+               ret = -ENOMEM;
+               goto unlock;
         }
         kfree(sel_buffer);
         sel_buffer = bp;
@@ -332,7 +339,9 @@ int set_selection_kernel(struct tiocl_selection *v, struct tty_struct *tty)
                 }
         }
         sel_buffer_lth = bp - sel_buffer;
-       return 0;
+unlock:
+       mutex_unlock(&sel_lock);
+       return ret;
  }
  EXPORT_SYMBOL_GPL(set_selection_kernel);
  
@@ -350,6 +359,7 @@ int paste_selection(struct tty_struct *tty)
         unsigned int count;
         struct  tty_ldisc *ld;
         DECLARE_WAITQUEUE(wait, current);
+       int ret = 0;
  
         console_lock();
         poke_blanked_console();
@@ -361,10 +371,17 @@ int paste_selection(struct tty_struct *tty)
         tty_buffer_lock_exclusive(&vc->port);
  
         add_wait_queue(&vc->paste_wait, &wait);
+       mutex_lock(&sel_lock);
         while (sel_buffer && sel_buffer_lth > pasted) {
                 set_current_state(TASK_INTERRUPTIBLE);
+               if (signal_pending(current)) {
+                       ret = -EINTR;
+                       break;
+               }
                 if (tty_throttled(tty)) {
+                       mutex_unlock(&sel_lock);
                         schedule();
+                       mutex_lock(&sel_lock);
                         continue;
                 }
                 __set_current_state(TASK_RUNNING);
@@ -373,11 +390,12 @@ int paste_selection(struct tty_struct *tty)
                                               count);
                 pasted += count;
         }
+       mutex_unlock(&sel_lock);
         remove_wait_queue(&vc->paste_wait, &wait);
         __set_current_state(TASK_RUNNING);
  
         tty_buffer_unlock_exclusive(&vc->port);
         tty_ldisc_deref(ld);
-       return 0;
+       return ret;
  }
  EXPORT_SYMBOL_GPL(paste_selection);
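
Note on the selection.c hunks above: every early `return` in `set_selection_kernel()` becomes `goto unlock`, so `sel_lock` is released on all exit paths. The following is a minimal userspace sketch of the same single-exit pattern, using a pthread mutex in place of the kernel mutex. The function `set_selection` and its arguments are hypothetical and only illustrate the control flow.

```c
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

static pthread_mutex_t sel_lock = PTHREAD_MUTEX_INITIALIZER;
static char *sel_buffer;       /* protected by sel_lock */
static size_t sel_buffer_len;  /* protected by sel_lock */

/* Hypothetical analogue: replace the shared buffer while holding the lock. */
static int set_selection(const char *text, size_t len)
{
	char *bp;
	int ret = 0;

	pthread_mutex_lock(&sel_lock);
	if (len == 0) {
		ret = -EINVAL;
		goto unlock;  /* never return with the lock held */
	}
	bp = malloc(len);
	if (!bp) {
		ret = -ENOMEM;
		goto unlock;
	}
	memcpy(bp, text, len);
	free(sel_buffer);
	sel_buffer = bp;
	sel_buffer_len = len;
unlock:
	pthread_mutex_unlock(&sel_lock);
	return ret;
}
```

Routing every path through one label makes a forgotten unlock on an error path visible in review.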
diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c

index 35d21cdb60d0b82b0c786c235b7b873409efc774..0cfbb7182b5a592aa1519a033ce92f591066c2bb 100644 (file)
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -936,10 +936,21 @@ static void flush_scrollback(struct vc_data *vc)
         WARN_CONSOLE_UNLOCKED();
  
         set_origin(vc);
-       if (vc->vc_sw->con_flush_scrollback)
+       if (vc->vc_sw->con_flush_scrollback) {
                 vc->vc_sw->con_flush_scrollback(vc);
-       else
+       } else if (con_is_visible(vc)) {
+               /*
+                * When no con_flush_scrollback method is provided then the
+                * legacy way for flushing the scrollback buffer is to use
+                * a side effect of the con_switch method. We do it only on
+                * the foreground console as background consoles have no
+                * scrollback buffers in that case and we obviously don't
+                * want to switch to them.
+                */
+               hide_cursor(vc);
                 vc->vc_sw->con_switch(vc);
+               set_cursor(vc);
+       }
  }
  
  /*
diff --git a/drivers/tty/vt/vt_ioctl.c b/drivers/tty/vt/vt_ioctl.c

index 8b0ed139592f95bf4c42fdfbd88273dec874372e..ee6c91ef1f6cf726b8b50f40128fe1dca8effabd 100644 (file)
--- a/drivers/tty/vt/vt_ioctl.c
+++ b/drivers/tty/vt/vt_ioctl.c
@@ -876,15 +876,20 @@ int vt_ioctl(struct tty_struct *tty,
                         return -EINVAL;
  
                 for (i = 0; i < MAX_NR_CONSOLES; i++) {
+                       struct vc_data *vcp;
+
                         if (!vc_cons[i].d)
                                 continue;
                         console_lock();
-                       if (v.v_vlin)
-                               vc_cons[i].d->vc_scan_lines = v.v_vlin;
-                       if (v.v_clin)
-                               vc_cons[i].d->vc_font.height = v.v_clin;
-                       vc_cons[i].d->vc_resize_user = 1;
-                       vc_resize(vc_cons[i].d, v.v_cols, v.v_rows);
+                       vcp = vc_cons[i].d;
+                       if (vcp) {
+                               if (v.v_vlin)
+                                       vcp->vc_scan_lines = v.v_vlin;
+                               if (v.v_clin)
+                                       vcp->vc_font.height = v.v_clin;
+                               vcp->vc_resize_user = 1;
+                               vc_resize(vcp, v.v_cols, v.v_rows);
+                       }
                         console_unlock();
                 }
                 break;
diff --git a/drivers/usb/core/config.c b/drivers/usb/core/config.c

index 26bc05e48d8a7414121dd348e7dfc06e6916cedc..b7918f6954344321412f550e262b6183c0d2a292 100644 (file)
--- a/drivers/usb/core/config.c
+++ b/drivers/usb/core/config.c
@@ -256,6 +256,7 @@ static int usb_parse_endpoint(struct device *ddev, int cfgno,
                 struct usb_host_interface *ifp, int num_ep,
                 unsigned char *buffer, int size)
  {
+       struct usb_device *udev = to_usb_device(ddev);
         unsigned char *buffer0 = buffer;
         struct usb_endpoint_descriptor *d;
         struct usb_host_endpoint *endpoint;
@@ -297,6 +298,16 @@ static int usb_parse_endpoint(struct device *ddev, int cfgno,
                 goto skip_to_next_endpoint_or_interface_descriptor;
         }
  
+       /* Ignore blacklisted endpoints */
+       if (udev->quirks & USB_QUIRK_ENDPOINT_BLACKLIST) {
+               if (usb_endpoint_is_blacklisted(udev, ifp, d)) {
+                       dev_warn(ddev, "config %d interface %d altsetting %d has a blacklisted endpoint with address 0x%X, skipping\n",
+                                       cfgno, inum, asnum,
+                                       d->bEndpointAddress);
+                       goto skip_to_next_endpoint_or_interface_descriptor;
+               }
+       }
+
         endpoint = &ifp->endpoint[ifp->desc.bNumEndpoints];
         ++ifp->desc.bNumEndpoints;
  
@@ -311,7 +322,7 @@ static int usb_parse_endpoint(struct device *ddev, int cfgno,
         j = 255;
         if (usb_endpoint_xfer_int(d)) {
                 i = 1;
-               switch (to_usb_device(ddev)->speed) {
+               switch (udev->speed) {
                 case USB_SPEED_SUPER_PLUS:
                 case USB_SPEED_SUPER:
                 case USB_SPEED_HIGH:
@@ -332,8 +343,7 @@ static int usb_parse_endpoint(struct device *ddev, int cfgno,
                         /*
                          * This quirk fixes bIntervals reported in ms.
                          */
-                       if (to_usb_device(ddev)->quirks &
-                               USB_QUIRK_LINEAR_FRAME_INTR_BINTERVAL) {
+                       if (udev->quirks & USB_QUIRK_LINEAR_FRAME_INTR_BINTERVAL) {
                                 n = clamp(fls(d->bInterval) + 3, i, j);
                                 i = j = n;
                         }
@@ -341,8 +351,7 @@ static int usb_parse_endpoint(struct device *ddev, int cfgno,
                          * This quirk fixes bIntervals reported in
                          * linear microframes.
                          */
-                       if (to_usb_device(ddev)->quirks &
-                               USB_QUIRK_LINEAR_UFRAME_INTR_BINTERVAL) {
+                       if (udev->quirks & USB_QUIRK_LINEAR_UFRAME_INTR_BINTERVAL) {
                                 n = clamp(fls(d->bInterval), i, j);
                                 i = j = n;
                         }
@@ -359,7 +368,7 @@ static int usb_parse_endpoint(struct device *ddev, int cfgno,
         } else if (usb_endpoint_xfer_isoc(d)) {
                 i = 1;
                 j = 16;
-               switch (to_usb_device(ddev)->speed) {
+               switch (udev->speed) {
                 case USB_SPEED_HIGH:
                         n = 7;          /* 8 ms = 2^(7-1) uframes */
                         break;
@@ -381,8 +390,7 @@ static int usb_parse_endpoint(struct device *ddev, int cfgno,
          * explicitly forbidden by the USB spec.  In an attempt to make
          * them usable, we will try treating them as Interrupt endpoints.
          */
-       if (to_usb_device(ddev)->speed == USB_SPEED_LOW &&
-                       usb_endpoint_xfer_bulk(d)) {
+       if (udev->speed == USB_SPEED_LOW && usb_endpoint_xfer_bulk(d)) {
                 dev_warn(ddev, "config %d interface %d altsetting %d "
                     "endpoint 0x%X is Bulk; changing to Interrupt\n",
                     cfgno, inum, asnum, d->bEndpointAddress);
@@ -406,7 +414,7 @@ static int usb_parse_endpoint(struct device *ddev, int cfgno,
  
         /* Find the highest legal maxpacket size for this endpoint */
         i = 0;          /* additional transactions per microframe */
-       switch (to_usb_device(ddev)->speed) {
+       switch (udev->speed) {
         case USB_SPEED_LOW:
                 maxpacket_maxes = low_speed_maxpacket_maxes;
                 break;
@@ -442,8 +450,7 @@ static int usb_parse_endpoint(struct device *ddev, int cfgno,
          * maxpacket sizes other than 512.  High speed HCDs may not
          * be able to handle that particular bug, so let's warn...
          */
-       if (to_usb_device(ddev)->speed == USB_SPEED_HIGH
-                       && usb_endpoint_xfer_bulk(d)) {
+       if (udev->speed == USB_SPEED_HIGH && usb_endpoint_xfer_bulk(d)) {
                 if (maxp != 512)
                         dev_warn(ddev, "config %d interface %d altsetting %d "
                                 "bulk endpoint 0x%X has invalid maxpacket %d\n",
@@ -452,7 +459,7 @@ static int usb_parse_endpoint(struct device *ddev, int cfgno,
         }
  
         /* Parse a possible SuperSpeed endpoint companion descriptor */
-       if (to_usb_device(ddev)->speed >= USB_SPEED_SUPER)
+       if (udev->speed >= USB_SPEED_SUPER)
                 usb_parse_ss_endpoint_companion(ddev, cfgno,
                                 inum, asnum, endpoint, buffer, size);
  
diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c

index 3405b146edc94f3e6fa153a769fcccccb7ee4465..1d212f82c69b45f545de3a0aba130b551b9b6177 100644 (file)
--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -38,7 +38,9 @@
  #include "otg_whitelist.h"
  
  #define USB_VENDOR_GENESYS_LOGIC               0x05e3
+#define USB_VENDOR_SMSC                                0x0424
  #define HUB_QUIRK_CHECK_PORT_AUTOSUSPEND       0x01
+#define HUB_QUIRK_DISABLE_AUTOSUSPEND          0x02
  
  #define USB_TP_TRANSMISSION_DELAY      40      /* ns */
  #define USB_TP_TRANSMISSION_DELAY_MAX  65535   /* ns */
@@ -1217,11 +1219,6 @@ static void hub_activate(struct usb_hub *hub, enum hub_activation_type type)
  #ifdef CONFIG_PM
                         udev->reset_resume = 1;
  #endif
-                       /* Don't set the change_bits when the device
-                        * was powered off.
-                        */
-                       if (test_bit(port1, hub->power_bits))
-                               set_bit(port1, hub->change_bits);
  
                 } else {
                         /* The power session is gone; tell hub_wq */
@@ -1731,6 +1728,10 @@ static void hub_disconnect(struct usb_interface *intf)
         kfree(hub->buffer);
  
         pm_suspend_ignore_children(&intf->dev, false);
+
+       if (hub->quirk_disable_autosuspend)
+               usb_autopm_put_interface(intf);
+
         kref_put(&hub->kref, hub_release);
  }
  
@@ -1863,6 +1864,11 @@ static int hub_probe(struct usb_interface *intf, const struct usb_device_id *id)
         if (id->driver_info & HUB_QUIRK_CHECK_PORT_AUTOSUSPEND)
                 hub->quirk_check_port_auto_suspend = 1;
  
+       if (id->driver_info & HUB_QUIRK_DISABLE_AUTOSUSPEND) {
+               hub->quirk_disable_autosuspend = 1;
+               usb_autopm_get_interface(intf);
+       }
+
         if (hub_configure(hub, &desc->endpoint[0].desc) >= 0)
                 return 0;
  
@@ -5599,6 +5605,10 @@ out_hdev_lock:
  }
  
  static const struct usb_device_id hub_id_table[] = {
+    { .match_flags = USB_DEVICE_ID_MATCH_VENDOR | USB_DEVICE_ID_MATCH_INT_CLASS,
+      .idVendor = USB_VENDOR_SMSC,
+      .bInterfaceClass = USB_CLASS_HUB,
+      .driver_info = HUB_QUIRK_DISABLE_AUTOSUSPEND},
      { .match_flags = USB_DEVICE_ID_MATCH_VENDOR
                         | USB_DEVICE_ID_MATCH_INT_CLASS,
        .idVendor = USB_VENDOR_GENESYS_LOGIC,
diff --git a/drivers/usb/core/hub.h b/drivers/usb/core/hub.h

index a9e24e4b8df146b30bfab2c8857a713ff6da7c97..a97dd1ba964ee5086a57972906fcc2a7271f5cb7 100644 (file)
--- a/drivers/usb/core/hub.h
+++ b/drivers/usb/core/hub.h
@@ -61,6 +61,7 @@ struct usb_hub {
         unsigned                quiescing:1;
         unsigned                disconnected:1;
         unsigned                in_reset:1;
+       unsigned                quirk_disable_autosuspend:1;
  
         unsigned                quirk_check_port_auto_suspend:1;
  
diff --git a/drivers/usb/core/quirks.c b/drivers/usb/core/quirks.c

index 6b6413073584339ba9e271e17ffa63fc4f883aa4..2b24336a72e564661f6f37d1d30e0b3c4abc2070 100644 (file)
--- a/drivers/usb/core/quirks.c
+++ b/drivers/usb/core/quirks.c
@@ -354,6 +354,10 @@ static const struct usb_device_id usb_quirk_list[] = {
         { USB_DEVICE(0x0904, 0x6103), .driver_info =
                         USB_QUIRK_LINEAR_FRAME_INTR_BINTERVAL },
  
+       /* Sound Devices USBPre2 */
+       { USB_DEVICE(0x0926, 0x0202), .driver_info =
+                       USB_QUIRK_ENDPOINT_BLACKLIST },
+
         /* Keytouch QWERTY Panel keyboard */
         { USB_DEVICE(0x0926, 0x3333), .driver_info =
                         USB_QUIRK_CONFIG_INTF_STRINGS },
@@ -445,6 +449,9 @@ static const struct usb_device_id usb_quirk_list[] = {
         /* INTEL VALUE SSD */
         { USB_DEVICE(0x8086, 0xf1a5), .driver_info = USB_QUIRK_RESET_RESUME },
  
+       /* novation SoundControl XL */
+       { USB_DEVICE(0x1235, 0x0061), .driver_info = USB_QUIRK_RESET_RESUME },
+
         { }  /* terminating entry must be last */
  };
  
@@ -472,6 +479,39 @@ static const struct usb_device_id usb_amd_resume_quirk_list[] = {
         { }  /* terminating entry must be last */
  };
  
+/*
+ * Entries for blacklisted endpoints that should be ignored when parsing
+ * configuration descriptors.
+ *
+ * Matched for devices with USB_QUIRK_ENDPOINT_BLACKLIST.
+ */
+static const struct usb_device_id usb_endpoint_blacklist[] = {
+       { USB_DEVICE_INTERFACE_NUMBER(0x0926, 0x0202, 1), .driver_info = 0x85 },
+       { }
+};
+
+bool usb_endpoint_is_blacklisted(struct usb_device *udev,
+               struct usb_host_interface *intf,
+               struct usb_endpoint_descriptor *epd)
+{
+       const struct usb_device_id *id;
+       unsigned int address;
+
+       for (id = usb_endpoint_blacklist; id->match_flags; ++id) {
+               if (!usb_match_device(udev, id))
+                       continue;
+
+               if (!usb_match_one_id_intf(udev, intf, id))
+                       continue;
+
+               address = id->driver_info;
+               if (address == epd->bEndpointAddress)
+                       return true;
+       }
+
+       return false;
+}
+
  static bool usb_match_any_interface(struct usb_device *udev,
                                     const struct usb_device_id *id)
  {
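
Note on the quirks.c hunk above: `usb_endpoint_is_blacklisted()` walks a table ending in an empty entry, matches the device and the interface, then compares the endpoint address stored in `driver_info`. The following is a simplified sketch of that lookup. `struct ep_blacklist_entry`, its `valid` flag and `ep_is_blacklisted` are hypothetical; the kernel uses `struct usb_device_id` with `match_flags` instead.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified table entry. */
struct ep_blacklist_entry {
	uint16_t vendor;
	uint16_t product;
	uint8_t  interface;
	uint8_t  ep_address;
	bool     valid;  /* false marks the terminating entry */
};

static const struct ep_blacklist_entry ep_blacklist[] = {
	{ 0x0926, 0x0202, 1, 0x85, true },  /* values taken from the diff */
	{ 0 }  /* terminating entry must be last */
};

static bool ep_is_blacklisted(uint16_t vendor, uint16_t product,
			      uint8_t interface, uint8_t ep_address)
{
	const struct ep_blacklist_entry *e;

	for (e = ep_blacklist; e->valid; ++e) {
		if (e->vendor != vendor || e->product != product)
			continue;  /* different device */
		if (e->interface != interface)
			continue;  /* different interface */
		if (e->ep_address == ep_address)
			return true;
	}
	return false;
}
```

Only endpoint 0x85 on interface 1 of device 0926:0202 matches; the same address on another interface or device is left alone.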
diff --git a/drivers/usb/core/usb.h b/drivers/usb/core/usb.h

index cf4783cf661a86ab521155b125961ce01e3fbf2d..3ad0ee57e859fb00990bcf733112c553b25c65b1 100644 (file)
--- a/drivers/usb/core/usb.h
+++ b/drivers/usb/core/usb.h
@@ -37,6 +37,9 @@ extern void usb_authorize_interface(struct usb_interface *);
  extern void usb_detect_quirks(struct usb_device *udev);
  extern void usb_detect_interface_quirks(struct usb_device *udev);
  extern void usb_release_quirk_list(void);
+extern bool usb_endpoint_is_blacklisted(struct usb_device *udev,
+               struct usb_host_interface *intf,
+               struct usb_endpoint_descriptor *epd);
  extern int usb_remove_device(struct usb_device *udev);
  
  extern int usb_get_device_descriptor(struct usb_device *dev,
diff --git a/drivers/usb/dwc2/gadget.c b/drivers/usb/dwc2/gadget.c

index 88f7d6d4ff2db1f81d4429cd1343e9e93c540903..92ed32ec160769783a14729a9aa31575057c22f3 100644 (file)
--- a/drivers/usb/dwc2/gadget.c
+++ b/drivers/usb/dwc2/gadget.c
@@ -1083,11 +1083,6 @@ static void dwc2_hsotg_start_req(struct dwc2_hsotg *hsotg,
         else
                 packets = 1;    /* send one packet if length is zero. */
  
-       if (hs_ep->isochronous && length > (hs_ep->mc * hs_ep->ep.maxpacket)) {
-               dev_err(hsotg->dev, "req length > maxpacket*mc\n");
-               return;
-       }
-
         if (dir_in && index != 0)
                 if (hs_ep->isochronous)
                         epsize = DXEPTSIZ_MC(packets);
@@ -1391,6 +1386,13 @@ static int dwc2_hsotg_ep_queue(struct usb_ep *ep, struct usb_request *req,
         req->actual = 0;
         req->status = -EINPROGRESS;
  
+       /* Don't queue ISOC request if length greater than mps*mc */
+       if (hs_ep->isochronous &&
+           req->length > (hs_ep->mc * hs_ep->ep.maxpacket)) {
+               dev_err(hs->dev, "req length > maxpacket*mc\n");
+               return -EINVAL;
+       }
+
         /* In DDMA mode for ISOC's don't queue request if length greater
          * than descriptor limits.
          */
@@ -1632,6 +1634,7 @@ static int dwc2_hsotg_process_req_status(struct dwc2_hsotg *hsotg,
         struct dwc2_hsotg_ep *ep0 = hsotg->eps_out[0];
         struct dwc2_hsotg_ep *ep;
         __le16 reply;
+       u16 status;
         int ret;
  
         dev_dbg(hsotg->dev, "%s: USB_REQ_GET_STATUS\n", __func__);
@@ -1643,11 +1646,10 @@ static int dwc2_hsotg_process_req_status(struct dwc2_hsotg *hsotg,
  
         switch (ctrl->bRequestType & USB_RECIP_MASK) {
         case USB_RECIP_DEVICE:
-               /*
-                * bit 0 => self powered
-                * bit 1 => remote wakeup
-                */
-               reply = cpu_to_le16(0);
+               status = 1 << USB_DEVICE_SELF_POWERED;
+               status |= hsotg->remote_wakeup_allowed <<
+                         USB_DEVICE_REMOTE_WAKEUP;
+               reply = cpu_to_le16(status);
                 break;
  
         case USB_RECIP_INTERFACE:
@@ -1758,7 +1760,10 @@ static int dwc2_hsotg_process_req_feature(struct dwc2_hsotg *hsotg,
         case USB_RECIP_DEVICE:
                 switch (wValue) {
                 case USB_DEVICE_REMOTE_WAKEUP:
-                       hsotg->remote_wakeup_allowed = 1;
+                       if (set)
+                               hsotg->remote_wakeup_allowed = 1;
+                       else
+                               hsotg->remote_wakeup_allowed = 0;
                         break;
  
                 case USB_DEVICE_TEST_MODE:
@@ -1768,16 +1773,17 @@ static int dwc2_hsotg_process_req_feature(struct dwc2_hsotg *hsotg,
                                 return -EINVAL;
  
                         hsotg->test_mode = wIndex >> 8;
-                       ret = dwc2_hsotg_send_reply(hsotg, ep0, NULL, 0);
-                       if (ret) {
-                               dev_err(hsotg->dev,
-                                       "%s: failed to send reply\n", __func__);
-                               return ret;
-                       }
                         break;
                 default:
                         return -ENOENT;
                 }
+
+               ret = dwc2_hsotg_send_reply(hsotg, ep0, NULL, 0);
+               if (ret) {
+                       dev_err(hsotg->dev,
+                               "%s: failed to send reply\n", __func__);
+                       return ret;
+               }
                 break;
  
         case USB_RECIP_ENDPOINT:
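
Note on the dwc2 GET_STATUS hunk above: the device-recipient reply now reports bit 0 (self powered) as always set and bit 1 (remote wakeup) from the current feature state, instead of returning 0. The following is a minimal userspace sketch of how that 16-bit word is built and laid out little-endian on the wire. The helpers `device_status` and `put_le16` are hypothetical, and the two bit positions are redefined locally so the sketch compiles on its own.

```c
#include <stdint.h>

/* Bit positions in the GET_STATUS device reply, redefined for this sketch. */
#define USB_DEVICE_SELF_POWERED   0
#define USB_DEVICE_REMOTE_WAKEUP  1

/* Build the status word the same way the patched code does. */
static uint16_t device_status(unsigned int remote_wakeup_allowed)
{
	uint16_t status;

	status = 1 << USB_DEVICE_SELF_POWERED;
	status |= (remote_wakeup_allowed & 1u) << USB_DEVICE_REMOTE_WAKEUP;
	return status;
}

/* Store a 16-bit value in little-endian byte order, as sent on the bus. */
static void put_le16(uint16_t v, uint8_t out[2])
{
	out[0] = (uint8_t)(v & 0xff);
	out[1] = (uint8_t)(v >> 8);
}
```

Because the SET_FEATURE and CLEAR_FEATURE handling now sets and clears `remote_wakeup_allowed`, a later GET_STATUS reflects the host's most recent request.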
diff --git a/drivers/usb/dwc3/debug.h b/drivers/usb/dwc3/debug.h

index e56beb9d1e36c83f002052ed8ef85f75786d0b28..4a13ceaf40935abfa909495050ef37df0b40f46b 100644 (file)
--- a/drivers/usb/dwc3/debug.h
+++ b/drivers/usb/dwc3/debug.h
@@ -256,86 +256,77 @@ static inline const char *dwc3_ep_event_string(char *str, size_t size,
         u8 epnum = event->endpoint_number;
         size_t len;
         int status;
-       int ret;
  
-       ret = snprintf(str, size, "ep%d%s: ", epnum >> 1,
+       len = scnprintf(str, size, "ep%d%s: ", epnum >> 1,
                         (epnum & 1) ? "in" : "out");
-       if (ret < 0)
-               return "UNKNOWN";
  
         status = event->status;
  
         switch (event->endpoint_event) {
         case DWC3_DEPEVT_XFERCOMPLETE:
-               len = strlen(str);
-               snprintf(str + len, size - len, "Transfer Complete (%c%c%c)",
+               len += scnprintf(str + len, size - len,
+                               "Transfer Complete (%c%c%c)",
                                 status & DEPEVT_STATUS_SHORT ? 'S' : 's',
                                 status & DEPEVT_STATUS_IOC ? 'I' : 'i',
                                 status & DEPEVT_STATUS_LST ? 'L' : 'l');
  
-               len = strlen(str);
-
                 if (epnum <= 1)
-                       snprintf(str + len, size - len, " [%s]",
+                       scnprintf(str + len, size - len, " [%s]",
                                         dwc3_ep0_state_string(ep0state));
                 break;
         case DWC3_DEPEVT_XFERINPROGRESS:
-               len = strlen(str);
-
-               snprintf(str + len, size - len, "Transfer In Progress [%d] (%c%c%c)",
+               scnprintf(str + len, size - len,
+                               "Transfer In Progress [%d] (%c%c%c)",
                                 event->parameters,
                                 status & DEPEVT_STATUS_SHORT ? 'S' : 's',
                                 status & DEPEVT_STATUS_IOC ? 'I' : 'i',
                                 status & DEPEVT_STATUS_LST ? 'M' : 'm');
                 break;
         case DWC3_DEPEVT_XFERNOTREADY:
-               len = strlen(str);
-
-               snprintf(str + len, size - len, "Transfer Not Ready [%d]%s",
+               len += scnprintf(str + len, size - len,
+                               "Transfer Not Ready [%d]%s",
                                 event->parameters,
                                 status & DEPEVT_STATUS_TRANSFER_ACTIVE ?
                                 " (Active)" : " (Not Active)");
  
-               len = strlen(str);
-
                 /* Control Endpoints */
                 if (epnum <= 1) {
                         int phase = DEPEVT_STATUS_CONTROL_PHASE(event->status);
  
                         switch (phase) {
                         case DEPEVT_STATUS_CONTROL_DATA:
-                               snprintf(str + ret, size - ret,
+                               scnprintf(str + len, size - len,
                                                 " [Data Phase]");
                                 break;
                         case DEPEVT_STATUS_CONTROL_STATUS:
-                               snprintf(str + ret, size - ret,
+                               scnprintf(str + len, size - len,
                                                 " [Status Phase]");
                         }
                 }
                 break;
         case DWC3_DEPEVT_RXTXFIFOEVT:
-               snprintf(str + ret, size - ret, "FIFO");
+               scnprintf(str + len, size - len, "FIFO");
                 break;
         case DWC3_DEPEVT_STREAMEVT:
                 status = event->status;
  
                 switch (status) {
                 case DEPEVT_STREAMEVT_FOUND:
-                       snprintf(str + ret, size - ret, " Stream %d Found",
+                       scnprintf(str + len, size - len, " Stream %d Found",
                                         event->parameters);
                         break;
                 case DEPEVT_STREAMEVT_NOTFOUND:
                 default:
-                       snprintf(str + ret, size - ret, " Stream Not Found");
+                       scnprintf(str + len, size - len, " Stream Not Found");
                         break;
                 }
  
                 break;
         case DWC3_DEPEVT_EPCMDCMPLT:
-               snprintf(str + ret, size - ret, "Endpoint Command Complete");
+               scnprintf(str + len, size - len, "Endpoint Command Complete");
                 break;
         default:
-               snprintf(str, size, "UNKNOWN");
+               scnprintf(str + len, size - len, "UNKNOWN");
         }
  
         return str;
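Review note on the dwc3/debug.h hunk above: the switch from snprintf() to scnprintf() matters because C99 snprintf() returns the length that would have been written, which can exceed the buffer, whereas scnprintf() returns what was actually stored. The removed lines also appear to mix `ret` (the prefix length) with `len`, so some branches wrote at `str + ret` after more text had already been appended. A minimal userspace sketch of the accumulate-by-`len` pattern follows; `scnprintf_sketch` and `format_event_sketch` are hypothetical stand-ins written for illustration, not the kernel implementation.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical userspace stand-in for the kernel's scnprintf(): returns the
 * number of characters actually stored in buf (excluding the trailing NUL),
 * which is never more than size - 1.
 */
static int scnprintf_sketch(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int n;

	if (size == 0)
		return 0;

	va_start(args, fmt);
	n = vsnprintf(buf, size, fmt, args);
	va_end(args);

	if (n < 0) {
		buf[0] = '\0';
		return 0;
	}
	if ((size_t)n >= size)
		return (int)(size - 1);	/* output was truncated */
	return n;
}

/*
 * Build "ep<N><dir>: <event>" by advancing len with each call, the same
 * shape as the patched helper. Because len never exceeds size - 1, the
 * expression size - len cannot underflow.
 */
static const char *format_event_sketch(char *str, size_t size, int epnum,
				       const char *event)
{
	size_t len;

	assert(size > 0);

	len = (size_t)scnprintf_sketch(str, size, "ep%d%s: ", epnum >> 1,
				       (epnum & 1) ? "in" : "out");
	scnprintf_sketch(str + len, size - len, "%s", event);

	return str;
}
```

With a buffer that is too small, the result is simply a truncated, NUL-terminated string rather than a write past the end.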
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 1b8014ab0b25098a3d976c98cdf036646aac885f..1b7d2f9cb673a3f440677e309f1be827251693f7 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -2429,7 +2429,8 @@ static int dwc3_gadget_ep_reclaim_completed_trb(struct dwc3_ep *dep,
         if (event->status & DEPEVT_STATUS_SHORT && !chain)
                 return 1;
  
-       if (event->status & DEPEVT_STATUS_IOC)
+       if ((trb->ctrl & DWC3_TRB_CTRL_IOC) ||
+           (trb->ctrl & DWC3_TRB_CTRL_LST))
                 return 1;
  
         return 0;
diff --git a/drivers/usb/gadget/composite.c b/drivers/usb/gadget/composite.c
index 3b4f67000315c952073068552669f2d10cc87d5d..223f72d4d9eddf2734a4148ce21cd52a358b4102 100644
--- a/drivers/usb/gadget/composite.c
+++ b/drivers/usb/gadget/composite.c
@@ -437,12 +437,14 @@ static u8 encode_bMaxPower(enum usb_device_speed speed,
                 val = CONFIG_USB_GADGET_VBUS_DRAW;
         if (!val)
                 return 0;
-       switch (speed) {
-       case USB_SPEED_SUPER:
-               return DIV_ROUND_UP(val, 8);
-       default:
-               return DIV_ROUND_UP(val, 2);
-       }
+       if (speed < USB_SPEED_SUPER)
+               return min(val, 500U) / 2;
+       else
+               /*
+                * USB 3.x supports up to 900mA, but since 900 isn't divisible
+                * by 8 the integral division will effectively cap to 896mA.
+                */
+               return min(val, 900U) / 8;
  }
  
  static int config_buf(struct usb_configuration *config,
@@ -854,6 +856,10 @@ static int set_config(struct usb_composite_dev *cdev,
  
         /* when we return, be sure our power usage is valid */
         power = c->MaxPower ? c->MaxPower : CONFIG_USB_GADGET_VBUS_DRAW;
+       if (gadget->speed < USB_SPEED_SUPER)
+               power = min(power, 500U);
+       else
+               power = min(power, 900U);
  done:
         usb_gadget_vbus_draw(gadget, power);
         if (result >= 0 && cdev->delayed_status)
@@ -2280,7 +2286,7 @@ void composite_resume(struct usb_gadget *gadget)
  {
         struct usb_composite_dev        *cdev = get_gadget_data(gadget);
         struct usb_function             *f;
-       u16                             maxpower;
+       unsigned                        maxpower;
  
         /* REVISIT:  should we have config level
          * suspend/resume callbacks?
@@ -2294,10 +2300,14 @@ void composite_resume(struct usb_gadget *gadget)
                                 f->resume(f);
                 }
  
-               maxpower = cdev->config->MaxPower;
+               maxpower = cdev->config->MaxPower ?
+                       cdev->config->MaxPower : CONFIG_USB_GADGET_VBUS_DRAW;
+               if (gadget->speed < USB_SPEED_SUPER)
+                       maxpower = min(maxpower, 500U);
+               else
+                       maxpower = min(maxpower, 900U);
  
-               usb_gadget_vbus_draw(gadget, maxpower ?
-                       maxpower : CONFIG_USB_GADGET_VBUS_DRAW);
+               usb_gadget_vbus_draw(gadget, maxpower);
         }
  
         cdev->suspended = 0;
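Review note on the composite.c hunks above: the new encode_bMaxPower() clamps the requested current before dividing by the descriptor unit (2 mA below SuperSpeed, 8 mA at SuperSpeed and above), so the result always fits in the u8 return type. The arithmetic can be checked with a small userspace sketch; the enum and function names below are hypothetical and only mirror the shape of the patched code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical two-value stand-in for enum usb_device_speed. */
enum sketch_speed {
	SKETCH_SPEED_HIGH,
	SKETCH_SPEED_SUPER,
};

static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/*
 * Encode a current draw in mA into bMaxPower units: clamp to 500 mA and
 * divide by 2 below SuperSpeed, clamp to 900 mA and divide by 8 otherwise.
 */
static uint8_t encode_bmaxpower_sketch(enum sketch_speed speed,
				       unsigned int milliamps)
{
	assert(speed == SKETCH_SPEED_HIGH || speed == SKETCH_SPEED_SUPER);

	if (!milliamps)
		return 0;
	if (speed < SKETCH_SPEED_SUPER)
		return (uint8_t)(min_uint(milliamps, 500U) / 2);

	/* 900 / 8 truncates to 112 units, i.e. an effective cap of 896 mA. */
	return (uint8_t)(min_uint(milliamps, 900U) / 8);
}
```

For example, a request of 1000 mA at high speed encodes as 250 (500 mA), and any request of 900 mA or more at SuperSpeed encodes as 112 (896 mA).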
diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
index 6171d28331e6d7999e5afa3a2b99e42ea1ec5168..571917677d358f4f62dbf34ee7a17a9dce0755f4 100644
--- a/drivers/usb/gadget/function/f_fs.c
+++ b/drivers/usb/gadget/function/f_fs.c
@@ -1162,18 +1162,19 @@ static int ffs_aio_cancel(struct kiocb *kiocb)
  {
         struct ffs_io_data *io_data = kiocb->private;
         struct ffs_epfile *epfile = kiocb->ki_filp->private_data;
+       unsigned long flags;
         int value;
  
         ENTER();
  
-       spin_lock_irq(&epfile->ffs->eps_lock);
+       spin_lock_irqsave(&epfile->ffs->eps_lock, flags);
  
         if (likely(io_data && io_data->ep && io_data->req))
                 value = usb_ep_dequeue(io_data->ep, io_data->req);
         else
                 value = -EINVAL;
  
-       spin_unlock_irq(&epfile->ffs->eps_lock);
+       spin_unlock_irqrestore(&epfile->ffs->eps_lock, flags);
  
         return value;
  }
diff --git a/drivers/usb/gadget/function/u_audio.c b/drivers/usb/gadget/function/u_audio.c
index 6d956f190f5ac7322cbcf8a26d444aa13e1117b5..e6d32c536781247908904b7e9252901e064fad0b 100644
--- a/drivers/usb/gadget/function/u_audio.c
+++ b/drivers/usb/gadget/function/u_audio.c
@@ -361,7 +361,7 @@ int u_audio_start_capture(struct g_audio *audio_dev)
         ep = audio_dev->out_ep;
         prm = &uac->c_prm;
         config_ep_by_speed(gadget, &audio_dev->func, ep);
-       req_len = prm->max_psize;
+       req_len = ep->maxpacket;
  
         prm->ep_enabled = true;
         usb_ep_enable(ep);
@@ -379,7 +379,7 @@ int u_audio_start_capture(struct g_audio *audio_dev)
                         req->context = &prm->ureq[i];
                         req->length = req_len;
                         req->complete = u_audio_iso_complete;
-                       req->buf = prm->rbuf + i * prm->max_psize;
+                       req->buf = prm->rbuf + i * ep->maxpacket;
                 }
  
                 if (usb_ep_queue(ep, prm->ureq[i].req, GFP_ATOMIC))
@@ -430,9 +430,9 @@ int u_audio_start_playback(struct g_audio *audio_dev)
         uac->p_pktsize = min_t(unsigned int,
                                 uac->p_framesize *
                                         (params->p_srate / uac->p_interval),
-                               prm->max_psize);
+                               ep->maxpacket);
  
-       if (uac->p_pktsize < prm->max_psize)
+       if (uac->p_pktsize < ep->maxpacket)
                 uac->p_pktsize_residue = uac->p_framesize *
                         (params->p_srate % uac->p_interval);
         else
@@ -457,7 +457,7 @@ int u_audio_start_playback(struct g_audio *audio_dev)
                         req->context = &prm->ureq[i];
                         req->length = req_len;
                         req->complete = u_audio_iso_complete;
-                       req->buf = prm->rbuf + i * prm->max_psize;
+                       req->buf = prm->rbuf + i * ep->maxpacket;
                 }
  
                 if (usb_ep_queue(ep, prm->ureq[i].req, GFP_ATOMIC))
diff --git a/drivers/usb/gadget/function/u_serial.c b/drivers/usb/gadget/function/u_serial.c
index f986e5c559748d43fb598753a44ffaf46e0a081c..8167d379e115ba5ae7874478f863d0b16c84f4a4 100644
--- a/drivers/usb/gadget/function/u_serial.c
+++ b/drivers/usb/gadget/function/u_serial.c
@@ -561,8 +561,10 @@ static int gs_start_io(struct gs_port *port)
         port->n_read = 0;
         started = gs_start_rx(port);
  
-       /* unblock any pending writes into our circular buffer */
         if (started) {
+               gs_start_tx(port);
+               /* Unblock any pending writes into our circular buffer, in case
+                * we didn't in gs_start_tx() */
                 tty_wakeup(port->port.tty);
         } else {
                 gs_free_requests(ep, head, &port->read_allocated);
diff --git a/drivers/usb/gadget/udc/udc-xilinx.c b/drivers/usb/gadget/udc/udc-xilinx.c
index 29d8e5f8bb58397ec6402f22cf723e5774f6f132..b1cfc8279c3d2d00c3ee594011124e8dd46735ba 100644
--- a/drivers/usb/gadget/udc/udc-xilinx.c
+++ b/drivers/usb/gadget/udc/udc-xilinx.c
@@ -1399,7 +1399,6 @@ err:
  /**
   * xudc_stop - stops the device.
   * @gadget: pointer to the usb gadget structure
- * @driver: pointer to usb gadget driver structure
   *
   * Return: zero always
   */
diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
index 7a3a29e5e9d29d33bec4e9ff45e50f26986dbf96..af92b2576fe91c5d497ab0aa8145aeaabd9ab5c9 100644
--- a/drivers/usb/host/xhci-hub.c
+++ b/drivers/usb/host/xhci-hub.c
@@ -55,6 +55,7 @@ static u8 usb_bos_descriptor [] = {
  static int xhci_create_usb3_bos_desc(struct xhci_hcd *xhci, char *buf,
                                      u16 wLength)
  {
+       struct xhci_port_cap *port_cap = NULL;
         int i, ssa_count;
         u32 temp;
         u16 desc_size, ssp_cap_size, ssa_size = 0;
@@ -64,16 +65,24 @@ static int xhci_create_usb3_bos_desc(struct xhci_hcd *xhci, char *buf,
         ssp_cap_size = sizeof(usb_bos_descriptor) - desc_size;
  
         /* does xhci support USB 3.1 Enhanced SuperSpeed */
-       if (xhci->usb3_rhub.min_rev >= 0x01) {
+       for (i = 0; i < xhci->num_port_caps; i++) {
+               if (xhci->port_caps[i].maj_rev == 0x03 &&
+                   xhci->port_caps[i].min_rev >= 0x01) {
+                       usb3_1 = true;
+                       port_cap = &xhci->port_caps[i];
+                       break;
+               }
+       }
+
+       if (usb3_1) {
                 /* does xhci provide a PSI table for SSA speed attributes? */
-               if (xhci->usb3_rhub.psi_count) {
+               if (port_cap->psi_count) {
                         /* two SSA entries for each unique PSI ID, RX and TX */
-                       ssa_count = xhci->usb3_rhub.psi_uid_count * 2;
+                       ssa_count = port_cap->psi_uid_count * 2;
                         ssa_size = ssa_count * sizeof(u32);
                         ssp_cap_size -= 16; /* skip copying the default SSA */
                 }
                 desc_size += ssp_cap_size;
-               usb3_1 = true;
         }
         memcpy(buf, &usb_bos_descriptor, min(desc_size, wLength));
  
@@ -99,7 +108,7 @@ static int xhci_create_usb3_bos_desc(struct xhci_hcd *xhci, char *buf,
         }
  
         /* If PSI table exists, add the custom speed attributes from it */
-       if (usb3_1 && xhci->usb3_rhub.psi_count) {
+       if (usb3_1 && port_cap->psi_count) {
                 u32 ssp_cap_base, bm_attrib, psi, psi_mant, psi_exp;
                 int offset;
  
@@ -111,7 +120,7 @@ static int xhci_create_usb3_bos_desc(struct xhci_hcd *xhci, char *buf,
  
                 /* attribute count SSAC bits 4:0 and ID count SSIC bits 8:5 */
                 bm_attrib = (ssa_count - 1) & 0x1f;
-               bm_attrib |= (xhci->usb3_rhub.psi_uid_count - 1) << 5;
+               bm_attrib |= (port_cap->psi_uid_count - 1) << 5;
                 put_unaligned_le32(bm_attrib, &buf[ssp_cap_base + 4]);
  
                 if (wLength < desc_size + ssa_size)
@@ -124,8 +133,8 @@ static int xhci_create_usb3_bos_desc(struct xhci_hcd *xhci, char *buf,
                  * USB 3.1 requires two SSA entries (RX and TX) for every link
                  */
                 offset = desc_size;
-               for (i = 0; i < xhci->usb3_rhub.psi_count; i++) {
-                       psi = xhci->usb3_rhub.psi[i];
+               for (i = 0; i < port_cap->psi_count; i++) {
+                       psi = port_cap->psi[i];
                         psi &= ~USB_SSP_SUBLINK_SPEED_RSVD;
                         psi_exp = XHCI_EXT_PORT_PSIE(psi);
                         psi_mant = XHCI_EXT_PORT_PSIM(psi);
diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index 3b1388fa2f36e74093e2dc2abc0083cd650266c5..884c601bfa15f8d809a44bab42f11809a9c1e93c 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -1475,9 +1475,15 @@ int xhci_endpoint_init(struct xhci_hcd *xhci,
         /* Allow 3 retries for everything but isoc, set CErr = 3 */
         if (!usb_endpoint_xfer_isoc(&ep->desc))
                 err_count = 3;
-       /* Some devices get this wrong */
-       if (usb_endpoint_xfer_bulk(&ep->desc) && udev->speed == USB_SPEED_HIGH)
-               max_packet = 512;
+       /* HS bulk max packet should be 512, FS bulk supports 8, 16, 32 or 64 */
+       if (usb_endpoint_xfer_bulk(&ep->desc)) {
+               if (udev->speed == USB_SPEED_HIGH)
+                       max_packet = 512;
+               if (udev->speed == USB_SPEED_FULL) {
+                       max_packet = rounddown_pow_of_two(max_packet);
+                       max_packet = clamp_val(max_packet, 8, 64);
+               }
+       }
         /* xHCI 1.0 and 1.1 indicates that ctrl ep avg TRB Length should be 8 */
         if (usb_endpoint_xfer_control(&ep->desc) && xhci->hci_version >= 0x100)
                 avg_trb_len = 8;
@@ -1909,17 +1915,17 @@ no_bw:
         xhci->usb3_rhub.num_ports = 0;
         xhci->num_active_eps = 0;
         kfree(xhci->usb2_rhub.ports);
-       kfree(xhci->usb2_rhub.psi);
         kfree(xhci->usb3_rhub.ports);
-       kfree(xhci->usb3_rhub.psi);
         kfree(xhci->hw_ports);
         kfree(xhci->rh_bw);
         kfree(xhci->ext_caps);
+       for (i = 0; i < xhci->num_port_caps; i++)
+               kfree(xhci->port_caps[i].psi);
+       kfree(xhci->port_caps);
+       xhci->num_port_caps = 0;
  
         xhci->usb2_rhub.ports = NULL;
-       xhci->usb2_rhub.psi = NULL;
         xhci->usb3_rhub.ports = NULL;
-       xhci->usb3_rhub.psi = NULL;
         xhci->hw_ports = NULL;
         xhci->rh_bw = NULL;
         xhci->ext_caps = NULL;
@@ -2120,6 +2126,7 @@ static void xhci_add_in_port(struct xhci_hcd *xhci, unsigned int num_ports,
         u8 major_revision, minor_revision;
         struct xhci_hub *rhub;
         struct device *dev = xhci_to_hcd(xhci)->self.sysdev;
+       struct xhci_port_cap *port_cap;
  
         temp = readl(addr);
         major_revision = XHCI_EXT_PORT_MAJOR(temp);
@@ -2154,31 +2161,39 @@ static void xhci_add_in_port(struct xhci_hcd *xhci, unsigned int num_ports,
                 /* WTF? "Valid values are ‘1’ to MaxPorts" */
                 return;
  
-       rhub->psi_count = XHCI_EXT_PORT_PSIC(temp);
-       if (rhub->psi_count) {
-               rhub->psi = kcalloc_node(rhub->psi_count, sizeof(*rhub->psi),
-                                   GFP_KERNEL, dev_to_node(dev));
-               if (!rhub->psi)
-                       rhub->psi_count = 0;
+       port_cap = &xhci->port_caps[xhci->num_port_caps++];
+       if (xhci->num_port_caps > max_caps)
+               return;
+
+       port_cap->maj_rev = major_revision;
+       port_cap->min_rev = minor_revision;
+       port_cap->psi_count = XHCI_EXT_PORT_PSIC(temp);
  
-               rhub->psi_uid_count++;
-               for (i = 0; i < rhub->psi_count; i++) {
-                       rhub->psi[i] = readl(addr + 4 + i);
+       if (port_cap->psi_count) {
+               port_cap->psi = kcalloc_node(port_cap->psi_count,
+                                            sizeof(*port_cap->psi),
+                                            GFP_KERNEL, dev_to_node(dev));
+               if (!port_cap->psi)
+                       port_cap->psi_count = 0;
+
+               port_cap->psi_uid_count++;
+               for (i = 0; i < port_cap->psi_count; i++) {
+                       port_cap->psi[i] = readl(addr + 4 + i);
  
                         /* count unique ID values, two consecutive entries can
                          * have the same ID if link is assymetric
                          */
-                       if (i && (XHCI_EXT_PORT_PSIV(rhub->psi[i]) !=
-                                 XHCI_EXT_PORT_PSIV(rhub->psi[i - 1])))
-                               rhub->psi_uid_count++;
+                       if (i && (XHCI_EXT_PORT_PSIV(port_cap->psi[i]) !=
+                                 XHCI_EXT_PORT_PSIV(port_cap->psi[i - 1])))
+                               port_cap->psi_uid_count++;
  
                         xhci_dbg(xhci, "PSIV:%d PSIE:%d PLT:%d PFD:%d LP:%d PSIM:%d\n",
-                                 XHCI_EXT_PORT_PSIV(rhub->psi[i]),
-                                 XHCI_EXT_PORT_PSIE(rhub->psi[i]),
-                                 XHCI_EXT_PORT_PLT(rhub->psi[i]),
-                                 XHCI_EXT_PORT_PFD(rhub->psi[i]),
-                                 XHCI_EXT_PORT_LP(rhub->psi[i]),
-                                 XHCI_EXT_PORT_PSIM(rhub->psi[i]));
+                                 XHCI_EXT_PORT_PSIV(port_cap->psi[i]),
+                                 XHCI_EXT_PORT_PSIE(port_cap->psi[i]),
+                                 XHCI_EXT_PORT_PLT(port_cap->psi[i]),
+                                 XHCI_EXT_PORT_PFD(port_cap->psi[i]),
+                                 XHCI_EXT_PORT_LP(port_cap->psi[i]),
+                                 XHCI_EXT_PORT_PSIM(port_cap->psi[i]));
                 }
         }
         /* cache usb2 port capabilities */
@@ -2213,6 +2228,7 @@ static void xhci_add_in_port(struct xhci_hcd *xhci, unsigned int num_ports,
                         continue;
                 }
                 hw_port->rhub = rhub;
+               hw_port->port_cap = port_cap;
                 rhub->num_ports++;
         }
         /* FIXME: Should we disable ports not in the Extended Capabilities? */
@@ -2303,6 +2319,11 @@ static int xhci_setup_port_arrays(struct xhci_hcd *xhci, gfp_t flags)
         if (!xhci->ext_caps)
                 return -ENOMEM;
  
+       xhci->port_caps = kcalloc_node(cap_count, sizeof(*xhci->port_caps),
+                               flags, dev_to_node(dev));
+       if (!xhci->port_caps)
+               return -ENOMEM;
+
         offset = cap_start;
  
         while (offset) {
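Review note on the xhci_endpoint_init() hunk in the xhci-mem.c diff above: for full-speed bulk endpoints the reported max packet size is first rounded down to a power of two and then clamped to the range 8..64, which yields one of the four sizes the comment lists (8, 16, 32 or 64). A minimal userspace sketch of that two-step normalization follows; both helper names are hypothetical, and the round-down loop is a plain reimplementation rather than the kernel's rounddown_pow_of_two() macro.

```c
#include <assert.h>

/* Largest power of two that is <= n; n must be non-zero. */
static unsigned int rounddown_pow_of_two_sketch(unsigned int n)
{
	unsigned int p = 1;

	assert(n != 0);

	while (p <= n / 2)
		p *= 2;
	return p;
}

/*
 * Normalize a full-speed bulk endpoint's max packet size: round down to a
 * power of two, then clamp into [8, 64].
 */
static unsigned int fs_bulk_max_packet_sketch(unsigned int max_packet)
{
	max_packet = rounddown_pow_of_two_sketch(max_packet);

	if (max_packet < 8)
		return 8;
	if (max_packet > 64)
		return 64;
	return max_packet;
}
```

So a device reporting 63 is treated as 32, one reporting 100 as 64, and one reporting 5 as 8.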
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index 4917c5b033faccd0f6b9fe903dbf99cb8a47d62c..5e9b537df631bbc5bd16674674e67491e02af7bb 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -49,6 +49,7 @@
  #define PCI_DEVICE_ID_INTEL_TITAN_RIDGE_4C_XHCI                0x15ec
  #define PCI_DEVICE_ID_INTEL_TITAN_RIDGE_DD_XHCI                0x15f0
  #define PCI_DEVICE_ID_INTEL_ICE_LAKE_XHCI              0x8a13
+#define PCI_DEVICE_ID_INTEL_CML_XHCI                   0xa3af
  
  #define PCI_DEVICE_ID_AMD_PROMONTORYA_4                        0x43b9
  #define PCI_DEVICE_ID_AMD_PROMONTORYA_3                        0x43ba
@@ -187,7 +188,8 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
                  pdev->device == PCI_DEVICE_ID_INTEL_BROXTON_M_XHCI ||
                  pdev->device == PCI_DEVICE_ID_INTEL_BROXTON_B_XHCI ||
                  pdev->device == PCI_DEVICE_ID_INTEL_APL_XHCI ||
-                pdev->device == PCI_DEVICE_ID_INTEL_DNV_XHCI)) {
+                pdev->device == PCI_DEVICE_ID_INTEL_DNV_XHCI ||
+                pdev->device == PCI_DEVICE_ID_INTEL_CML_XHCI)) {
                 xhci->quirks |= XHCI_PME_STUCK_QUIRK;
         }
         if (pdev->vendor == PCI_VENDOR_ID_INTEL &&
@@ -302,6 +304,9 @@ static int xhci_pci_setup(struct usb_hcd *hcd)
         if (!usb_hcd_is_primary_hcd(hcd))
                 return 0;
  
+       if (xhci->quirks & XHCI_PME_STUCK_QUIRK)
+               xhci_pme_acpi_rtd3_enable(pdev);
+
         xhci_dbg(xhci, "Got SBRN %u\n", (unsigned int) xhci->sbrn);
  
         /* Find any debug ports */
@@ -359,9 +364,6 @@ static int xhci_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
                         HCC_MAX_PSA(xhci->hcc_params) >= 4)
                 xhci->shared_hcd->can_do_streams = 1;
  
-       if (xhci->quirks & XHCI_PME_STUCK_QUIRK)
-               xhci_pme_acpi_rtd3_enable(dev);
-
         /* USB-2 and USB-3 roothubs initialized, allow runtime pm suspend */
         pm_runtime_put_noidle(&dev->dev);
  
diff --git a/drivers/usb/host/xhci.h b/drivers/usb/host/xhci.h
index 13d8838cd552be01b145e0d2f78dc815c3ffad0b..3ecee10fdcdc7dfd697b24310de0ad3885421f10 100644
--- a/drivers/usb/host/xhci.h
+++ b/drivers/usb/host/xhci.h
@@ -1702,12 +1702,20 @@ struct xhci_bus_state {
   * Intel Lynx Point LP xHCI host.
   */
  #define        XHCI_MAX_REXIT_TIMEOUT_MS       20
+struct xhci_port_cap {
+       u32                     *psi;   /* array of protocol speed ID entries */
+       u8                      psi_count;
+       u8                      psi_uid_count;
+       u8                      maj_rev;
+       u8                      min_rev;
+};
  
  struct xhci_port {
         __le32 __iomem          *addr;
         int                     hw_portnum;
         int                     hcd_portnum;
         struct xhci_hub         *rhub;
+       struct xhci_port_cap    *port_cap;
  };
  
  struct xhci_hub {
@@ -1719,9 +1727,6 @@ struct xhci_hub {
         /* supported prococol extended capabiliy values */
         u8                      maj_rev;
         u8                      min_rev;
-       u32                     *psi;   /* array of protocol speed ID entries */
-       u8                      psi_count;
-       u8                      psi_uid_count;
  };
  
  /* There is one xhci_hcd structure per controller */
@@ -1880,6 +1885,9 @@ struct xhci_hcd {
         /* cached usb2 extened protocol capabilites */
         u32                     *ext_caps;
         unsigned int            num_ext_caps;
+       /* cached extended protocol port capabilities */
+       struct xhci_port_cap    *port_caps;
+       unsigned int            num_port_caps;
         /* Compliance Mode Recovery Data */
         struct timer_list       comp_mode_recovery_timer;
         u32                     port_status_u0;
diff --git a/drivers/usb/misc/iowarrior.c b/drivers/usb/misc/iowarrior.c
index dce44fbf031fb2875498632dba484fe3866ad045..dce20301e367a617277d1ae0f2489f220c4f9ee4 100644
--- a/drivers/usb/misc/iowarrior.c
+++ b/drivers/usb/misc/iowarrior.c
@@ -33,6 +33,14 @@
  #define USB_DEVICE_ID_CODEMERCS_IOWPV2 0x1512
  /* full speed iowarrior */
  #define USB_DEVICE_ID_CODEMERCS_IOW56  0x1503
+/* fuller speed iowarrior */
+#define USB_DEVICE_ID_CODEMERCS_IOW28  0x1504
+#define USB_DEVICE_ID_CODEMERCS_IOW28L 0x1505
+#define USB_DEVICE_ID_CODEMERCS_IOW100 0x1506
+
+/* OEMed devices */
+#define USB_DEVICE_ID_CODEMERCS_IOW24SAG       0x158a
+#define USB_DEVICE_ID_CODEMERCS_IOW56AM                0x158b
  
  /* Get a minor range for your devices from the usb maintainer */
  #ifdef CONFIG_USB_DYNAMIC_MINORS
@@ -133,6 +141,11 @@ static const struct usb_device_id iowarrior_ids[] = {
         {USB_DEVICE(USB_VENDOR_ID_CODEMERCS, USB_DEVICE_ID_CODEMERCS_IOWPV1)},
         {USB_DEVICE(USB_VENDOR_ID_CODEMERCS, USB_DEVICE_ID_CODEMERCS_IOWPV2)},
         {USB_DEVICE(USB_VENDOR_ID_CODEMERCS, USB_DEVICE_ID_CODEMERCS_IOW56)},
+       {USB_DEVICE(USB_VENDOR_ID_CODEMERCS, USB_DEVICE_ID_CODEMERCS_IOW24SAG)},
+       {USB_DEVICE(USB_VENDOR_ID_CODEMERCS, USB_DEVICE_ID_CODEMERCS_IOW56AM)},
+       {USB_DEVICE(USB_VENDOR_ID_CODEMERCS, USB_DEVICE_ID_CODEMERCS_IOW28)},
+       {USB_DEVICE(USB_VENDOR_ID_CODEMERCS, USB_DEVICE_ID_CODEMERCS_IOW28L)},
+       {USB_DEVICE(USB_VENDOR_ID_CODEMERCS, USB_DEVICE_ID_CODEMERCS_IOW100)},
         {}                      /* Terminating entry */
  };
  MODULE_DEVICE_TABLE(usb, iowarrior_ids);
@@ -357,6 +370,7 @@ static ssize_t iowarrior_write(struct file *file,
         }
         switch (dev->product_id) {
         case USB_DEVICE_ID_CODEMERCS_IOW24:
+       case USB_DEVICE_ID_CODEMERCS_IOW24SAG:
         case USB_DEVICE_ID_CODEMERCS_IOWPV1:
         case USB_DEVICE_ID_CODEMERCS_IOWPV2:
         case USB_DEVICE_ID_CODEMERCS_IOW40:
@@ -371,6 +385,10 @@ static ssize_t iowarrior_write(struct file *file,
                 goto exit;
                 break;
         case USB_DEVICE_ID_CODEMERCS_IOW56:
+       case USB_DEVICE_ID_CODEMERCS_IOW56AM:
+       case USB_DEVICE_ID_CODEMERCS_IOW28:
+       case USB_DEVICE_ID_CODEMERCS_IOW28L:
+       case USB_DEVICE_ID_CODEMERCS_IOW100:
                 /* The IOW56 uses asynchronous IO and more urbs */
                 if (atomic_read(&dev->write_busy) == MAX_WRITES_IN_FLIGHT) {
                         /* Wait until we are below the limit for submitted urbs */
@@ -493,6 +511,7 @@ static long iowarrior_ioctl(struct file *file, unsigned int cmd,
         switch (cmd) {
         case IOW_WRITE:
                 if (dev->product_id == USB_DEVICE_ID_CODEMERCS_IOW24 ||
+                   dev->product_id == USB_DEVICE_ID_CODEMERCS_IOW24SAG ||
                     dev->product_id == USB_DEVICE_ID_CODEMERCS_IOWPV1 ||
                     dev->product_id == USB_DEVICE_ID_CODEMERCS_IOWPV2 ||
                     dev->product_id == USB_DEVICE_ID_CODEMERCS_IOW40) {
@@ -767,7 +786,11 @@ static int iowarrior_probe(struct usb_interface *interface,
                 goto error;
         }
  
-       if (dev->product_id == USB_DEVICE_ID_CODEMERCS_IOW56) {
+       if ((dev->product_id == USB_DEVICE_ID_CODEMERCS_IOW56) ||
+           (dev->product_id == USB_DEVICE_ID_CODEMERCS_IOW56AM) ||
+           (dev->product_id == USB_DEVICE_ID_CODEMERCS_IOW28) ||
+           (dev->product_id == USB_DEVICE_ID_CODEMERCS_IOW28L) ||
+           (dev->product_id == USB_DEVICE_ID_CODEMERCS_IOW100)) {
                 res = usb_find_last_int_out_endpoint(iface_desc,
                                 &dev->int_out_endpoint);
                 if (res) {
@@ -780,7 +803,11 @@ static int iowarrior_probe(struct usb_interface *interface,
         /* we have to check the report_size often, so remember it in the endianness suitable for our machine */
         dev->report_size = usb_endpoint_maxp(dev->int_in_endpoint);
         if ((dev->interface->cur_altsetting->desc.bInterfaceNumber == 0) &&
-           (dev->product_id == USB_DEVICE_ID_CODEMERCS_IOW56))
+           ((dev->product_id == USB_DEVICE_ID_CODEMERCS_IOW56) ||
+            (dev->product_id == USB_DEVICE_ID_CODEMERCS_IOW56AM) ||
+            (dev->product_id == USB_DEVICE_ID_CODEMERCS_IOW28) ||
+            (dev->product_id == USB_DEVICE_ID_CODEMERCS_IOW28L) ||
+            (dev->product_id == USB_DEVICE_ID_CODEMERCS_IOW100)))
                 /* IOWarrior56 has wMaxPacketSize different from report size */
                 dev->report_size = 7;
  
diff --git a/drivers/usb/phy/phy-tegra-usb.c b/drivers/usb/phy/phy-tegra-usb.c
index 037e8eee737d58d1fefee1f5c4d33ccdcced7376..6153cc35aba0d71cbedaff59f8ae40ae825f43c6 100644
--- a/drivers/usb/phy/phy-tegra-usb.c
+++ b/drivers/usb/phy/phy-tegra-usb.c
@@ -969,6 +969,10 @@ static int utmi_phy_probe(struct tegra_usb_phy *tegra_phy,
                 return  -ENXIO;
         }
  
+       /*
+        * Note that UTMI pad registers are shared by all PHYs, therefore
+        * devm_platform_ioremap_resource() can't be used here.
+        */
         tegra_phy->pad_regs = devm_ioremap(&pdev->dev, res->start,
                                            resource_size(res));
         if (!tegra_phy->pad_regs) {
@@ -1087,6 +1091,10 @@ static int tegra_usb_phy_probe(struct platform_device *pdev)
                 return  -ENXIO;
         }
  
+       /*
+        * Note that PHY and USB controller are using shared registers,
+        * therefore devm_platform_ioremap_resource() can't be used here.
+        */
         tegra_phy->regs = devm_ioremap(&pdev->dev, res->start,
                                        resource_size(res));
         if (!tegra_phy->regs) {
diff --git a/drivers/usb/serial/ch341.c b/drivers/usb/serial/ch341.c

index d3f420f3a083570d4d3372514c39430caa9fc8d8..c5ecdcd51ffc6742e09af4954e5c3c272233c208 100644 (file)
--- a/drivers/usb/serial/ch341.c
+++ b/drivers/usb/serial/ch341.c
@@ -205,6 +205,16 @@ static int ch341_get_divisor(speed_t speed)
                         16 * speed - 16 * CH341_CLKRATE / (clk_div * (div + 1)))
                 div++;
  
+       /*
+        * Prefer lower base clock (fact = 0) if even divisor.
+        *
+        * Note that this makes the receiver more tolerant to errors.
+        */
+       if (fact == 1 && div % 2 == 0) {
+               div /= 2;
+               fact = 0;
+       }
+
         return (0x100 - div) << 8 | fact << 2 | ps;
  }
  
diff --git a/drivers/usb/serial/ir-usb.c b/drivers/usb/serial/ir-usb.c
index 79d0586e2b338019b03ddc042d9479544784c60d..172261a908d8d6d94e76996ad28e7fbf3b2cd94a 100644
--- a/drivers/usb/serial/ir-usb.c
+++ b/drivers/usb/serial/ir-usb.c
@@ -448,7 +448,7 @@ static void ir_set_termios(struct tty_struct *tty,
                         usb_sndbulkpipe(udev, port->bulk_out_endpointAddress),
                         transfer_buffer, 1, &actual_length, 5000);
         if (ret || actual_length != 1) {
-               if (actual_length != 1)
+               if (!ret)
                         ret = -EIO;
                 dev_err(&port->dev, "failed to change line speed: %d\n", ret);
         }
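
Note on the ir-usb.c hunk above: the old test replaced any error with -EIO whenever the transfer was short, so a real status from the bulk transfer could be lost. The new test only substitutes -EIO when the call itself reported success. A minimal userspace sketch of that rule follows; the helper name and the `expected` parameter are hypothetical and not part of the driver.

```c
#include <errno.h>

/*
 * Hypothetical helper (not in ir-usb.c): fold a transfer status and the
 * number of bytes actually sent into a single error code.
 */
static int normalize_transfer_result(int ret, int actual_length, int expected)
{
	if (ret || actual_length != expected) {
		/* Keep a real error as-is; only a "successful" short transfer becomes -EIO. */
		if (!ret)
			ret = -EIO;
	}
	return ret;
}
```

With this rule a timeout stays a timeout in the log message, while a transfer that returned 0 but moved the wrong number of bytes is still reported as an I/O error.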
diff --git a/drivers/usb/storage/uas.c b/drivers/usb/storage/uas.c
index 95bba3ba6ac67ec933215f9548df254cc44081f9..3670fda02c3460000b06be6e0d496a3eb014fbb7 100644
--- a/drivers/usb/storage/uas.c
+++ b/drivers/usb/storage/uas.c
@@ -45,6 +45,7 @@ struct uas_dev_info {
         struct scsi_cmnd *cmnd[MAX_CMNDS];
         spinlock_t lock;
         struct work_struct work;
+       struct work_struct scan_work;      /* for async scanning */
  };
  
  enum {
@@ -114,6 +115,17 @@ out:
         spin_unlock_irqrestore(&devinfo->lock, flags);
  }
  
+static void uas_scan_work(struct work_struct *work)
+{
+       struct uas_dev_info *devinfo =
+               container_of(work, struct uas_dev_info, scan_work);
+       struct Scsi_Host *shost = usb_get_intfdata(devinfo->intf);
+
+       dev_dbg(&devinfo->intf->dev, "starting scan\n");
+       scsi_scan_host(shost);
+       dev_dbg(&devinfo->intf->dev, "scan complete\n");
+}
+
  static void uas_add_work(struct uas_cmd_info *cmdinfo)
  {
         struct scsi_pointer *scp = (void *)cmdinfo;
@@ -982,6 +994,7 @@ static int uas_probe(struct usb_interface *intf, const struct usb_device_id *id)
         init_usb_anchor(&devinfo->data_urbs);
         spin_lock_init(&devinfo->lock);
         INIT_WORK(&devinfo->work, uas_do_work);
+       INIT_WORK(&devinfo->scan_work, uas_scan_work);
  
         result = uas_configure_endpoints(devinfo);
         if (result)
@@ -998,7 +1011,9 @@ static int uas_probe(struct usb_interface *intf, const struct usb_device_id *id)
         if (result)
                 goto free_streams;
  
-       scsi_scan_host(shost);
+       /* Schedule the work item for asynchronous SCSI-device scanning */
+       schedule_work(&devinfo->scan_work);
+
         return result;
  
  free_streams:
@@ -1166,6 +1181,12 @@ static void uas_disconnect(struct usb_interface *intf)
         usb_kill_anchored_urbs(&devinfo->data_urbs);
         uas_zap_pending(devinfo, DID_NO_CONNECT);
  
+       /*
+        * Prevent SCSI scanning (if it hasn't started yet)
+        * or wait for the SCSI-scanning routine to stop.
+        */
+       cancel_work_sync(&devinfo->scan_work);
+
         scsi_remove_host(shost);
         uas_free_streams(devinfo);
         scsi_host_put(shost);
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index e158159671fa27a46f34c463a70c3cb461732c6b..18e205eeb9af7c8e20aec27d46d2781aebf9ae5d 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -1414,10 +1414,6 @@ static int vhost_net_release(struct inode *inode, struct file *f)
  
  static struct socket *get_raw_socket(int fd)
  {
-       struct {
-               struct sockaddr_ll sa;
-               char  buf[MAX_ADDR_LEN];
-       } uaddr;
         int r;
         struct socket *sock = sockfd_lookup(fd, &r);
  
@@ -1430,11 +1426,7 @@ static struct socket *get_raw_socket(int fd)
                 goto err;
         }
  
-       r = sock->ops->getname(sock, (struct sockaddr *)&uaddr.sa, 0);
-       if (r < 0)
-               goto err;
-
-       if (uaddr.sa.sll_family != AF_PACKET) {
+       if (sock->sk->sk_family != AF_PACKET) {
                 r = -EPFNOSUPPORT;
                 goto err;
         }
diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
index cec868f8db3f9645ac9bd7741f7f60e28c86bbb6..9ea2b43d4b012aebc8571b3b713011733e3d332c 100644
--- a/drivers/watchdog/Kconfig
+++ b/drivers/watchdog/Kconfig
@@ -207,6 +207,7 @@ config DA9063_WATCHDOG
  config DA9062_WATCHDOG
         tristate "Dialog DA9062/61 Watchdog"
         depends on MFD_DA9062 || COMPILE_TEST
+       depends on I2C
         select WATCHDOG_CORE
         help
           Support for the watchdog in the DA9062 and DA9061 PMICs.
@@ -841,6 +842,7 @@ config MEDIATEK_WATCHDOG
         tristate "Mediatek SoCs watchdog support"
         depends on ARCH_MEDIATEK || COMPILE_TEST
         select WATCHDOG_CORE
+       select RESET_CONTROLLER
         help
           Say Y here to include support for the watchdog timer
           in Mediatek SoCs.
diff --git a/drivers/watchdog/da9062_wdt.c b/drivers/watchdog/da9062_wdt.c
index 47eefe072b405ff055ef830e6c831464d39de6e4..0ad15d55071ce5931f8c5b381c29add6017e9683 100644
--- a/drivers/watchdog/da9062_wdt.c
+++ b/drivers/watchdog/da9062_wdt.c
@@ -16,6 +16,7 @@
  #include <linux/jiffies.h>
  #include <linux/mfd/da9062/registers.h>
  #include <linux/mfd/da9062/core.h>
+#include <linux/property.h>
  #include <linux/regmap.h>
  #include <linux/of.h>
  
@@ -31,6 +32,7 @@ static const unsigned int wdt_timeout[] = { 0, 2, 4, 8, 16, 32, 65, 131 };
  struct da9062_watchdog {
         struct da9062 *hw;
         struct watchdog_device wdtdev;
+       bool use_sw_pm;
  };
  
  static unsigned int da9062_wdt_timeout_to_sel(unsigned int secs)
@@ -95,13 +97,6 @@ static int da9062_wdt_stop(struct watchdog_device *wdd)
         struct da9062_watchdog *wdt = watchdog_get_drvdata(wdd);
         int ret;
  
-       ret = da9062_reset_watchdog_timer(wdt);
-       if (ret) {
-               dev_err(wdt->hw->dev, "Failed to ping the watchdog (err = %d)\n",
-                       ret);
-               return ret;
-       }
-
         ret = regmap_update_bits(wdt->hw->regmap,
                                  DA9062AA_CONTROL_D,
                                  DA9062AA_TWDSCALE_MASK,
@@ -200,6 +195,8 @@ static int da9062_wdt_probe(struct platform_device *pdev)
         if (!wdt)
                 return -ENOMEM;
  
+       wdt->use_sw_pm = device_property_present(dev, "dlg,use-sw-pm");
+
         wdt->hw = chip;
  
         wdt->wdtdev.info = &da9062_watchdog_info;
@@ -226,6 +223,10 @@ static int da9062_wdt_probe(struct platform_device *pdev)
  static int __maybe_unused da9062_wdt_suspend(struct device *dev)
  {
         struct watchdog_device *wdd = dev_get_drvdata(dev);
+       struct da9062_watchdog *wdt = watchdog_get_drvdata(wdd);
+
+       if (!wdt->use_sw_pm)
+               return 0;
  
         if (watchdog_active(wdd))
                 return da9062_wdt_stop(wdd);
@@ -236,6 +237,10 @@ static int __maybe_unused da9062_wdt_suspend(struct device *dev)
  static int __maybe_unused da9062_wdt_resume(struct device *dev)
  {
         struct watchdog_device *wdd = dev_get_drvdata(dev);
+       struct da9062_watchdog *wdt = watchdog_get_drvdata(wdd);
+
+       if (!wdt->use_sw_pm)
+               return 0;
  
         if (watchdog_active(wdd))
                 return da9062_wdt_start(wdd);
diff --git a/drivers/watchdog/wdat_wdt.c b/drivers/watchdog/wdat_wdt.c
index b069349b52f55f92005bd4d6429f2a1a95def739..3065dd670a18289df900c07c93f513c4d9c8f598 100644
--- a/drivers/watchdog/wdat_wdt.c
+++ b/drivers/watchdog/wdat_wdt.c
@@ -54,6 +54,13 @@ module_param(nowayout, bool, 0);
  MODULE_PARM_DESC(nowayout, "Watchdog cannot be stopped once started (default="
                  __MODULE_STRING(WATCHDOG_NOWAYOUT) ")");
  
+#define WDAT_DEFAULT_TIMEOUT   30
+
+static int timeout = WDAT_DEFAULT_TIMEOUT;
+module_param(timeout, int, 0);
+MODULE_PARM_DESC(timeout, "Watchdog timeout in seconds (default="
+                __MODULE_STRING(WDAT_DEFAULT_TIMEOUT) ")");
+
  static int wdat_wdt_read(struct wdat_wdt *wdat,
          const struct wdat_instruction *instr, u32 *value)
  {
@@ -389,7 +396,7 @@ static int wdat_wdt_probe(struct platform_device *pdev)
  
                 memset(&r, 0, sizeof(r));
                 r.start = gas->address;
-               r.end = r.start + gas->access_width - 1;
+               r.end = r.start + ACPI_ACCESS_BYTE_WIDTH(gas->access_width) - 1;
                 if (gas->space_id == ACPI_ADR_SPACE_SYSTEM_MEMORY) {
                         r.flags = IORESOURCE_MEM;
                 } else if (gas->space_id == ACPI_ADR_SPACE_SYSTEM_IO) {
@@ -438,6 +445,22 @@ static int wdat_wdt_probe(struct platform_device *pdev)
  
         platform_set_drvdata(pdev, wdat);
  
+       /*
+        * Set initial timeout so that userspace has time to configure the
+        * watchdog properly after it has opened the device. In some cases
+        * the BIOS default is too short and causes immediate reboot.
+        */
+       if (timeout * 1000 < wdat->wdd.min_hw_heartbeat_ms ||
+           timeout * 1000 > wdat->wdd.max_hw_heartbeat_ms) {
+               dev_warn(dev, "Invalid timeout %d given, using %d\n",
+                        timeout, WDAT_DEFAULT_TIMEOUT);
+               timeout = WDAT_DEFAULT_TIMEOUT;
+       }
+
+       ret = wdat_wdt_set_timeout(&wdat->wdd, timeout);
+       if (ret)
+               return ret;
+
         watchdog_set_nowayout(&wdat->wdd, nowayout);
         return devm_watchdog_register_device(dev, &wdat->wdd);
  }
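
Note on the wdat_wdt.c hunks above: two small calculations change. The resource end now converts the encoded `access_width` into a byte count before use, and the module `timeout` is range-checked against the hardware heartbeat limits before it is applied. The sketch below mirrors both in userspace. The macro is a local stand-in written from my understanding of the ACPI encoding (1 = byte, 2 = word, 3 = dword, 4 = qword), and both helper names are hypothetical.

```c
#include <stdint.h>

/*
 * Local stand-in for the macro used in the hunk. Assumes access_width is
 * an encoded size in the range 1..4, not a byte count; 0 is not handled.
 */
#define DEMO_ACCESS_BYTE_WIDTH(size) (1u << ((size) - 1))

/* Hypothetical helper: inclusive end address of a register resource. */
static uint64_t resource_end(uint64_t start, uint8_t access_width)
{
	return start + DEMO_ACCESS_BYTE_WIDTH(access_width) - 1;
}

/*
 * Hypothetical helper: return the requested timeout (seconds) if it fits
 * the hardware limits (milliseconds), otherwise the fallback default.
 */
static int pick_timeout(int timeout, long long min_ms, long long max_ms,
			int fallback)
{
	long long ms = (long long)timeout * 1000;

	if (ms < min_ms || ms > max_ms)
		return fallback;
	return timeout;
}
```

For a dword register (`access_width` of 3) at 0x1000, the old expression would end the resource at 0x1002 and cover only three bytes, whereas the converted width ends it at 0x1003.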
diff --git a/drivers/xen/preempt.c b/drivers/xen/preempt.c
index 70650b248de5d43dae2b4ae01d1fbebfaa6d4c10..17240c5325a30c799478f1e13df0e503dfa71c09 100644
--- a/drivers/xen/preempt.c
+++ b/drivers/xen/preempt.c
@@ -33,7 +33,9 @@ asmlinkage __visible void xen_maybe_preempt_hcall(void)
                  * cpu.
                  */
                 __this_cpu_write(xen_in_preemptible_hcall, false);
-               _cond_resched();
+               local_irq_enable();
+               cond_resched();
+               local_irq_disable();
                 __this_cpu_write(xen_in_preemptible_hcall, true);
         }
  }
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 7fa9bb79ad08e2c5b732185c902bcbbea549a95f..c6c9a6a8e6c84176c3c8223827fc5fcdc9e6bd79 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3164,6 +3164,7 @@ int __cold open_ctree(struct super_block *sb,
         /* do not make disk changes in broken FS or nologreplay is given */
         if (btrfs_super_log_root(disk_super) != 0 &&
             !btrfs_test_opt(fs_info, NOLOGREPLAY)) {
+               btrfs_info(fs_info, "start tree-log replay");
                 ret = btrfs_replay_log(fs_info, fs_devices);
                 if (ret) {
                         err = ret;
@@ -3199,6 +3200,7 @@ int __cold open_ctree(struct super_block *sb,
         if (IS_ERR(fs_info->fs_root)) {
                 err = PTR_ERR(fs_info->fs_root);
                 btrfs_warn(fs_info, "failed to read fs tree: %d", err);
+               fs_info->fs_root = NULL;
                 goto fail_qgroup;
         }
  
@@ -4275,6 +4277,7 @@ static int btrfs_destroy_delayed_refs(struct btrfs_transaction *trans,
                 cond_resched();
                 spin_lock(&delayed_refs->lock);
         }
+       btrfs_qgroup_destroy_extent_records(trans);
  
         spin_unlock(&delayed_refs->lock);
  
@@ -4500,7 +4503,6 @@ void btrfs_cleanup_one_transaction(struct btrfs_transaction *cur_trans,
         wake_up(&fs_info->transaction_wait);
  
         btrfs_destroy_delayed_inodes(fs_info);
-       btrfs_assert_delayed_root_empty(fs_info);
  
         btrfs_destroy_marked_extents(fs_info, &cur_trans->dirty_pages,
                                      EXTENT_DIRTY);
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 0163fdd59f8f2156fd58c65e8bdbceb3ae5dec3a..a7bc66121330e1d4d428c93775c2d832041c7d18 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -4430,6 +4430,8 @@ int btrfs_alloc_logged_file_extent(struct btrfs_trans_handle *trans,
  
         ret = alloc_reserved_file_extent(trans, 0, root_objectid, 0, owner,
                                          offset, ins, 1);
+       if (ret)
+               btrfs_pin_extent(fs_info, ins->objectid, ins->offset, 1);
         btrfs_put_block_group(block_group);
         return ret;
  }
diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c
index 6f417ff68980f68f26eb89b6689a732d15a5b2bf..bd6229fb2b6f0cae346f72afc4324463ccfac578 100644
--- a/fs/btrfs/extent_map.c
+++ b/fs/btrfs/extent_map.c
@@ -237,6 +237,17 @@ static void try_merge_map(struct extent_map_tree *tree, struct extent_map *em)
         struct extent_map *merge = NULL;
         struct rb_node *rb;
  
+       /*
+        * We can't modify an extent map that is in the tree and that is being
+        * used by another task, as it can cause that other task to see it in
+        * inconsistent state during the merging. We always have 1 reference for
+        * the tree and 1 for this task (which is unpinning the extent map or
+        * clearing the logging flag), so anything > 2 means it's being used by
+        * other tasks too.
+        */
+       if (refcount_read(&em->refs) > 2)
+               return;
+
         if (em->start != 0) {
                 rb = rb_prev(&em->rb_node);
                 if (rb)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 5b3ec93ff911d7af07127fe3a1824024eedcf511..1ccb3f8d528d9e84d3299b2c1fe9b7be51a77121 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4085,6 +4085,8 @@ int btrfs_truncate_inode_items(struct btrfs_trans_handle *trans,
         u64 bytes_deleted = 0;
         bool be_nice = false;
         bool should_throttle = false;
+       const u64 lock_start = ALIGN_DOWN(new_size, fs_info->sectorsize);
+       struct extent_state *cached_state = NULL;
  
         BUG_ON(new_size > 0 && min_type != BTRFS_EXTENT_DATA_KEY);
  
@@ -4101,6 +4103,10 @@ int btrfs_truncate_inode_items(struct btrfs_trans_handle *trans,
                 return -ENOMEM;
         path->reada = READA_BACK;
  
+       if (root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID)
+               lock_extent_bits(&BTRFS_I(inode)->io_tree, lock_start, (u64)-1,
+                                &cached_state);
+
         /*
          * We want to drop from the next block forward in case this new size is
          * not block aligned since we will be keeping the last block of the
@@ -4137,7 +4143,6 @@ search_again:
                 goto out;
         }
  
-       path->leave_spinning = 1;
         ret = btrfs_search_slot(trans, root, &key, path, -1, 1);
         if (ret < 0)
                 goto out;
@@ -4289,7 +4294,6 @@ delete:
                      root == fs_info->tree_root)) {
                         struct btrfs_ref ref = { 0 };
  
-                       btrfs_set_path_blocking(path);
                         bytes_deleted += extent_num_bytes;
  
                         btrfs_init_generic_ref(&ref, BTRFS_DROP_DELAYED_REF,
@@ -4365,6 +4369,8 @@ out:
                 if (!ret && last_size > new_size)
                         last_size = new_size;
                 btrfs_ordered_update_i_size(inode, last_size, NULL);
+               unlock_extent_cached(&BTRFS_I(inode)->io_tree, lock_start,
+                                    (u64)-1, &cached_state);
         }
  
         btrfs_free_path(path);
@@ -9818,6 +9824,7 @@ static int __btrfs_prealloc_file_range(struct inode *inode, int mode,
         struct btrfs_root *root = BTRFS_I(inode)->root;
         struct btrfs_key ins;
         u64 cur_offset = start;
+       u64 clear_offset = start;
         u64 i_size;
         u64 cur_bytes;
         u64 last_alloc = (u64)-1;
@@ -9852,6 +9859,15 @@ static int __btrfs_prealloc_file_range(struct inode *inode, int mode,
                                 btrfs_end_transaction(trans);
                         break;
                 }
+
+               /*
+                * We've reserved this space, and thus converted it from
+                * ->bytes_may_use to ->bytes_reserved.  If an error happens
+                * from here on, we only need to clear our reservation for
+                * the remaining unreserved area, so advance clear_offset by
+                * our extent size.
+                */
+               clear_offset += ins.offset;
                 btrfs_dec_block_group_reservations(fs_info, ins.objectid);
  
                 last_alloc = ins.offset;
@@ -9931,9 +9947,9 @@ next:
                 if (own_trans)
                         btrfs_end_transaction(trans);
         }
-       if (cur_offset < end)
-               btrfs_free_reserved_data_space(inode, NULL, cur_offset,
-                       end - cur_offset + 1);
+       if (clear_offset < end)
+               btrfs_free_reserved_data_space(inode, NULL, clear_offset,
+                       end - clear_offset + 1);
         return ret;
  }
  
diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index ecb9fb6a6fe07f3550efe9e8bb9de463585930d3..a65f189a5b9418e17e28ef90dfb7508e13b173be 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -679,10 +679,15 @@ int btrfs_wait_ordered_range(struct inode *inode, u64 start, u64 len)
                 }
                 btrfs_start_ordered_extent(inode, ordered, 1);
                 end = ordered->file_offset;
+               /*
+                * If the ordered extent had an error save the error but don't
+                * exit without waiting first for all other ordered extents in
+                * the range to complete.
+                */
                 if (test_bit(BTRFS_ORDERED_IOERR, &ordered->flags))
                         ret = -EIO;
                 btrfs_put_ordered_extent(ordered);
-               if (ret || end == 0 || end == start)
+               if (end == 0 || end == start)
                         break;
                 end--;
         }
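
Note on the ordered-data.c hunk above: dropping `ret` from the break condition means an I/O error is remembered but no longer ends the loop early, so every ordered extent in the range is still waited on. A minimal userspace sketch of that "record the error, keep waiting" pattern follows. The function and its parameters are hypothetical stand-ins, and the real loop walks the range backwards rather than over an array.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for waiting on a set of ordered extents.
 * io_error[i] says whether item i finished with an I/O error;
 * *visited reports how many items were actually waited on.
 */
static int wait_all(const bool *io_error, size_t n, size_t *visited)
{
	int ret = 0;
	size_t i;

	*visited = 0;
	for (i = 0; i < n; i++) {
		(*visited)++;		/* "wait" for this item */
		if (io_error[i])
			ret = -EIO;	/* remember the error, but keep going */
	}
	return ret;
}
```

The caller still sees -EIO, but only after nothing in the range is left in flight.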
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 98d9a50352d6d2bf6314cf4dc6720303e53e496e..ff1870ff3474a71f1fada625e17effa09917a074 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -4002,3 +4002,16 @@ out:
         }
         return ret;
  }
+
+void btrfs_qgroup_destroy_extent_records(struct btrfs_transaction *trans)
+{
+       struct btrfs_qgroup_extent_record *entry;
+       struct btrfs_qgroup_extent_record *next;
+       struct rb_root *root;
+
+       root = &trans->delayed_refs.dirty_extent_root;
+       rbtree_postorder_for_each_entry_safe(entry, next, root, node) {
+               ulist_free(entry->old_roots);
+               kfree(entry);
+       }
+}
diff --git a/fs/btrfs/qgroup.h b/fs/btrfs/qgroup.h
index 236f12224d5205a591251e317cf996dd23fc1e8c..1bc65445946907c171eccc0fe1363b8ce2f0b754 100644
--- a/fs/btrfs/qgroup.h
+++ b/fs/btrfs/qgroup.h
@@ -414,5 +414,6 @@ int btrfs_qgroup_add_swapped_blocks(struct btrfs_trans_handle *trans,
                 u64 last_snapshot);
  int btrfs_qgroup_trace_subtree_after_cow(struct btrfs_trans_handle *trans,
                 struct btrfs_root *root, struct extent_buffer *eb);
+void btrfs_qgroup_destroy_extent_records(struct btrfs_transaction *trans);
  
  #endif
diff --git a/fs/btrfs/ref-verify.c b/fs/btrfs/ref-verify.c
index b57f3618e58e305f5520dbf7f472df816180b8e4..454a1015d026b741ec1a1b168375755d6df5971a 100644
--- a/fs/btrfs/ref-verify.c
+++ b/fs/btrfs/ref-verify.c
@@ -744,6 +744,7 @@ int btrfs_ref_tree_mod(struct btrfs_fs_info *fs_info,
                  */
                 be = add_block_entry(fs_info, bytenr, num_bytes, ref_root);
                 if (IS_ERR(be)) {
+                       kfree(ref);
                         kfree(ra);
                         ret = PTR_ERR(be);
                         goto out;
@@ -757,6 +758,8 @@ int btrfs_ref_tree_mod(struct btrfs_fs_info *fs_info,
                         "re-allocated a block that still has references to it!");
                         dump_block_entry(fs_info, be);
                         dump_ref_action(fs_info, ra);
+                       kfree(ref);
+                       kfree(ra);
                         goto out_unlock;
                 }
  
@@ -819,6 +822,7 @@ int btrfs_ref_tree_mod(struct btrfs_fs_info *fs_info,
  "dropping a ref for a existing root that doesn't have a ref on the block");
                                 dump_block_entry(fs_info, be);
                                 dump_ref_action(fs_info, ra);
+                               kfree(ref);
                                 kfree(ra);
                                 goto out_unlock;
                         }
@@ -834,6 +838,7 @@ int btrfs_ref_tree_mod(struct btrfs_fs_info *fs_info,
  "attempting to add another ref for an existing ref on a tree block");
                         dump_block_entry(fs_info, be);
                         dump_ref_action(fs_info, ra);
+                       kfree(ref);
                         kfree(ra);
                         goto out_unlock;
                 }
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 0616a5434793d100ffa1db67c0afe961c687ba51..67c63858812a9ebc8e3e9448df46968b58c3c3bd 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1834,6 +1834,8 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
                 }
  
                 if (btrfs_super_log_root(fs_info->super_copy) != 0) {
+                       btrfs_warn(fs_info,
+               "mount required to replay tree-log, cannot remount read-write");
                         ret = -EINVAL;
                         goto restore;
                 }
diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 7436422194da32503d87aea85cf3d116f16eb78a..3c10e78924d04db95ad6c70c5f9e194557be5100 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -901,6 +901,12 @@ static int addrm_unknown_feature_attrs(struct btrfs_fs_info *fs_info, bool add)
  
  static void __btrfs_sysfs_remove_fsid(struct btrfs_fs_devices *fs_devs)
  {
+       if (fs_devs->devinfo_kobj) {
+               kobject_del(fs_devs->devinfo_kobj);
+               kobject_put(fs_devs->devinfo_kobj);
+               fs_devs->devinfo_kobj = NULL;
+       }
+
         if (fs_devs->devices_kobj) {
                 kobject_del(fs_devs->devices_kobj);
                 kobject_put(fs_devs->devices_kobj);
@@ -1289,7 +1295,7 @@ int btrfs_sysfs_add_device_link(struct btrfs_fs_devices *fs_devices,
  
                 init_completion(&dev->kobj_unregister);
                 error = kobject_init_and_add(&dev->devid_kobj, &devid_ktype,
-                                            fs_devices->devices_kobj, "%llu",
+                                            fs_devices->devinfo_kobj, "%llu",
                                              dev->devid);
                 if (error) {
                         kobject_put(&dev->devid_kobj);
@@ -1369,6 +1375,15 @@ int btrfs_sysfs_add_fsid(struct btrfs_fs_devices *fs_devs)
                 return -ENOMEM;
         }
  
+       fs_devs->devinfo_kobj = kobject_create_and_add("devinfo",
+                                                      &fs_devs->fsid_kobj);
+       if (!fs_devs->devinfo_kobj) {
+               btrfs_err(fs_devs->fs_info,
+                         "failed to init sysfs devinfo kobject");
+               btrfs_sysfs_remove_fsid(fs_devs);
+               return -ENOMEM;
+       }
+
         return 0;
  }
  
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 33dcc88b428ad402ed68c1dd6bbfb9dc9ee69d23..beb6c69cd1e55b7d0fb3089b4822269fb6bce38c 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -121,6 +121,8 @@ void btrfs_put_transaction(struct btrfs_transaction *transaction)
                 BUG_ON(!list_empty(&transaction->list));
                 WARN_ON(!RB_EMPTY_ROOT(
                                 &transaction->delayed_refs.href_root.rb_root));
+               WARN_ON(!RB_EMPTY_ROOT(
+                               &transaction->delayed_refs.dirty_extent_root));
                 if (transaction->delayed_refs.pending_csums)
                         btrfs_err(transaction->fs_info,
                                   "pending csums is %llu",
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 409f4816fb89c458e9dec591b0951d79816dd7d1..f01552a0785eb2e570f53a551fb69458b866a937 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -258,6 +258,7 @@ struct btrfs_fs_devices {
         /* sysfs kobjects */
         struct kobject fsid_kobj;
         struct kobject *devices_kobj;
+       struct kobject *devinfo_kobj;
         struct completion kobj_unregister;
  };
  
diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index c3b8e8e0bf17d52c1c2bd90ba093813e22d71906..7e0190b1f821e73a33d252b9e8491bba3cdcaf16 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -1418,6 +1418,7 @@ static ssize_t ceph_write_iter(struct kiocb *iocb, struct iov_iter *from)
         struct ceph_cap_flush *prealloc_cf;
         ssize_t count, written = 0;
         int err, want, got;
+       bool direct_lock = false;
         loff_t pos;
         loff_t limit = max(i_size_read(inode), fsc->max_file_size);
  
@@ -1428,8 +1429,11 @@ static ssize_t ceph_write_iter(struct kiocb *iocb, struct iov_iter *from)
         if (!prealloc_cf)
                 return -ENOMEM;
  
+       if ((iocb->ki_flags & (IOCB_DIRECT | IOCB_APPEND)) == IOCB_DIRECT)
+               direct_lock = true;
+
  retry_snap:
-       if (iocb->ki_flags & IOCB_DIRECT)
+       if (direct_lock)
                 ceph_start_io_direct(inode);
         else
                 ceph_start_io_write(inode);
@@ -1519,14 +1523,15 @@ retry_snap:
  
                 /* we might need to revert back to that point */
                 data = *from;
-               if (iocb->ki_flags & IOCB_DIRECT) {
+               if (iocb->ki_flags & IOCB_DIRECT)
                         written = ceph_direct_read_write(iocb, &data, snapc,
                                                          &prealloc_cf);
-                       ceph_end_io_direct(inode);
-               } else {
+               else
                         written = ceph_sync_write(iocb, &data, pos, snapc);
+               if (direct_lock)
+                       ceph_end_io_direct(inode);
+               else
                         ceph_end_io_write(inode);
-               }
                 if (written > 0)
                         iov_iter_advance(from, written);
                 ceph_put_snap_context(snapc);
@@ -1577,7 +1582,7 @@ retry_snap:
  
         goto out_unlocked;
  out:
-       if (iocb->ki_flags & IOCB_DIRECT)
+       if (direct_lock)
                 ceph_end_io_direct(inode);
         else
                 ceph_end_io_write(inode);
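
Note on the ceph/file.c hunks above: as I read them, the lock flavour is now decided once, up front, and the same `direct_lock` value drives both the start and the end call, so the pair always matches. A write that is both direct and append falls back to the ordinary write lock. The mask test can be sketched in userspace as follows; the flag values are illustrative only and the helper name is hypothetical.

```c
#include <stdbool.h>

/* Illustrative flag values only; the kernel defines the real IOCB_* bits. */
#define DEMO_IOCB_APPEND (1u << 1)
#define DEMO_IOCB_DIRECT (1u << 2)

/* True only when DIRECT is set and APPEND is clear. */
static bool wants_direct_lock(unsigned int ki_flags)
{
	return (ki_flags & (DEMO_IOCB_DIRECT | DEMO_IOCB_APPEND)) ==
	       DEMO_IOCB_DIRECT;
}
```

Comparing the masked value against `DEMO_IOCB_DIRECT` checks both bits at once, which a plain `ki_flags & DEMO_IOCB_DIRECT` test would not do.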
diff --git a/fs/ceph/super.c b/fs/ceph/super.c
index 1d9f083b8a1153f361a6abff6dd8370aad23b587..c7f150686a53d41295f0e6a1b718d81e7a587344 100644
--- a/fs/ceph/super.c
+++ b/fs/ceph/super.c
@@ -202,6 +202,26 @@ struct ceph_parse_opts_ctx {
         struct ceph_mount_options       *opts;
  };
  
+/*
+ * Remove adjacent slashes and then the trailing slash, unless it is
+ * the only remaining character.
+ *
+ * E.g. "//dir1////dir2///" --> "/dir1/dir2", "///" --> "/".
+ */
+static void canonicalize_path(char *path)
+{
+       int i, j = 0;
+
+       for (i = 0; path[i] != '\0'; i++) {
+               if (path[i] != '/' || j < 1 || path[j - 1] != '/')
+                       path[j++] = path[i];
+       }
+
+       if (j > 1 && path[j - 1] == '/')
+               j--;
+       path[j] = '\0';
+}
+
  /*
   * Parse the source parameter.  Distinguish the server list from the path.
   *
@@ -224,15 +244,16 @@ static int ceph_parse_source(struct fs_parameter *param, struct fs_context *fc)
  
         dev_name_end = strchr(dev_name, '/');
         if (dev_name_end) {
-               kfree(fsopt->server_path);
-
                 /*
                  * The server_path will include the whole chars from userland
                  * including the leading '/'.
                  */
+               kfree(fsopt->server_path);
                 fsopt->server_path = kstrdup(dev_name_end, GFP_KERNEL);
                 if (!fsopt->server_path)
                         return -ENOMEM;
+
+               canonicalize_path(fsopt->server_path);
         } else {
                 dev_name_end = dev_name + strlen(dev_name);
         }
@@ -456,73 +477,6 @@ static int strcmp_null(const char *s1, const char *s2)
         return strcmp(s1, s2);
  }
  
-/**
- * path_remove_extra_slash - Remove the extra slashes in the server path
- * @server_path: the server path and could be NULL
- *
- * Return NULL if the path is NULL or only consists of "/", or a string
- * without any extra slashes including the leading slash(es) and the
- * slash(es) at the end of the server path, such as:
- * "//dir1////dir2///" --> "dir1/dir2"
- */
-static char *path_remove_extra_slash(const char *server_path)
-{
-       const char *path = server_path;
-       const char *cur, *end;
-       char *buf, *p;
-       int len;
-
-       /* if the server path is omitted */
-       if (!path)
-               return NULL;
-
-       /* remove all the leading slashes */
-       while (*path == '/')
-               path++;
-
-       /* if the server path only consists of slashes */
-       if (*path == '\0')
-               return NULL;
-
-       len = strlen(path);
-
-       buf = kmalloc(len + 1, GFP_KERNEL);
-       if (!buf)
-               return ERR_PTR(-ENOMEM);
-
-       end = path + len;
-       p = buf;
-       do {
-               cur = strchr(path, '/');
-               if (!cur)
-                       cur = end;
-
-               len = cur - path;
-
-               /* including one '/' */
-               if (cur != end)
-                       len += 1;
-
-               memcpy(p, path, len);
-               p += len;
-
-               while (cur <= end && *cur == '/')
-                       cur++;
-               path = cur;
-       } while (path < end);
-
-       *p = '\0';
-
-       /*
-        * remove the last slash if there has and just to make sure that
-        * we will get something like "dir1/dir2"
-        */
-       if (*(--p) == '/')
-               *p = '\0';
-
-       return buf;
-}
-
  static int compare_mount_options(struct ceph_mount_options *new_fsopt,
                                  struct ceph_options *new_opt,
                                  struct ceph_fs_client *fsc)
@@ -530,7 +484,6 @@ static int compare_mount_options(struct ceph_mount_options *new_fsopt,
         struct ceph_mount_options *fsopt1 = new_fsopt;
         struct ceph_mount_options *fsopt2 = fsc->mount_options;
         int ofs = offsetof(struct ceph_mount_options, snapdir_name);
-       char *p1, *p2;
         int ret;
  
         ret = memcmp(fsopt1, fsopt2, ofs);
@@ -540,21 +493,12 @@ static int compare_mount_options(struct ceph_mount_options *new_fsopt,
         ret = strcmp_null(fsopt1->snapdir_name, fsopt2->snapdir_name);
         if (ret)
                 return ret;
+
         ret = strcmp_null(fsopt1->mds_namespace, fsopt2->mds_namespace);
         if (ret)
                 return ret;
  
-       p1 = path_remove_extra_slash(fsopt1->server_path);
-       if (IS_ERR(p1))
-               return PTR_ERR(p1);
-       p2 = path_remove_extra_slash(fsopt2->server_path);
-       if (IS_ERR(p2)) {
-               kfree(p1);
-               return PTR_ERR(p2);
-       }
-       ret = strcmp_null(p1, p2);
-       kfree(p1);
-       kfree(p2);
+       ret = strcmp_null(fsopt1->server_path, fsopt2->server_path);
         if (ret)
                 return ret;
  
@@ -957,7 +901,9 @@ static struct dentry *ceph_real_mount(struct ceph_fs_client *fsc,
         mutex_lock(&fsc->client->mount_mutex);
  
         if (!fsc->sb->s_root) {
-               const char *path, *p;
+               const char *path = fsc->mount_options->server_path ?
+                                    fsc->mount_options->server_path + 1 : "";
+
                 err = __ceph_open_session(fsc->client, started);
                 if (err < 0)
                         goto out;
@@ -969,22 +915,11 @@ static struct dentry *ceph_real_mount(struct ceph_fs_client *fsc,
                                 goto out;
                 }
  
-               p = path_remove_extra_slash(fsc->mount_options->server_path);
-               if (IS_ERR(p)) {
-                       err = PTR_ERR(p);
-                       goto out;
-               }
-               /* if the server path is omitted or just consists of '/' */
-               if (!p)
-                       path = "";
-               else
-                       path = p;
                 dout("mount opening path '%s'\n", path);
  
                 ceph_fs_debugfs_init(fsc);
  
                 root = open_root_dentry(fsc, path, started);
-               kfree(p);
                 if (IS_ERR(root)) {
                         err = PTR_ERR(root);
                         goto out;
@@ -1097,10 +1032,6 @@ static int ceph_get_tree(struct fs_context *fc)
         if (!fc->source)
                 return invalfc(fc, "No source");
  
-#ifdef CONFIG_CEPH_FS_POSIX_ACL
-       fc->sb_flags |= SB_POSIXACL;
-#endif
-
         /* create client (which we may/may not use) */
         fsc = create_fs_client(pctx->opts, pctx->copts);
         pctx->opts = NULL;
@@ -1223,6 +1154,10 @@ static int ceph_init_fs_context(struct fs_context *fc)
         fsopt->max_readdir_bytes = CEPH_MAX_READDIR_BYTES_DEFAULT;
         fsopt->congestion_kb = default_congestion_kb();
  
+#ifdef CONFIG_CEPH_FS_POSIX_ACL
+       fc->sb_flags |= SB_POSIXACL;
+#endif
+
         fc->fs_private = pctx;
         fc->ops = &ceph_context_ops;
         return 0;
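
The `ceph_real_mount()` hunk above drops the allocating `path_remove_extra_slash()` helper and instead derives the path to open directly from `server_path`. A minimal sketch of that expression in isolation follows. It assumes that a NULL `server_path` means "/" and that a non-NULL value always begins with a single '/' (presumably guaranteed by option parsing that is not visible in these hunks). `sketch_open_path` is a hypothetical name, not a kernel function.

```c
#include <stddef.h>

/*
 * Hypothetical helper mirroring the new initializer of 'path':
 *   NULL        -> ""   (mount the root, nothing to look up)
 *   "/dir/sub"  -> "dir/sub" (skip the leading slash)
 * Assumption: a non-NULL server_path starts with exactly one '/'.
 */
static const char *sketch_open_path(const char *server_path)
{
	return server_path ? server_path + 1 : "";
}
```

Because the result points into the existing string (or at a literal), there is nothing to free afterwards, which is consistent with the removal of the `kfree(p)` call in the same hunk.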
diff --git a/fs/ceph/super.h b/fs/ceph/super.h

index 1e456a9011bb73f5717a08cfe4128ae41bda1f59..037cdfb2ad4f51c71885abf93d1d0a1a0ebfccbc 100644 (file)
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -91,7 +91,7 @@ struct ceph_mount_options {
  
         char *snapdir_name;   /* default ".snap" */
         char *mds_namespace;  /* default NULL */
-       char *server_path;    /* default  "/" */
+       char *server_path;    /* default NULL (means "/") */
         char *fscache_uniq;   /* default NULL */
  };
  
diff --git a/fs/cifs/cifs_dfs_ref.c b/fs/cifs/cifs_dfs_ref.c

index 606f26d862dc18b1326e5c3cf4a9e1c9a746ac70..cc3ada12848d952466cedf9eb1c8dc03abcc1eae 100644 (file)
--- a/fs/cifs/cifs_dfs_ref.c
+++ b/fs/cifs/cifs_dfs_ref.c
@@ -324,6 +324,8 @@ static struct vfsmount *cifs_dfs_do_automount(struct dentry *mntpt)
         if (full_path == NULL)
                 goto cdda_exit;
  
+       convert_delimiter(full_path, '\\');
+
         cifs_dbg(FYI, "%s: full_path: %s\n", __func__, full_path);
  
         if (!cifs_sb_master_tlink(cifs_sb)) {
diff --git a/fs/cifs/cifsacl.c b/fs/cifs/cifsacl.c

index 440828afcddef783d6d3edf087ddede0f3974b02..716574aab3b6a88c823a1511569511bb6491a94a 100644 (file)
--- a/fs/cifs/cifsacl.c
+++ b/fs/cifs/cifsacl.c
@@ -601,7 +601,7 @@ static void access_flags_to_mode(__le32 ace_flags, int type, umode_t *pmode,
                         ((flags & FILE_EXEC_RIGHTS) == FILE_EXEC_RIGHTS))
                 *pmode |= (S_IXUGO & (*pbits_to_set));
  
-       cifs_dbg(NOISY, "access flags 0x%x mode now 0x%x\n", flags, *pmode);
+       cifs_dbg(NOISY, "access flags 0x%x mode now %04o\n", flags, *pmode);
         return;
  }
  
@@ -630,7 +630,7 @@ static void mode_to_access_flags(umode_t mode, umode_t bits_to_use,
         if (mode & S_IXUGO)
                 *pace_flags |= SET_FILE_EXEC_RIGHTS;
  
-       cifs_dbg(NOISY, "mode: 0x%x, access flags now 0x%x\n",
+       cifs_dbg(NOISY, "mode: %04o, access flags now 0x%x\n",
                  mode, *pace_flags);
         return;
  }
diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c

index febab27cd8389c5c4149894d3cb47e5845249c95..fa77fe5258b0726ced3bc4a29180590a93c12aaf 100644 (file)
--- a/fs/cifs/cifsfs.c
+++ b/fs/cifs/cifsfs.c
@@ -414,7 +414,7 @@ cifs_show_security(struct seq_file *s, struct cifs_ses *ses)
                 seq_puts(s, "ntlm");
                 break;
         case Kerberos:
-               seq_printf(s, "krb5,cruid=%u", from_kuid_munged(&init_user_ns,ses->cred_uid));
+               seq_puts(s, "krb5");
                 break;
         case RawNTLMSSP:
                 seq_puts(s, "ntlmssp");
@@ -427,6 +427,10 @@ cifs_show_security(struct seq_file *s, struct cifs_ses *ses)
  
         if (ses->sign)
                 seq_puts(s, "i");
+
+       if (ses->sectype == Kerberos)
+               seq_printf(s, ",cruid=%u",
+                          from_kuid_munged(&init_user_ns, ses->cred_uid));
  }
  
  static void
@@ -526,6 +530,8 @@ cifs_show_options(struct seq_file *s, struct dentry *root)
  
         if (tcon->seal)
                 seq_puts(s, ",seal");
+       else if (tcon->ses->server->ignore_signature)
+               seq_puts(s, ",signloosely");
         if (tcon->nocase)
                 seq_puts(s, ",nocase");
         if (tcon->local_lease)
diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h

index de82cfa44b1ae6e65ac5859676d729513352fab1..0d956360e984724224a5d28333936499107dd5a5 100644 (file)
--- a/fs/cifs/cifsglob.h
+++ b/fs/cifs/cifsglob.h
@@ -1281,6 +1281,7 @@ struct cifs_fid {
         __u64 volatile_fid;     /* volatile file id for smb2 */
         __u8 lease_key[SMB2_LEASE_KEY_SIZE];    /* lease key for smb2 */
         __u8 create_guid[16];
+       __u32 access;
         struct cifs_pending_open *pending_open;
         unsigned int epoch;
  #ifdef CONFIG_CIFS_DEBUG2
@@ -1741,6 +1742,12 @@ static inline bool is_retryable_error(int error)
         return false;
  }
  
+
+/* cifs_get_writable_file() flags */
+#define FIND_WR_ANY         0
+#define FIND_WR_FSUID_ONLY  1
+#define FIND_WR_WITH_DELETE 2
+
  #define   MID_FREE 0
  #define   MID_REQUEST_ALLOCATED 1
  #define   MID_REQUEST_SUBMITTED 2
diff --git a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h

index 89eaaf46d1cadcab1c719b671d774a0a92efa12e..e5cb681ec13815aacd59dd3a3ce751ba06358765 100644 (file)
--- a/fs/cifs/cifsproto.h
+++ b/fs/cifs/cifsproto.h
@@ -134,11 +134,12 @@ extern bool backup_cred(struct cifs_sb_info *);
  extern bool is_size_safe_to_change(struct cifsInodeInfo *, __u64 eof);
  extern void cifs_update_eof(struct cifsInodeInfo *cifsi, loff_t offset,
                             unsigned int bytes_written);
-extern struct cifsFileInfo *find_writable_file(struct cifsInodeInfo *, bool);
+extern struct cifsFileInfo *find_writable_file(struct cifsInodeInfo *, int);
  extern int cifs_get_writable_file(struct cifsInodeInfo *cifs_inode,
-                                 bool fsuid_only,
+                                 int flags,
                                   struct cifsFileInfo **ret_file);
  extern int cifs_get_writable_path(struct cifs_tcon *tcon, const char *name,
+                                 int flags,
                                   struct cifsFileInfo **ret_file);
  extern struct cifsFileInfo *find_readable_file(struct cifsInodeInfo *, bool);
  extern int cifs_get_readable_path(struct cifs_tcon *tcon, const char *name,
diff --git a/fs/cifs/cifssmb.c b/fs/cifs/cifssmb.c

index 3c89569e721082895ebcba2f7f4bb33d584ec901..6f6fb3606a5d6094b498c99851d21d81b135b952 100644 (file)
--- a/fs/cifs/cifssmb.c
+++ b/fs/cifs/cifssmb.c
@@ -1492,6 +1492,7 @@ openRetry:
         *oplock = rsp->OplockLevel;
         /* cifs fid stays in le */
         oparms->fid->netfid = rsp->Fid;
+       oparms->fid->access = desired_access;
  
         /* Let caller know file was created so we can set the mode. */
         /* Do we care about the CreateAction in any other cases? */
@@ -2115,7 +2116,7 @@ cifs_writev_requeue(struct cifs_writedata *wdata)
                 wdata2->tailsz = tailsz;
                 wdata2->bytes = cur_len;
  
-               rc = cifs_get_writable_file(CIFS_I(inode), false,
+               rc = cifs_get_writable_file(CIFS_I(inode), FIND_WR_ANY,
                                             &wdata2->cfile);
                 if (!wdata2->cfile) {
                         cifs_dbg(VFS, "No writable handle to retry writepages rc=%d\n",
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c

index a941ac7a659d9d748799b87c460486cc00997b75..4804d1df8c1cfb675012f214f9fbc906bb6de1e3 100644 (file)
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -4151,7 +4151,7 @@ int cifs_setup_cifs_sb(struct smb_vol *pvolume_info,
         cifs_sb->mnt_gid = pvolume_info->linux_gid;
         cifs_sb->mnt_file_mode = pvolume_info->file_mode;
         cifs_sb->mnt_dir_mode = pvolume_info->dir_mode;
-       cifs_dbg(FYI, "file mode: 0x%hx  dir mode: 0x%hx\n",
+       cifs_dbg(FYI, "file mode: %04ho  dir mode: %04ho\n",
                  cifs_sb->mnt_file_mode, cifs_sb->mnt_dir_mode);
  
         cifs_sb->actimeo = pvolume_info->actimeo;
diff --git a/fs/cifs/file.c b/fs/cifs/file.c

index bc9516ab4b34f22281f52056efab27b117afa558..3b942ecdd4be76c3c1bb5e0c996bf34a41357f60 100644 (file)
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -1958,7 +1958,7 @@ struct cifsFileInfo *find_readable_file(struct cifsInodeInfo *cifs_inode,
  
  /* Return -EBADF if no handle is found and general rc otherwise */
  int
-cifs_get_writable_file(struct cifsInodeInfo *cifs_inode, bool fsuid_only,
+cifs_get_writable_file(struct cifsInodeInfo *cifs_inode, int flags,
                        struct cifsFileInfo **ret_file)
  {
         struct cifsFileInfo *open_file, *inv_file = NULL;
@@ -1966,7 +1966,8 @@ cifs_get_writable_file(struct cifsInodeInfo *cifs_inode, bool fsuid_only,
         bool any_available = false;
         int rc = -EBADF;
         unsigned int refind = 0;
-
+       bool fsuid_only = flags & FIND_WR_FSUID_ONLY;
+       bool with_delete = flags & FIND_WR_WITH_DELETE;
         *ret_file = NULL;
  
         /*
@@ -1998,6 +1999,8 @@ refind_writable:
                         continue;
                 if (fsuid_only && !uid_eq(open_file->uid, current_fsuid()))
                         continue;
+               if (with_delete && !(open_file->fid.access & DELETE))
+                       continue;
                 if (OPEN_FMODE(open_file->f_flags) & FMODE_WRITE) {
                         if (!open_file->invalidHandle) {
                                 /* found a good writable file */
@@ -2045,12 +2048,12 @@ refind_writable:
  }
  
  struct cifsFileInfo *
-find_writable_file(struct cifsInodeInfo *cifs_inode, bool fsuid_only)
+find_writable_file(struct cifsInodeInfo *cifs_inode, int flags)
  {
         struct cifsFileInfo *cfile;
         int rc;
  
-       rc = cifs_get_writable_file(cifs_inode, fsuid_only, &cfile);
+       rc = cifs_get_writable_file(cifs_inode, flags, &cfile);
         if (rc)
                 cifs_dbg(FYI, "couldn't find writable handle rc=%d", rc);
  
@@ -2059,6 +2062,7 @@ find_writable_file(struct cifsInodeInfo *cifs_inode, bool fsuid_only)
  
  int
  cifs_get_writable_path(struct cifs_tcon *tcon, const char *name,
+                      int flags,
                        struct cifsFileInfo **ret_file)
  {
         struct list_head *tmp;
@@ -2085,7 +2089,7 @@ cifs_get_writable_path(struct cifs_tcon *tcon, const char *name,
                 kfree(full_path);
                 cinode = CIFS_I(d_inode(cfile->dentry));
                 spin_unlock(&tcon->open_file_lock);
-               return cifs_get_writable_file(cinode, 0, ret_file);
+               return cifs_get_writable_file(cinode, flags, ret_file);
         }
  
         spin_unlock(&tcon->open_file_lock);
@@ -2162,7 +2166,8 @@ static int cifs_partialpagewrite(struct page *page, unsigned from, unsigned to)
         if (mapping->host->i_size - offset < (loff_t)to)
                 to = (unsigned)(mapping->host->i_size - offset);
  
-       rc = cifs_get_writable_file(CIFS_I(mapping->host), false, &open_file);
+       rc = cifs_get_writable_file(CIFS_I(mapping->host), FIND_WR_ANY,
+                                   &open_file);
         if (!rc) {
                 bytes_written = cifs_write(open_file, open_file->pid,
                                            write_data, to - from, &offset);
@@ -2355,7 +2360,7 @@ retry:
                 if (cfile)
                         cifsFileInfo_put(cfile);
  
-               rc = cifs_get_writable_file(CIFS_I(inode), false, &cfile);
+               rc = cifs_get_writable_file(CIFS_I(inode), FIND_WR_ANY, &cfile);
  
                 /* in case of an error store it to return later */
                 if (rc)
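
The `fs/cifs/file.c` hunks above turn the old `bool fsuid_only` argument into a small bitmask. The filtering can be sketched outside the kernel as shown below. The flag values are copied from the `cifsglob.h` hunk; everything prefixed `sketch_`/`SKETCH_` is hypothetical, and the struct is a simplified stand-in for `struct cifsFileInfo`, not the real type.

```c
#include <stdbool.h>
#include <stddef.h>

/* Flag values as defined in the cifsglob.h hunk. */
#define FIND_WR_ANY         0
#define FIND_WR_FSUID_ONLY  1
#define FIND_WR_WITH_DELETE 2

/* Hypothetical stand-in for the DELETE access-mask bit used by the patch. */
#define SKETCH_DELETE 0x00010000u

/* Hypothetical, heavily simplified open-handle record. */
struct sketch_handle {
	unsigned int uid;     /* owner of the handle */
	unsigned int access;  /* desired_access recorded at open time */
	bool writable;        /* handle was opened for writing */
};

/* Return the first handle that passes every filter requested in 'flags'. */
static const struct sketch_handle *
sketch_find_writable(const struct sketch_handle *h, size_t n,
		     unsigned int fsuid, int flags)
{
	bool fsuid_only = flags & FIND_WR_FSUID_ONLY;
	bool with_delete = flags & FIND_WR_WITH_DELETE;
	size_t i;

	for (i = 0; i < n; i++) {
		if (fsuid_only && h[i].uid != fsuid)
			continue;
		if (with_delete && !(h[i].access & SKETCH_DELETE))
			continue;
		if (h[i].writable)
			return &h[i];
	}
	return NULL;
}
```

Since `FIND_WR_ANY` is 0, callers that previously passed `false` keep their behaviour, while the two non-zero flags can be combined with `|`.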
diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c

index 9ba623b601ec2b913d5753240b6c87e2c25dc8ef..1e8a4b1579db49791592fdba227672f01c97d806 100644 (file)
--- a/fs/cifs/inode.c
+++ b/fs/cifs/inode.c
@@ -653,8 +653,8 @@ cifs_all_info_to_fattr(struct cifs_fattr *fattr, FILE_ALL_INFO *info,
                  */
                 if ((fattr->cf_nlink < 1) && !tcon->unix_ext &&
                     !info->DeletePending) {
-                       cifs_dbg(1, "bogus file nlink value %u\n",
-                               fattr->cf_nlink);
+                       cifs_dbg(VFS, "bogus file nlink value %u\n",
+                                fattr->cf_nlink);
                         fattr->cf_flags |= CIFS_FATTR_UNKNOWN_NLINK;
                 }
         }
@@ -1648,7 +1648,7 @@ int cifs_mkdir(struct inode *inode, struct dentry *direntry, umode_t mode)
         struct TCP_Server_Info *server;
         char *full_path;
  
-       cifs_dbg(FYI, "In cifs_mkdir, mode = 0x%hx inode = 0x%p\n",
+       cifs_dbg(FYI, "In cifs_mkdir, mode = %04ho inode = 0x%p\n",
                  mode, inode);
  
         cifs_sb = CIFS_SB(inode->i_sb);
@@ -2073,6 +2073,7 @@ int cifs_revalidate_dentry_attr(struct dentry *dentry)
         struct inode *inode = d_inode(dentry);
         struct super_block *sb = dentry->d_sb;
         char *full_path = NULL;
+       int count = 0;
  
         if (inode == NULL)
                 return -ENOENT;
@@ -2094,15 +2095,18 @@ int cifs_revalidate_dentry_attr(struct dentry *dentry)
                  full_path, inode, inode->i_count.counter,
                  dentry, cifs_get_time(dentry), jiffies);
  
+again:
         if (cifs_sb_master_tcon(CIFS_SB(sb))->unix_ext)
                 rc = cifs_get_inode_info_unix(&inode, full_path, sb, xid);
         else
                 rc = cifs_get_inode_info(&inode, full_path, NULL, sb,
                                          xid, NULL);
-
+       if (rc == -EAGAIN && count++ < 10)
+               goto again;
  out:
         kfree(full_path);
         free_xid(xid);
+
         return rc;
  }
  
@@ -2278,7 +2282,7 @@ cifs_set_file_size(struct inode *inode, struct iattr *attrs,
          * writebehind data than the SMB timeout for the SetPathInfo
          * request would allow
          */
-       open_file = find_writable_file(cifsInode, true);
+       open_file = find_writable_file(cifsInode, FIND_WR_FSUID_ONLY);
         if (open_file) {
                 tcon = tlink_tcon(open_file->tlink);
                 server = tcon->ses->server;
@@ -2428,7 +2432,7 @@ cifs_setattr_unix(struct dentry *direntry, struct iattr *attrs)
                 args->ctime = NO_CHANGE_64;
  
         args->device = 0;
-       open_file = find_writable_file(cifsInode, true);
+       open_file = find_writable_file(cifsInode, FIND_WR_FSUID_ONLY);
         if (open_file) {
                 u16 nfid = open_file->fid.netfid;
                 u32 npid = open_file->pid;
@@ -2531,7 +2535,7 @@ cifs_setattr_nounix(struct dentry *direntry, struct iattr *attrs)
         rc = 0;
  
         if (attrs->ia_valid & ATTR_MTIME) {
-               rc = cifs_get_writable_file(cifsInode, false, &wfile);
+               rc = cifs_get_writable_file(cifsInode, FIND_WR_ANY, &wfile);
                 if (!rc) {
                         tcon = tlink_tcon(wfile->tlink);
                         rc = tcon->ses->server->ops->flush(xid, tcon, &wfile->fid);
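
The `cifs_revalidate_dentry_attr()` hunk above adds a bounded retry when the inode query returns `-EAGAIN`. A minimal user-space sketch of that pattern follows, assuming a callback that returns 0 or a negative errno. `sketch_retry_eagain`, `sketch_busy_n_times` and `SKETCH_MAX_RETRIES` are hypothetical names chosen for illustration.

```c
#include <errno.h>

#define SKETCH_MAX_RETRIES 10	/* same bound as the literal 10 in the hunk */

/* Call fn(arg); repeat while it reports -EAGAIN, at most 10 extra times. */
static int sketch_retry_eagain(int (*fn)(void *), void *arg)
{
	int count = 0;
	int rc;

again:
	rc = fn(arg);
	if (rc == -EAGAIN && count++ < SKETCH_MAX_RETRIES)
		goto again;
	return rc;
}

/* Example callee: reports -EAGAIN until the counter behind 'arg' hits zero. */
static int sketch_busy_n_times(void *arg)
{
	int *left = arg;

	if (*left > 0) {
		(*left)--;
		return -EAGAIN;
	}
	return 0;
}
```

With the post-increment in the condition, the callee runs at most eleven times in total (one initial attempt plus ten retries) before `-EAGAIN` is passed back to the caller.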
diff --git a/fs/cifs/smb1ops.c b/fs/cifs/smb1ops.c

index eb994e313c6ae755437f375db1b1bb662c90db75..b130efaf8feb2260da9f4d00fee30fa6e53e599a 100644 (file)
--- a/fs/cifs/smb1ops.c
+++ b/fs/cifs/smb1ops.c
@@ -766,7 +766,7 @@ smb_set_file_info(struct inode *inode, const char *full_path,
         struct cifs_tcon *tcon;
  
         /* if the file is already open for write, just use that fileid */
-       open_file = find_writable_file(cinode, true);
+       open_file = find_writable_file(cinode, FIND_WR_FSUID_ONLY);
         if (open_file) {
                 fid.netfid = open_file->fid.netfid;
                 netpid = open_file->pid;
diff --git a/fs/cifs/smb2inode.c b/fs/cifs/smb2inode.c

index 1cf207564ff9676b6901e951a812267380effe10..a8c301ae00ed72f2bbcbbf0f3bf0bf6dfce0e29b 100644 (file)
--- a/fs/cifs/smb2inode.c
+++ b/fs/cifs/smb2inode.c
@@ -521,7 +521,7 @@ smb2_mkdir_setinfo(struct inode *inode, const char *name,
         cifs_i = CIFS_I(inode);
         dosattrs = cifs_i->cifsAttrs | ATTR_READONLY;
         data.Attributes = cpu_to_le32(dosattrs);
-       cifs_get_writable_path(tcon, name, &cfile);
+       cifs_get_writable_path(tcon, name, FIND_WR_ANY, &cfile);
         tmprc = smb2_compound_op(xid, tcon, cifs_sb, name,
                                  FILE_WRITE_ATTRIBUTES, FILE_CREATE,
                                  CREATE_NOT_FILE, ACL_NO_MODE,
@@ -577,7 +577,7 @@ smb2_rename_path(const unsigned int xid, struct cifs_tcon *tcon,
  {
         struct cifsFileInfo *cfile;
  
-       cifs_get_writable_path(tcon, from_name, &cfile);
+       cifs_get_writable_path(tcon, from_name, FIND_WR_WITH_DELETE, &cfile);
  
         return smb2_set_path_attr(xid, tcon, from_name, to_name,
                                   cifs_sb, DELETE, SMB2_OP_RENAME, cfile);
diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c

index baa825f4cec03899d647fd23528fcca8a3b72020..c31e84ee3c397d3d273a74024bcbba3d3a5c59c1 100644 (file)
--- a/fs/cifs/smb2ops.c
+++ b/fs/cifs/smb2ops.c
@@ -1116,7 +1116,8 @@ smb2_set_ea(const unsigned int xid, struct cifs_tcon *tcon,
         void *data[1];
         struct smb2_file_full_ea_info *ea = NULL;
         struct kvec close_iov[1];
-       int rc;
+       struct smb2_query_info_rsp *rsp;
+       int rc, used_len = 0;
  
         if (smb3_encryption_required(tcon))
                 flags |= CIFS_TRANSFORM_REQ;
@@ -1139,6 +1140,38 @@ smb2_set_ea(const unsigned int xid, struct cifs_tcon *tcon,
                                                              cifs_sb);
                         if (rc == -ENODATA)
                                 goto sea_exit;
+               } else {
+                       /* If we are adding an attribute we should first check
+                        * if there will be enough space available to store
+                        * the new EA. If not we should not add it since we
+                        * would not be able to even read the EAs back.
+                        */
+                       rc = smb2_query_info_compound(xid, tcon, utf16_path,
+                                     FILE_READ_EA,
+                                     FILE_FULL_EA_INFORMATION,
+                                     SMB2_O_INFO_FILE,
+                                     CIFSMaxBufSize -
+                                     MAX_SMB2_CREATE_RESPONSE_SIZE -
+                                     MAX_SMB2_CLOSE_RESPONSE_SIZE,
+                                     &rsp_iov[1], &resp_buftype[1], cifs_sb);
+                       if (rc == 0) {
+                               rsp = (struct smb2_query_info_rsp *)rsp_iov[1].iov_base;
+                               used_len = le32_to_cpu(rsp->OutputBufferLength);
+                       }
+                       free_rsp_buf(resp_buftype[1], rsp_iov[1].iov_base);
+                       resp_buftype[1] = CIFS_NO_BUFFER;
+                       memset(&rsp_iov[1], 0, sizeof(rsp_iov[1]));
+                       rc = 0;
+
+                       /* Use a fudge factor of 256 bytes in case we collide
+                        * with a different set_EAs command.
+                        */
+                       if (CIFSMaxBufSize - MAX_SMB2_CREATE_RESPONSE_SIZE -
+                          MAX_SMB2_CLOSE_RESPONSE_SIZE - 256 <
+                          used_len + ea_name_len + ea_value_len + 1) {
+                               rc = -ENOSPC;
+                               goto sea_exit;
+                       }
                 }
         }
  
@@ -1331,6 +1364,7 @@ smb2_set_fid(struct cifsFileInfo *cfile, struct cifs_fid *fid, __u32 oplock)
  
         cfile->fid.persistent_fid = fid->persistent_fid;
         cfile->fid.volatile_fid = fid->volatile_fid;
+       cfile->fid.access = fid->access;
  #ifdef CONFIG_CIFS_DEBUG2
         cfile->fid.mid = fid->mid;
  #endif /* CIFS_DEBUG2 */
@@ -3294,7 +3328,7 @@ static loff_t smb3_llseek(struct file *file, struct cifs_tcon *tcon, loff_t offs
          * some servers (Windows2016) will not reflect recent writes in
          * QUERY_ALLOCATED_RANGES until SMB2_flush is called.
          */
-       wrcfile = find_writable_file(cifsi, false);
+       wrcfile = find_writable_file(cifsi, FIND_WR_ANY);
         if (wrcfile) {
                 filemap_write_and_wait(inode->i_mapping);
                 smb2_flush_file(xid, tcon, &wrcfile->fid);
@@ -4795,6 +4829,7 @@ struct smb_version_operations smb21_operations = {
         .wp_retry_size = smb2_wp_retry_size,
         .dir_needs_close = smb2_dir_needs_close,
         .enum_snapshots = smb3_enum_snapshots,
+       .notify = smb3_notify,
         .get_dfs_refer = smb2_get_dfs_refer,
         .select_sectype = smb2_select_sectype,
  #ifdef CONFIG_CIFS_XATTR
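
The `smb2_set_ea()` hunk above refuses to add an extended attribute when the existing EAs plus the new one would no longer fit in a single query response, keeping 256 bytes in reserve. The arithmetic can be sketched as below. Here `budget` stands for `CIFSMaxBufSize - MAX_SMB2_CREATE_RESPONSE_SIZE - MAX_SMB2_CLOSE_RESPONSE_SIZE`, whose actual values are not shown in this diff. `sketch_ea_fits` and `SKETCH_FUDGE` are hypothetical names, and the explicit underflow guard is an addition of this sketch, not part of the patch.

```c
#include <errno.h>
#include <stddef.h>

#define SKETCH_FUDGE 256	/* reserve for a concurrent set-EA, as in the hunk */

/*
 * Return 0 if an EA with the given name/value lengths still fits,
 * -ENOSPC otherwise. 'used_len' is the size of the EAs already stored.
 */
static int sketch_ea_fits(size_t budget, size_t used_len,
			  size_t name_len, size_t value_len)
{
	/* Sketch-only guard so the unsigned subtraction cannot wrap. */
	if (budget < SKETCH_FUDGE)
		return -ENOSPC;

	/* Same comparison as the patch: +1 accounts for the name terminator. */
	if (budget - SKETCH_FUDGE < used_len + name_len + value_len + 1)
		return -ENOSPC;

	return 0;
}
```

For example, with a purely illustrative budget of 1000 bytes the usable limit is 744, so an EA that brings the total to exactly 744 is accepted and one byte more is rejected.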
diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c

index 1234f9ccab0302b5df63a320f12bf7f397be26c9..28c0be5e69b7fcef9683621f353e4142284edbbe 100644 (file)
--- a/fs/cifs/smb2pdu.c
+++ b/fs/cifs/smb2pdu.c
@@ -2771,6 +2771,7 @@ SMB2_open(const unsigned int xid, struct cifs_open_parms *oparms, __le16 *path,
         atomic_inc(&tcon->num_remote_opens);
         oparms->fid->persistent_fid = rsp->PersistentFileId;
         oparms->fid->volatile_fid = rsp->VolatileFileId;
+       oparms->fid->access = oparms->desired_access;
  #ifdef CONFIG_CIFS_DEBUG2
         oparms->fid->mid = le64_to_cpu(rsp->sync_hdr.MessageId);
  #endif /* CIFS_DEBUG2 */
diff --git a/fs/dax.c b/fs/dax.c

index 1f1f0201cad1821a2e2883a10821b02e209a8f11..35da144375a0ad91ed600c06c04b5fc9bf13562b 100644 (file)
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -937,12 +937,11 @@ static int dax_writeback_one(struct xa_state *xas, struct dax_device *dax_dev,
   * on persistent storage prior to completion of the operation.
   */
  int dax_writeback_mapping_range(struct address_space *mapping,
-               struct block_device *bdev, struct writeback_control *wbc)
+               struct dax_device *dax_dev, struct writeback_control *wbc)
  {
         XA_STATE(xas, &mapping->i_pages, wbc->range_start >> PAGE_SHIFT);
         struct inode *inode = mapping->host;
         pgoff_t end_index = wbc->range_end >> PAGE_SHIFT;
-       struct dax_device *dax_dev;
         void *entry;
         int ret = 0;
         unsigned int scanned = 0;
@@ -953,10 +952,6 @@ int dax_writeback_mapping_range(struct address_space *mapping,
         if (!mapping->nrexceptional || wbc->sync_mode != WB_SYNC_ALL)
                 return 0;
  
-       dax_dev = dax_get_by_host(bdev->bd_disk->disk_name);
-       if (!dax_dev)
-               return -EIO;
-
         trace_dax_writeback_range(inode, xas.xa_index, end_index);
  
         tag_pages_for_writeback(mapping, xas.xa_index, end_index);
@@ -977,7 +972,6 @@ int dax_writeback_mapping_range(struct address_space *mapping,
                 xas_lock_irq(&xas);
         }
         xas_unlock_irq(&xas);
-       put_dax(dax_dev);
         trace_dax_writeback_range_done(inode, xas.xa_index, end_index);
         return ret;
  }
@@ -1207,6 +1201,9 @@ dax_iomap_rw(struct kiocb *iocb, struct iov_iter *iter,
                 lockdep_assert_held(&inode->i_rwsem);
         }
  
+       if (iocb->ki_flags & IOCB_NOWAIT)
+               flags |= IOMAP_NOWAIT;
+
         while (iov_iter_count(iter)) {
                 ret = iomap_apply(inode, pos, iov_iter_count(iter), flags, ops,
                                 iter, dax_iomap_actor);
diff --git a/fs/ecryptfs/crypto.c b/fs/ecryptfs/crypto.c

index db1ef144c63a52327c070f7efa2ce44acce6ba6d..2c449aed1b9209e708c049accbab3050604eb100 100644 (file)
--- a/fs/ecryptfs/crypto.c
+++ b/fs/ecryptfs/crypto.c
@@ -311,8 +311,10 @@ static int crypt_scatterlist(struct ecryptfs_crypt_stat *crypt_stat,
         struct extent_crypt_result ecr;
         int rc = 0;
  
-       BUG_ON(!crypt_stat || !crypt_stat->tfm
-              || !(crypt_stat->flags & ECRYPTFS_STRUCT_INITIALIZED));
+       if (!crypt_stat || !crypt_stat->tfm
+              || !(crypt_stat->flags & ECRYPTFS_STRUCT_INITIALIZED))
+               return -EINVAL;
+
         if (unlikely(ecryptfs_verbosity > 0)) {
                 ecryptfs_printk(KERN_DEBUG, "Key size [%zd]; key:\n",
                                 crypt_stat->key_size);
diff --git a/fs/ecryptfs/ecryptfs_kernel.h b/fs/ecryptfs/ecryptfs_kernel.h

index 1c1a56be7ea2feb35314419f79aa08ceb12dbd62..e6ac78c62ca4929efd9c22465659a7a84b6ec3ee 100644 (file)
--- a/fs/ecryptfs/ecryptfs_kernel.h
+++ b/fs/ecryptfs/ecryptfs_kernel.h
@@ -8,7 +8,7 @@
   * Copyright (C) 2004-2008 International Business Machines Corp.
   *   Author(s): Michael A. Halcrow <mahalcro@us.ibm.com>
   *              Trevor S. Highland <trevor.highland@gmail.com>
- *              Tyler Hicks <tyhicks@ou.edu>
+ *              Tyler Hicks <code@tyhicks.com>
   */
  
  #ifndef ECRYPTFS_KERNEL_H
diff --git a/fs/ecryptfs/keystore.c b/fs/ecryptfs/keystore.c

index 7d326aa0308e4da493a97f420bca1ce84cb9f272..af3eb02bbca1db90ebbd1d506fa77b3f2afbf79f 100644 (file)
--- a/fs/ecryptfs/keystore.c
+++ b/fs/ecryptfs/keystore.c
@@ -1304,7 +1304,7 @@ parse_tag_1_packet(struct ecryptfs_crypt_stat *crypt_stat,
                 printk(KERN_WARNING "Tag 1 packet contains key larger "
                        "than ECRYPTFS_MAX_ENCRYPTED_KEY_BYTES\n");
                 rc = -EINVAL;
-               goto out;
+               goto out_free;
         }
         memcpy((*new_auth_tok)->session_key.encrypted_key,
                &data[(*packet_size)], (body_size - (ECRYPTFS_SIG_SIZE + 2)));
diff --git a/fs/ecryptfs/main.c b/fs/ecryptfs/main.c

index b8a7ce379ffe67f8ab12923a63a2bd1ceb93d156..e63259fdef2882ab5f6a9217dd9bfea95c9a7928 100644 (file)
--- a/fs/ecryptfs/main.c
+++ b/fs/ecryptfs/main.c
@@ -7,7 +7,7 @@
   * Copyright (C) 2004-2007 International Business Machines Corp.
   *   Author(s): Michael A. Halcrow <mahalcro@us.ibm.com>
   *              Michael C. Thompson <mcthomps@us.ibm.com>
- *              Tyler Hicks <tyhicks@ou.edu>
+ *              Tyler Hicks <code@tyhicks.com>
   */
  
  #include <linux/dcache.h>
diff --git a/fs/ecryptfs/messaging.c b/fs/ecryptfs/messaging.c

index d668e60b85b556dd27b08a345a222688b795257c..8646ba76def3416b141f8fe4f54a5812302e7886 100644 (file)
--- a/fs/ecryptfs/messaging.c
+++ b/fs/ecryptfs/messaging.c
@@ -4,7 +4,7 @@
   *
   * Copyright (C) 2004-2008 International Business Machines Corp.
   *   Author(s): Michael A. Halcrow <mhalcrow@us.ibm.com>
- *             Tyler Hicks <tyhicks@ou.edu>
+ *             Tyler Hicks <code@tyhicks.com>
   */
  #include <linux/sched.h>
  #include <linux/slab.h>
@@ -379,6 +379,7 @@ int __init ecryptfs_init_messaging(void)
                                         * ecryptfs_message_buf_len),
                                        GFP_KERNEL);
         if (!ecryptfs_msg_ctx_arr) {
+               kfree(ecryptfs_daemon_hash);
                 rc = -ENOMEM;
                 goto out;
         }
diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c

index 119667e658904aa877b462259ce4bd472f35f054..c885cf7d724b4830d0e952acbc91f0db1b3b1f63 100644 (file)
--- a/fs/ext2/inode.c
+++ b/fs/ext2/inode.c
@@ -960,8 +960,9 @@ ext2_writepages(struct address_space *mapping, struct writeback_control *wbc)
  static int
  ext2_dax_writepages(struct address_space *mapping, struct writeback_control *wbc)
  {
-       return dax_writeback_mapping_range(mapping,
-                       mapping->host->i_sb->s_bdev, wbc);
+       struct ext2_sb_info *sbi = EXT2_SB(mapping->host->i_sb);
+
+       return dax_writeback_mapping_range(mapping, sbi->s_daxdev, wbc);
  }
  
  const struct address_space_operations ext2_aops = {
diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c

index 5f993a411251fe54635a468aa89e04ff082c3587..8fd0b3cdab4cdd55e7ad0d374064da49bb2d421e 100644 (file)
--- a/fs/ext4/balloc.c
+++ b/fs/ext4/balloc.c
@@ -270,6 +270,7 @@ struct ext4_group_desc * ext4_get_group_desc(struct super_block *sb,
         ext4_group_t ngroups = ext4_get_groups_count(sb);
         struct ext4_group_desc *desc;
         struct ext4_sb_info *sbi = EXT4_SB(sb);
+       struct buffer_head *bh_p;
  
         if (block_group >= ngroups) {
                 ext4_error(sb, "block_group >= groups_count - block_group = %u,"
@@ -280,7 +281,14 @@ struct ext4_group_desc * ext4_get_group_desc(struct super_block *sb,
  
         group_desc = block_group >> EXT4_DESC_PER_BLOCK_BITS(sb);
         offset = block_group & (EXT4_DESC_PER_BLOCK(sb) - 1);
-       if (!sbi->s_group_desc[group_desc]) {
+       bh_p = sbi_array_rcu_deref(sbi, s_group_desc, group_desc);
+       /*
+        * sbi_array_rcu_deref returns with rcu unlocked, this is ok since
+        * the pointer being dereferenced won't be dereferenced again. By
+        * looking at the usage in add_new_gdb() the value isn't modified,
+        * just the pointer, and so it remains valid.
+        */
+       if (!bh_p) {
                 ext4_error(sb, "Group descriptor not loaded - "
                            "block_group = %u, group_desc = %u, desc = %u",
                            block_group, group_desc, offset);
@@ -288,10 +296,10 @@ struct ext4_group_desc * ext4_get_group_desc(struct super_block *sb,
         }
  
         desc = (struct ext4_group_desc *)(
-               (__u8 *)sbi->s_group_desc[group_desc]->b_data +
+               (__u8 *)bh_p->b_data +
                 offset * EXT4_DESC_SIZE(sb));
         if (bh)
-               *bh = sbi->s_group_desc[group_desc];
+               *bh = bh_p;
         return desc;
  }
  
diff --git a/fs/ext4/block_validity.c b/fs/ext4/block_validity.c

index 1ee04e76bbe0404b95d3a0be205266ff5fdcf4f6..0a734ffb4310616afbfc57940b6c989288d6f0fe 100644 (file)
--- a/fs/ext4/block_validity.c
+++ b/fs/ext4/block_validity.c
@@ -207,6 +207,7 @@ static int ext4_protect_reserved_inode(struct super_block *sb,
                 return PTR_ERR(inode);
         num = (inode->i_size + sb->s_blocksize - 1) >> sb->s_blocksize_bits;
         while (i < num) {
+               cond_resched();
                 map.m_lblk = i;
                 map.m_len = num - i;
                 n = ext4_map_blocks(NULL, inode, &map, 0);
diff --git a/fs/ext4/dir.c b/fs/ext4/dir.c

index 1f340743c9a890152bae85c701a8650d7b76d197..9aa1f75409b02c569ab46a2739790c64f7450123 100644 (file)
--- a/fs/ext4/dir.c
+++ b/fs/ext4/dir.c
@@ -129,12 +129,14 @@ static int ext4_readdir(struct file *file, struct dir_context *ctx)
                 if (err != ERR_BAD_DX_DIR) {
                         return err;
                 }
-               /*
-                * We don't set the inode dirty flag since it's not
-                * critical that it get flushed back to the disk.
-                */
-               ext4_clear_inode_flag(file_inode(file),
-                                     EXT4_INODE_INDEX);
+               /* Can we just clear INDEX flag to ignore htree information? */
+               if (!ext4_has_metadata_csum(sb)) {
+                       /*
+                        * We don't set the inode dirty flag since it's not
+                        * critical that it gets flushed back to the disk.
+                        */
+                       ext4_clear_inode_flag(inode, EXT4_INODE_INDEX);
+               }
         }
  
         if (ext4_has_inline_data(inode)) {
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h

index 9a2ee2428ecc0e5163ef5537a49323fd1dc30d91..61b37a052052b58b577ad617086dbbcdb90b48f3 100644 (file)
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1400,7 +1400,7 @@ struct ext4_sb_info {
         loff_t s_bitmap_maxbytes;       /* max bytes for bitmap files */
         struct buffer_head * s_sbh;     /* Buffer containing the super block */
         struct ext4_super_block *s_es;  /* Pointer to the super block in the buffer */
-       struct buffer_head **s_group_desc;
+       struct buffer_head * __rcu *s_group_desc;
         unsigned int s_mount_opt;
         unsigned int s_mount_opt2;
         unsigned int s_mount_flags;
@@ -1462,7 +1462,7 @@ struct ext4_sb_info {
  #endif
  
         /* for buddy allocator */
-       struct ext4_group_info ***s_group_info;
+       struct ext4_group_info ** __rcu *s_group_info;
         struct inode *s_buddy_cache;
         spinlock_t s_md_lock;
         unsigned short *s_mb_offsets;
@@ -1512,7 +1512,7 @@ struct ext4_sb_info {
         unsigned int s_extent_max_zeroout_kb;
  
         unsigned int s_log_groups_per_flex;
-       struct flex_groups *s_flex_groups;
+       struct flex_groups * __rcu *s_flex_groups;
         ext4_group_t s_flex_groups_allocated;
  
         /* workqueue for reserved extent conversions (buffered io) */
@@ -1552,8 +1552,11 @@ struct ext4_sb_info {
         struct ratelimit_state s_warning_ratelimit_state;
         struct ratelimit_state s_msg_ratelimit_state;
  
-       /* Barrier between changing inodes' journal flags and writepages ops. */
-       struct percpu_rw_semaphore s_journal_flag_rwsem;
+       /*
+        * Barrier between writepages ops and changing any inode's JOURNAL_DATA
+        * or EXTENTS flag.
+        */
+       struct percpu_rw_semaphore s_writepages_rwsem;
         struct dax_device *s_daxdev;
  #ifdef CONFIG_EXT4_DEBUG
         unsigned long s_simulate_fail;
@@ -1576,6 +1579,23 @@ static inline int ext4_valid_inum(struct super_block *sb, unsigned long ino)
                  ino <= le32_to_cpu(EXT4_SB(sb)->s_es->s_inodes_count));
  }
  
+/*
+ * Returns: sbi->field[index]
+ * Used to access an array element from the following sbi fields which require
+ * rcu protection to avoid dereferencing an invalid pointer due to reassignment
+ * - s_group_desc
+ * - s_group_info
+ * - s_flex_groups
+ */
+#define sbi_array_rcu_deref(sbi, field, index)                            \
+({                                                                        \
+       typeof(*((sbi)->field)) _v;                                        \
+       rcu_read_lock();                                                   \
+       _v = ((typeof(_v)*)rcu_dereference((sbi)->field))[index];          \
+       rcu_read_unlock();                                                 \
+       _v;                                                                \
+})
+
  /*
   * Simulate_fail codes
   */
@@ -2544,8 +2564,11 @@ void ext4_insert_dentry(struct inode *inode,
                         struct ext4_filename *fname);
  static inline void ext4_update_dx_flag(struct inode *inode)
  {
-       if (!ext4_has_feature_dir_index(inode->i_sb))
+       if (!ext4_has_feature_dir_index(inode->i_sb)) {
+               /* ext4_iget() should have caught this... */
+               WARN_ON_ONCE(ext4_has_feature_metadata_csum(inode->i_sb));
                 ext4_clear_inode_flag(inode, EXT4_INODE_INDEX);
+       }
  }
  static const unsigned char ext4_filetype_table[] = {
         DT_UNKNOWN, DT_REG, DT_DIR, DT_CHR, DT_BLK, DT_FIFO, DT_SOCK, DT_LNK
@@ -2727,6 +2750,7 @@ extern int ext4_generic_delete_entry(handle_t *handle,
  extern bool ext4_empty_dir(struct inode *inode);
  
  /* resize.c */
+extern void ext4_kvfree_array_rcu(void *to_free);
  extern int ext4_group_add(struct super_block *sb,
                                 struct ext4_new_group_data *input);
  extern int ext4_group_extend(struct super_block *sb,
@@ -2973,13 +2997,13 @@ static inline
  struct ext4_group_info *ext4_get_group_info(struct super_block *sb,
                                             ext4_group_t group)
  {
-        struct ext4_group_info ***grp_info;
+        struct ext4_group_info **grp_info;
          long indexv, indexh;
          BUG_ON(group >= EXT4_SB(sb)->s_groups_count);
-        grp_info = EXT4_SB(sb)->s_group_info;
          indexv = group >> (EXT4_DESC_PER_BLOCK_BITS(sb));
          indexh = group & ((EXT4_DESC_PER_BLOCK(sb)) - 1);
-        return grp_info[indexv][indexh];
+        grp_info = sbi_array_rcu_deref(EXT4_SB(sb), s_group_info, indexv);
+        return grp_info[indexh];
  }
  
  /*
@@ -3029,7 +3053,7 @@ static inline void ext4_update_i_disksize(struct inode *inode, loff_t newsize)
                      !inode_is_locked(inode));
         down_write(&EXT4_I(inode)->i_data_sem);
         if (newsize > EXT4_I(inode)->i_disksize)
-               EXT4_I(inode)->i_disksize = newsize;
+               WRITE_ONCE(EXT4_I(inode)->i_disksize, newsize);
         up_write(&EXT4_I(inode)->i_data_sem);
  }
  
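The ext4_get_group_info() hunk above splits a group number into an outer index (indexv) and an inner index (indexh) and looks up a two-level table. A minimal userspace sketch of just that index arithmetic follows. The names demo_lookup, DEMO_PER_BLOCK_BITS and DEMO_PER_BLOCK are hypothetical, the row width of 4 is an arbitrary assumption, and the RCU protection that sbi_array_rcu_deref() provides for the outer array is deliberately not modeled.

```c
#include <stddef.h>

/* Hypothetical row width: 4 entries per inner array (assumption). */
#define DEMO_PER_BLOCK_BITS 2
#define DEMO_PER_BLOCK (1u << DEMO_PER_BLOCK_BITS)

/*
 * Two-level lookup: the high bits of group pick the inner array,
 * the low bits pick the slot inside it. No locking or RCU here.
 */
static int *demo_lookup(int **table, unsigned int group)
{
	unsigned int indexv = group >> DEMO_PER_BLOCK_BITS;
	unsigned int indexh = group & (DEMO_PER_BLOCK - 1);
	int *row = table[indexv];

	return &row[indexh];
}
```

With two rows of four entries each, group 6 resolves to row 1, slot 2. Only the outer array is replaced when the table grows, which is presumably why the patch fetches the inner array pointer through the RCU helper and then indexes it directly.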
diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
index c66e8f9451a266669bc70a40b40b3ac1af5849d6..f95ee99091e4c55a827d4cdad01bcc89ef8b3e89 100644
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -328,11 +328,13 @@ void ext4_free_inode(handle_t *handle, struct inode *inode)
  
         percpu_counter_inc(&sbi->s_freeinodes_counter);
         if (sbi->s_log_groups_per_flex) {
-               ext4_group_t f = ext4_flex_group(sbi, block_group);
+               struct flex_groups *fg;
  
-               atomic_inc(&sbi->s_flex_groups[f].free_inodes);
+               fg = sbi_array_rcu_deref(sbi, s_flex_groups,
+                                        ext4_flex_group(sbi, block_group));
+               atomic_inc(&fg->free_inodes);
                 if (is_directory)
-                       atomic_dec(&sbi->s_flex_groups[f].used_dirs);
+                       atomic_dec(&fg->used_dirs);
         }
         BUFFER_TRACE(bh2, "call ext4_handle_dirty_metadata");
         fatal = ext4_handle_dirty_metadata(handle, NULL, bh2);
@@ -368,12 +370,13 @@ static void get_orlov_stats(struct super_block *sb, ext4_group_t g,
                             int flex_size, struct orlov_stats *stats)
  {
         struct ext4_group_desc *desc;
-       struct flex_groups *flex_group = EXT4_SB(sb)->s_flex_groups;
  
         if (flex_size > 1) {
-               stats->free_inodes = atomic_read(&flex_group[g].free_inodes);
-               stats->free_clusters = atomic64_read(&flex_group[g].free_clusters);
-               stats->used_dirs = atomic_read(&flex_group[g].used_dirs);
+               struct flex_groups *fg = sbi_array_rcu_deref(EXT4_SB(sb),
+                                                            s_flex_groups, g);
+               stats->free_inodes = atomic_read(&fg->free_inodes);
+               stats->free_clusters = atomic64_read(&fg->free_clusters);
+               stats->used_dirs = atomic_read(&fg->used_dirs);
                 return;
         }
  
@@ -1054,7 +1057,8 @@ got:
                 if (sbi->s_log_groups_per_flex) {
                         ext4_group_t f = ext4_flex_group(sbi, group);
  
-                       atomic_inc(&sbi->s_flex_groups[f].used_dirs);
+                       atomic_inc(&sbi_array_rcu_deref(sbi, s_flex_groups,
+                                                       f)->used_dirs);
                 }
         }
         if (ext4_has_group_desc_csum(sb)) {
@@ -1077,7 +1081,8 @@ got:
  
         if (sbi->s_log_groups_per_flex) {
                 flex_group = ext4_flex_group(sbi, group);
-               atomic_dec(&sbi->s_flex_groups[flex_group].free_inodes);
+               atomic_dec(&sbi_array_rcu_deref(sbi, s_flex_groups,
+                                               flex_group)->free_inodes);
         }
  
         inode->i_ino = ino + group * EXT4_INODES_PER_GROUP(sb);
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 3313168b680f1375376d71f63d0c8f1d95e87739..fa0ff78dc033f8cda2aa123e13c177ed7728dceb 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2465,7 +2465,7 @@ update_disksize:
          * truncate are avoided by checking i_size under i_data_sem.
          */
         disksize = ((loff_t)mpd->first_page) << PAGE_SHIFT;
-       if (disksize > EXT4_I(inode)->i_disksize) {
+       if (disksize > READ_ONCE(EXT4_I(inode)->i_disksize)) {
                 int err2;
                 loff_t i_size;
  
@@ -2628,7 +2628,7 @@ static int ext4_writepages(struct address_space *mapping,
         if (unlikely(ext4_forced_shutdown(EXT4_SB(inode->i_sb))))
                 return -EIO;
  
-       percpu_down_read(&sbi->s_journal_flag_rwsem);
+       percpu_down_read(&sbi->s_writepages_rwsem);
         trace_ext4_writepages(inode, wbc);
  
         /*
@@ -2849,7 +2849,7 @@ unplug:
  out_writepages:
         trace_ext4_writepages_result(inode, wbc, ret,
                                      nr_to_write - wbc->nr_to_write);
-       percpu_up_read(&sbi->s_journal_flag_rwsem);
+       percpu_up_read(&sbi->s_writepages_rwsem);
         return ret;
  }
  
@@ -2864,13 +2864,13 @@ static int ext4_dax_writepages(struct address_space *mapping,
         if (unlikely(ext4_forced_shutdown(EXT4_SB(inode->i_sb))))
                 return -EIO;
  
-       percpu_down_read(&sbi->s_journal_flag_rwsem);
+       percpu_down_read(&sbi->s_writepages_rwsem);
         trace_ext4_writepages(inode, wbc);
  
-       ret = dax_writeback_mapping_range(mapping, inode->i_sb->s_bdev, wbc);
+       ret = dax_writeback_mapping_range(mapping, sbi->s_daxdev, wbc);
         trace_ext4_writepages_result(inode, wbc, ret,
                                      nr_to_write - wbc->nr_to_write);
-       percpu_up_read(&sbi->s_journal_flag_rwsem);
+       percpu_up_read(&sbi->s_writepages_rwsem);
         return ret;
  }
  
@@ -4644,6 +4644,18 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
                 ret = -EFSCORRUPTED;
                 goto bad_inode;
         }
+       /*
+        * If dir_index is not enabled but there's dir with INDEX flag set,
+        * we'd normally treat htree data as empty space. But with metadata
+        * checksumming that corrupts checksums so forbid that.
+        */
+       if (!ext4_has_feature_dir_index(sb) && ext4_has_metadata_csum(sb) &&
+           ext4_test_inode_flag(inode, EXT4_INODE_INDEX)) {
+               ext4_error_inode(inode, function, line, 0,
+                        "iget: Dir with htree data on filesystem without dir_index feature.");
+               ret = -EFSCORRUPTED;
+               goto bad_inode;
+       }
         ei->i_disksize = inode->i_size;
  #ifdef CONFIG_QUOTA
         ei->i_reserved_quota = 0;
@@ -5849,7 +5861,7 @@ int ext4_change_inode_journal_flag(struct inode *inode, int val)
                 }
         }
  
-       percpu_down_write(&sbi->s_journal_flag_rwsem);
+       percpu_down_write(&sbi->s_writepages_rwsem);
         jbd2_journal_lock_updates(journal);
  
         /*
@@ -5866,7 +5878,7 @@ int ext4_change_inode_journal_flag(struct inode *inode, int val)
                 err = jbd2_journal_flush(journal);
                 if (err < 0) {
                         jbd2_journal_unlock_updates(journal);
-                       percpu_up_write(&sbi->s_journal_flag_rwsem);
+                       percpu_up_write(&sbi->s_writepages_rwsem);
                         return err;
                 }
                 ext4_clear_inode_flag(inode, EXT4_INODE_JOURNAL_DATA);
@@ -5874,7 +5886,7 @@ int ext4_change_inode_journal_flag(struct inode *inode, int val)
         ext4_set_aops(inode);
  
         jbd2_journal_unlock_updates(journal);
-       percpu_up_write(&sbi->s_journal_flag_rwsem);
+       percpu_up_write(&sbi->s_writepages_rwsem);
  
         if (val)
                 up_write(&EXT4_I(inode)->i_mmap_sem);
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index f64838187559f25b859ba5acce4d0186a2c4415b..51a78eb65f3cf64744dd82a7ed0735f350a4459f 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -2356,7 +2356,7 @@ int ext4_mb_alloc_groupinfo(struct super_block *sb, ext4_group_t ngroups)
  {
         struct ext4_sb_info *sbi = EXT4_SB(sb);
         unsigned size;
-       struct ext4_group_info ***new_groupinfo;
+       struct ext4_group_info ***old_groupinfo, ***new_groupinfo;
  
         size = (ngroups + EXT4_DESC_PER_BLOCK(sb) - 1) >>
                 EXT4_DESC_PER_BLOCK_BITS(sb);
@@ -2369,13 +2369,16 @@ int ext4_mb_alloc_groupinfo(struct super_block *sb, ext4_group_t ngroups)
                 ext4_msg(sb, KERN_ERR, "can't allocate buddy meta group");
                 return -ENOMEM;
         }
-       if (sbi->s_group_info) {
-               memcpy(new_groupinfo, sbi->s_group_info,
+       rcu_read_lock();
+       old_groupinfo = rcu_dereference(sbi->s_group_info);
+       if (old_groupinfo)
+               memcpy(new_groupinfo, old_groupinfo,
                        sbi->s_group_info_size * sizeof(*sbi->s_group_info));
-               kvfree(sbi->s_group_info);
-       }
-       sbi->s_group_info = new_groupinfo;
+       rcu_read_unlock();
+       rcu_assign_pointer(sbi->s_group_info, new_groupinfo);
         sbi->s_group_info_size = size / sizeof(*sbi->s_group_info);
+       if (old_groupinfo)
+               ext4_kvfree_array_rcu(old_groupinfo);
         ext4_debug("allocated s_groupinfo array for %d meta_bg's\n", 
                    sbi->s_group_info_size);
         return 0;
@@ -2387,6 +2390,7 @@ int ext4_mb_add_groupinfo(struct super_block *sb, ext4_group_t group,
  {
         int i;
         int metalen = 0;
+       int idx = group >> EXT4_DESC_PER_BLOCK_BITS(sb);
         struct ext4_sb_info *sbi = EXT4_SB(sb);
         struct ext4_group_info **meta_group_info;
         struct kmem_cache *cachep = get_groupinfo_cache(sb->s_blocksize_bits);
@@ -2405,12 +2409,12 @@ int ext4_mb_add_groupinfo(struct super_block *sb, ext4_group_t group,
                                  "for a buddy group");
                         goto exit_meta_group_info;
                 }
-               sbi->s_group_info[group >> EXT4_DESC_PER_BLOCK_BITS(sb)] =
-                       meta_group_info;
+               rcu_read_lock();
+               rcu_dereference(sbi->s_group_info)[idx] = meta_group_info;
+               rcu_read_unlock();
         }
  
-       meta_group_info =
-               sbi->s_group_info[group >> EXT4_DESC_PER_BLOCK_BITS(sb)];
+       meta_group_info = sbi_array_rcu_deref(sbi, s_group_info, idx);
         i = group & (EXT4_DESC_PER_BLOCK(sb) - 1);
  
         meta_group_info[i] = kmem_cache_zalloc(cachep, GFP_NOFS);
@@ -2458,8 +2462,13 @@ int ext4_mb_add_groupinfo(struct super_block *sb, ext4_group_t group,
  exit_group_info:
         /* If a meta_group_info table has been allocated, release it now */
         if (group % EXT4_DESC_PER_BLOCK(sb) == 0) {
-               kfree(sbi->s_group_info[group >> EXT4_DESC_PER_BLOCK_BITS(sb)]);
-               sbi->s_group_info[group >> EXT4_DESC_PER_BLOCK_BITS(sb)] = NULL;
+               struct ext4_group_info ***group_info;
+
+               rcu_read_lock();
+               group_info = rcu_dereference(sbi->s_group_info);
+               kfree(group_info[idx]);
+               group_info[idx] = NULL;
+               rcu_read_unlock();
         }
  exit_meta_group_info:
         return -ENOMEM;
@@ -2472,6 +2481,7 @@ static int ext4_mb_init_backend(struct super_block *sb)
         struct ext4_sb_info *sbi = EXT4_SB(sb);
         int err;
         struct ext4_group_desc *desc;
+       struct ext4_group_info ***group_info;
         struct kmem_cache *cachep;
  
         err = ext4_mb_alloc_groupinfo(sb, ngroups);
@@ -2507,11 +2517,16 @@ err_freebuddy:
         while (i-- > 0)
                 kmem_cache_free(cachep, ext4_get_group_info(sb, i));
         i = sbi->s_group_info_size;
+       rcu_read_lock();
+       group_info = rcu_dereference(sbi->s_group_info);
         while (i-- > 0)
-               kfree(sbi->s_group_info[i]);
+               kfree(group_info[i]);
+       rcu_read_unlock();
         iput(sbi->s_buddy_cache);
  err_freesgi:
-       kvfree(sbi->s_group_info);
+       rcu_read_lock();
+       kvfree(rcu_dereference(sbi->s_group_info));
+       rcu_read_unlock();
         return -ENOMEM;
  }
  
@@ -2700,7 +2715,7 @@ int ext4_mb_release(struct super_block *sb)
         ext4_group_t ngroups = ext4_get_groups_count(sb);
         ext4_group_t i;
         int num_meta_group_infos;
-       struct ext4_group_info *grinfo;
+       struct ext4_group_info *grinfo, ***group_info;
         struct ext4_sb_info *sbi = EXT4_SB(sb);
         struct kmem_cache *cachep = get_groupinfo_cache(sb->s_blocksize_bits);
  
@@ -2719,9 +2734,12 @@ int ext4_mb_release(struct super_block *sb)
                 num_meta_group_infos = (ngroups +
                                 EXT4_DESC_PER_BLOCK(sb) - 1) >>
                         EXT4_DESC_PER_BLOCK_BITS(sb);
+               rcu_read_lock();
+               group_info = rcu_dereference(sbi->s_group_info);
                 for (i = 0; i < num_meta_group_infos; i++)
-                       kfree(sbi->s_group_info[i]);
-               kvfree(sbi->s_group_info);
+                       kfree(group_info[i]);
+               kvfree(group_info);
+               rcu_read_unlock();
         }
         kfree(sbi->s_mb_offsets);
         kfree(sbi->s_mb_maxs);
@@ -3020,7 +3038,8 @@ ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac,
                 ext4_group_t flex_group = ext4_flex_group(sbi,
                                                           ac->ac_b_ex.fe_group);
                 atomic64_sub(ac->ac_b_ex.fe_len,
-                            &sbi->s_flex_groups[flex_group].free_clusters);
+                            &sbi_array_rcu_deref(sbi, s_flex_groups,
+                                                 flex_group)->free_clusters);
         }
  
         err = ext4_handle_dirty_metadata(handle, NULL, bitmap_bh);
@@ -4918,7 +4937,8 @@ do_more:
         if (sbi->s_log_groups_per_flex) {
                 ext4_group_t flex_group = ext4_flex_group(sbi, block_group);
                 atomic64_add(count_clusters,
-                            &sbi->s_flex_groups[flex_group].free_clusters);
+                            &sbi_array_rcu_deref(sbi, s_flex_groups,
+                                                 flex_group)->free_clusters);
         }
  
         /*
@@ -5075,7 +5095,8 @@ int ext4_group_add_blocks(handle_t *handle, struct super_block *sb,
         if (sbi->s_log_groups_per_flex) {
                 ext4_group_t flex_group = ext4_flex_group(sbi, block_group);
                 atomic64_add(clusters_freed,
-                            &sbi->s_flex_groups[flex_group].free_clusters);
+                            &sbi_array_rcu_deref(sbi, s_flex_groups,
+                                                 flex_group)->free_clusters);
         }
  
         ext4_mb_unload_buddy(&e4b);
diff --git a/fs/ext4/migrate.c b/fs/ext4/migrate.c
index 89725fa425732e4933a95b60200b33f40158537d..fb6520f37135509791c35b782ba14dfe328cfa49 100644
--- a/fs/ext4/migrate.c
+++ b/fs/ext4/migrate.c
@@ -407,6 +407,7 @@ static int free_ext_block(handle_t *handle, struct inode *inode)
  
  int ext4_ext_migrate(struct inode *inode)
  {
+       struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
         handle_t *handle;
         int retval = 0, i;
         __le32 *i_data;
@@ -431,6 +432,8 @@ int ext4_ext_migrate(struct inode *inode)
                  */
                 return retval;
  
+       percpu_down_write(&sbi->s_writepages_rwsem);
+
         /*
          * Worst case we can touch the allocation bitmaps, a bgd
          * block, and a block to link in the orphan list.  We do need
@@ -441,7 +444,7 @@ int ext4_ext_migrate(struct inode *inode)
  
         if (IS_ERR(handle)) {
                 retval = PTR_ERR(handle);
-               return retval;
+               goto out_unlock;
         }
         goal = (((inode->i_ino - 1) / EXT4_INODES_PER_GROUP(inode->i_sb)) *
                 EXT4_INODES_PER_GROUP(inode->i_sb)) + 1;
@@ -452,7 +455,7 @@ int ext4_ext_migrate(struct inode *inode)
         if (IS_ERR(tmp_inode)) {
                 retval = PTR_ERR(tmp_inode);
                 ext4_journal_stop(handle);
-               return retval;
+               goto out_unlock;
         }
         i_size_write(tmp_inode, i_size_read(inode));
         /*
@@ -494,7 +497,7 @@ int ext4_ext_migrate(struct inode *inode)
                  */
                 ext4_orphan_del(NULL, tmp_inode);
                 retval = PTR_ERR(handle);
-               goto out;
+               goto out_tmp_inode;
         }
  
         ei = EXT4_I(inode);
@@ -576,10 +579,11 @@ err_out:
         ext4_ext_tree_init(handle, tmp_inode);
  out_stop:
         ext4_journal_stop(handle);
-out:
+out_tmp_inode:
         unlock_new_inode(tmp_inode);
         iput(tmp_inode);
-
+out_unlock:
+       percpu_up_write(&sbi->s_writepages_rwsem);
         return retval;
  }
  
@@ -589,7 +593,8 @@ out:
  int ext4_ind_migrate(struct inode *inode)
  {
         struct ext4_extent_header       *eh;
-       struct ext4_super_block         *es = EXT4_SB(inode->i_sb)->s_es;
+       struct ext4_sb_info             *sbi = EXT4_SB(inode->i_sb);
+       struct ext4_super_block         *es = sbi->s_es;
         struct ext4_inode_info          *ei = EXT4_I(inode);
         struct ext4_extent              *ex;
         unsigned int                    i, len;
@@ -613,9 +618,13 @@ int ext4_ind_migrate(struct inode *inode)
         if (test_opt(inode->i_sb, DELALLOC))
                 ext4_alloc_da_blocks(inode);
  
+       percpu_down_write(&sbi->s_writepages_rwsem);
+
         handle = ext4_journal_start(inode, EXT4_HT_MIGRATE, 1);
-       if (IS_ERR(handle))
-               return PTR_ERR(handle);
+       if (IS_ERR(handle)) {
+               ret = PTR_ERR(handle);
+               goto out_unlock;
+       }
  
         down_write(&EXT4_I(inode)->i_data_sem);
         ret = ext4_ext_check_inode(inode);
@@ -650,5 +659,7 @@ int ext4_ind_migrate(struct inode *inode)
  errout:
         ext4_journal_stop(handle);
         up_write(&EXT4_I(inode)->i_data_sem);
+out_unlock:
+       percpu_up_write(&sbi->s_writepages_rwsem);
         return ret;
  }
diff --git a/fs/ext4/mmp.c b/fs/ext4/mmp.c
index 1c44b1a320015d5eda08adffddd4aba47630817c..87f7551c5132ebeee4fc465fb8161af038329b80 100644
--- a/fs/ext4/mmp.c
+++ b/fs/ext4/mmp.c
@@ -120,10 +120,10 @@ void __dump_mmp_msg(struct super_block *sb, struct mmp_struct *mmp,
  {
         __ext4_warning(sb, function, line, "%s", msg);
         __ext4_warning(sb, function, line,
-                      "MMP failure info: last update time: %llu, last update "
-                      "node: %s, last update device: %s",
-                      (long long unsigned int) le64_to_cpu(mmp->mmp_time),
-                      mmp->mmp_nodename, mmp->mmp_bdevname);
+                      "MMP failure info: last update time: %llu, last update node: %.*s, last update device: %.*s",
+                      (unsigned long long)le64_to_cpu(mmp->mmp_time),
+                      (int)sizeof(mmp->mmp_nodename), mmp->mmp_nodename,
+                      (int)sizeof(mmp->mmp_bdevname), mmp->mmp_bdevname);
  }
  
  /*
@@ -154,6 +154,7 @@ static int kmmpd(void *data)
         mmp_check_interval = max(EXT4_MMP_CHECK_MULT * mmp_update_interval,
                                  EXT4_MMP_MIN_CHECK_INTERVAL);
         mmp->mmp_check_interval = cpu_to_le16(mmp_check_interval);
+       BUILD_BUG_ON(sizeof(mmp->mmp_bdevname) < BDEVNAME_SIZE);
         bdevname(bh->b_bdev, mmp->mmp_bdevname);
  
         memcpy(mmp->mmp_nodename, init_utsname()->nodename,
@@ -379,7 +380,8 @@ skip:
         /*
          * Start a kernel thread to update the MMP block periodically.
          */
-       EXT4_SB(sb)->s_mmp_tsk = kthread_run(kmmpd, mmpd_data, "kmmpd-%s",
+       EXT4_SB(sb)->s_mmp_tsk = kthread_run(kmmpd, mmpd_data, "kmmpd-%.*s",
+                                            (int)sizeof(mmp->mmp_bdevname),
                                              bdevname(bh->b_bdev,
                                                       mmp->mmp_bdevname));
         if (IS_ERR(EXT4_SB(sb)->s_mmp_tsk)) {
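The mmp.c hunks above switch from "%s" to "%.*s" with sizeof() of the field as the precision, so that a fixed-width name field with no trailing NUL cannot be read past its end. A minimal userspace sketch of the same idiom follows. struct demo_record and demo_format_node are hypothetical names, the 8-byte field width is an assumption for illustration, and snprintf() stands in for the kernel's own formatting helpers.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a record with a fixed-width name field. */
struct demo_record {
	char nodename[8];	/* may use all 8 bytes, leaving no NUL */
};

/*
 * Format the name into out. The precision passed for "%.*s" caps the
 * number of bytes read from the field, so an unterminated field is safe.
 * Returns what snprintf() returns.
 */
static int demo_format_node(char *out, size_t outlen,
			    const struct demo_record *rec)
{
	return snprintf(out, outlen, "node: %.*s",
			(int)sizeof(rec->nodename), rec->nodename);
}
```

A shorter, NUL-terminated name still prints normally, because the precision is only an upper bound on the bytes consumed.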
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 129d2ebae00d05f198fe7c1d23a432dcb1ed710a..b05ea72f38fd1981dce81b4a8cc37ec8fcad7108 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -1511,6 +1511,7 @@ restart:
                 /*
                  * We deal with the read-ahead logic here.
                  */
+               cond_resched();
                 if (ra_ptr >= ra_max) {
                         /* Refill the readahead buffer */
                         ra_ptr = 0;
@@ -2213,6 +2214,13 @@ static int ext4_add_entry(handle_t *handle, struct dentry *dentry,
                 retval = ext4_dx_add_entry(handle, &fname, dir, inode);
                 if (!retval || (retval != ERR_BAD_DX_DIR))
                         goto out;
+               /* Can we just ignore htree data? */
+               if (ext4_has_metadata_csum(sb)) {
+                       EXT4_ERROR_INODE(dir,
+                               "Directory has corrupted htree index.");
+                       retval = -EFSCORRUPTED;
+                       goto out;
+               }
                 ext4_clear_inode_flag(dir, EXT4_INODE_INDEX);
                 dx_fallback++;
                 ext4_mark_inode_dirty(handle, dir);
diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
index 86a2500ed292f136b93f2ee75e71638567a72646..a50b51270ea9ad976c2f9da7c2cc97b8d2254eb3 100644
--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -17,6 +17,33 @@
  
  #include "ext4_jbd2.h"
  
+struct ext4_rcu_ptr {
+       struct rcu_head rcu;
+       void *ptr;
+};
+
+static void ext4_rcu_ptr_callback(struct rcu_head *head)
+{
+       struct ext4_rcu_ptr *ptr;
+
+       ptr = container_of(head, struct ext4_rcu_ptr, rcu);
+       kvfree(ptr->ptr);
+       kfree(ptr);
+}
+
+void ext4_kvfree_array_rcu(void *to_free)
+{
+       struct ext4_rcu_ptr *ptr = kzalloc(sizeof(*ptr), GFP_KERNEL);
+
+       if (ptr) {
+               ptr->ptr = to_free;
+               call_rcu(&ptr->rcu, ext4_rcu_ptr_callback);
+               return;
+       }
+       synchronize_rcu();
+       kvfree(to_free);
+}
+
  int ext4_resize_begin(struct super_block *sb)
  {
         struct ext4_sb_info *sbi = EXT4_SB(sb);
@@ -542,8 +569,8 @@ static int setup_new_flex_group_blocks(struct super_block *sb,
                                 brelse(gdb);
                                 goto out;
                         }
-                       memcpy(gdb->b_data, sbi->s_group_desc[j]->b_data,
-                              gdb->b_size);
+                       memcpy(gdb->b_data, sbi_array_rcu_deref(sbi,
+                               s_group_desc, j)->b_data, gdb->b_size);
                         set_buffer_uptodate(gdb);
  
                         err = ext4_handle_dirty_metadata(handle, NULL, gdb);
@@ -860,13 +887,15 @@ static int add_new_gdb(handle_t *handle, struct inode *inode,
         }
         brelse(dind);
  
-       o_group_desc = EXT4_SB(sb)->s_group_desc;
+       rcu_read_lock();
+       o_group_desc = rcu_dereference(EXT4_SB(sb)->s_group_desc);
         memcpy(n_group_desc, o_group_desc,
                EXT4_SB(sb)->s_gdb_count * sizeof(struct buffer_head *));
+       rcu_read_unlock();
         n_group_desc[gdb_num] = gdb_bh;
-       EXT4_SB(sb)->s_group_desc = n_group_desc;
+       rcu_assign_pointer(EXT4_SB(sb)->s_group_desc, n_group_desc);
         EXT4_SB(sb)->s_gdb_count++;
-       kvfree(o_group_desc);
+       ext4_kvfree_array_rcu(o_group_desc);
  
         le16_add_cpu(&es->s_reserved_gdt_blocks, -1);
         err = ext4_handle_dirty_super(handle, sb);
@@ -909,9 +938,11 @@ static int add_new_gdb_meta_bg(struct super_block *sb,
                 return err;
         }
  
-       o_group_desc = EXT4_SB(sb)->s_group_desc;
+       rcu_read_lock();
+       o_group_desc = rcu_dereference(EXT4_SB(sb)->s_group_desc);
         memcpy(n_group_desc, o_group_desc,
                EXT4_SB(sb)->s_gdb_count * sizeof(struct buffer_head *));
+       rcu_read_unlock();
         n_group_desc[gdb_num] = gdb_bh;
  
         BUFFER_TRACE(gdb_bh, "get_write_access");
@@ -922,9 +953,9 @@ static int add_new_gdb_meta_bg(struct super_block *sb,
                 return err;
         }
  
-       EXT4_SB(sb)->s_group_desc = n_group_desc;
+       rcu_assign_pointer(EXT4_SB(sb)->s_group_desc, n_group_desc);
         EXT4_SB(sb)->s_gdb_count++;
-       kvfree(o_group_desc);
+       ext4_kvfree_array_rcu(o_group_desc);
         return err;
  }
  
@@ -1188,7 +1219,8 @@ static int ext4_add_new_descs(handle_t *handle, struct super_block *sb,
                  * use non-sparse filesystems anymore.  This is already checked above.
                  */
                 if (gdb_off) {
-                       gdb_bh = sbi->s_group_desc[gdb_num];
+                       gdb_bh = sbi_array_rcu_deref(sbi, s_group_desc,
+                                                    gdb_num);
                         BUFFER_TRACE(gdb_bh, "get_write_access");
                         err = ext4_journal_get_write_access(handle, gdb_bh);
  
@@ -1270,7 +1302,7 @@ static int ext4_setup_new_descs(handle_t *handle, struct super_block *sb,
                 /*
                  * get_write_access() has been called on gdb_bh by ext4_add_new_desc().
                  */
-               gdb_bh = sbi->s_group_desc[gdb_num];
+               gdb_bh = sbi_array_rcu_deref(sbi, s_group_desc, gdb_num);
                 /* Update group descriptor block for new group */
                 gdp = (struct ext4_group_desc *)(gdb_bh->b_data +
                                                  gdb_off * EXT4_DESC_SIZE(sb));
@@ -1398,11 +1430,14 @@ static void ext4_update_super(struct super_block *sb,
                    percpu_counter_read(&sbi->s_freeclusters_counter));
         if (ext4_has_feature_flex_bg(sb) && sbi->s_log_groups_per_flex) {
                 ext4_group_t flex_group;
+               struct flex_groups *fg;
+
                 flex_group = ext4_flex_group(sbi, group_data[0].group);
+               fg = sbi_array_rcu_deref(sbi, s_flex_groups, flex_group);
                 atomic64_add(EXT4_NUM_B2C(sbi, free_blocks),
-                            &sbi->s_flex_groups[flex_group].free_clusters);
+                            &fg->free_clusters);
                 atomic_add(EXT4_INODES_PER_GROUP(sb) * flex_gd->count,
-                          &sbi->s_flex_groups[flex_group].free_inodes);
+                          &fg->free_inodes);
         }
  
         /*
@@ -1497,7 +1532,8 @@ exit_journal:
                 for (; gdb_num <= gdb_num_end; gdb_num++) {
                         struct buffer_head *gdb_bh;
  
-                       gdb_bh = sbi->s_group_desc[gdb_num];
+                       gdb_bh = sbi_array_rcu_deref(sbi, s_group_desc,
+                                                    gdb_num);
                         if (old_gdb == gdb_bh->b_blocknr)
                                 continue;
                         update_backups(sb, gdb_bh->b_blocknr, gdb_bh->b_data,
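The add_new_gdb() and add_new_gdb_meta_bg() hunks above grow an array by copying it into a larger allocation, publishing the new pointer, and only then retiring the old array. A minimal userspace sketch of that ordering follows. struct demo_table and demo_grow are hypothetical names, and a C11 release store stands in for rcu_assign_pointer(). The grace-period wait that ext4_kvfree_array_rcu() performs is not modeled: the caller receives the old array and must free it only once no reader can still hold it.

```c
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable table; readers load slots with acquire ordering. */
struct demo_table {
	_Atomic(int *) slots;
	size_t count;
};

/*
 * Grow the table to new_count entries (single writer assumed).
 * On success returns 0 and hands the replaced array back through
 * *retired (NULL if nothing was replaced). Returns -1 on allocation
 * failure, leaving the table untouched.
 */
static int demo_grow(struct demo_table *t, size_t new_count, int **retired)
{
	int *old = atomic_load_explicit(&t->slots, memory_order_relaxed);
	int *fresh;

	*retired = NULL;
	if (new_count <= t->count)
		return 0;

	fresh = calloc(new_count, sizeof(*fresh));
	if (!fresh)
		return -1;
	if (old)
		memcpy(fresh, old, t->count * sizeof(*fresh));

	/* Publish the fully initialized copy before retiring the old one. */
	atomic_store_explicit(&t->slots, fresh, memory_order_release);
	t->count = new_count;
	*retired = old;
	return 0;
}
```

Returning the old array instead of freeing it in place keeps the "when is it safe to free" decision with the caller, which is roughly the role the call_rcu() callback plays in the patch.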
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 8434217549b3055daac18216914c8d4d235beef3..0c7c4adb664ec993651ae6dabba860ccd7827567 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1014,6 +1014,8 @@ static void ext4_put_super(struct super_block *sb)
  {
         struct ext4_sb_info *sbi = EXT4_SB(sb);
         struct ext4_super_block *es = sbi->s_es;
+       struct buffer_head **group_desc;
+       struct flex_groups **flex_groups;
         int aborted = 0;
         int i, err;
  
@@ -1046,15 +1048,23 @@ static void ext4_put_super(struct super_block *sb)
         if (!sb_rdonly(sb))
                 ext4_commit_super(sb, 1);
  
+       rcu_read_lock();
+       group_desc = rcu_dereference(sbi->s_group_desc);
         for (i = 0; i < sbi->s_gdb_count; i++)
-               brelse(sbi->s_group_desc[i]);
-       kvfree(sbi->s_group_desc);
-       kvfree(sbi->s_flex_groups);
+               brelse(group_desc[i]);
+       kvfree(group_desc);
+       flex_groups = rcu_dereference(sbi->s_flex_groups);
+       if (flex_groups) {
+               for (i = 0; i < sbi->s_flex_groups_allocated; i++)
+                       kvfree(flex_groups[i]);
+               kvfree(flex_groups);
+       }
+       rcu_read_unlock();
         percpu_counter_destroy(&sbi->s_freeclusters_counter);
         percpu_counter_destroy(&sbi->s_freeinodes_counter);
         percpu_counter_destroy(&sbi->s_dirs_counter);
         percpu_counter_destroy(&sbi->s_dirtyclusters_counter);
-       percpu_free_rwsem(&sbi->s_journal_flag_rwsem);
+       percpu_free_rwsem(&sbi->s_writepages_rwsem);
  #ifdef CONFIG_QUOTA
         for (i = 0; i < EXT4_MAXQUOTAS; i++)
                 kfree(get_qf_name(sb, sbi, i));
@@ -2380,8 +2390,8 @@ done:
  int ext4_alloc_flex_bg_array(struct super_block *sb, ext4_group_t ngroup)
  {
         struct ext4_sb_info *sbi = EXT4_SB(sb);
-       struct flex_groups *new_groups;
-       int size;
+       struct flex_groups **old_groups, **new_groups;
+       int size, i, j;
  
         if (!sbi->s_log_groups_per_flex)
                 return 0;
@@ -2390,22 +2400,37 @@ int ext4_alloc_flex_bg_array(struct super_block *sb, ext4_group_t ngroup)
         if (size <= sbi->s_flex_groups_allocated)
                 return 0;
  
-       size = roundup_pow_of_two(size * sizeof(struct flex_groups));
-       new_groups = kvzalloc(size, GFP_KERNEL);
+       new_groups = kvzalloc(roundup_pow_of_two(size *
+                             sizeof(*sbi->s_flex_groups)), GFP_KERNEL);
         if (!new_groups) {
-               ext4_msg(sb, KERN_ERR, "not enough memory for %d flex groups",
-                        size / (int) sizeof(struct flex_groups));
+               ext4_msg(sb, KERN_ERR,
+                        "not enough memory for %d flex group pointers", size);
                 return -ENOMEM;
         }
-
-       if (sbi->s_flex_groups) {
-               memcpy(new_groups, sbi->s_flex_groups,
-                      (sbi->s_flex_groups_allocated *
-                       sizeof(struct flex_groups)));
-               kvfree(sbi->s_flex_groups);
+       for (i = sbi->s_flex_groups_allocated; i < size; i++) {
+               new_groups[i] = kvzalloc(roundup_pow_of_two(
+                                        sizeof(struct flex_groups)),
+                                        GFP_KERNEL);
+               if (!new_groups[i]) {
+                       for (j = sbi->s_flex_groups_allocated; j < i; j++)
+                               kvfree(new_groups[j]);
+                       kvfree(new_groups);
+                       ext4_msg(sb, KERN_ERR,
+                                "not enough memory for %d flex groups", size);
+                       return -ENOMEM;
+               }
         }
-       sbi->s_flex_groups = new_groups;
-       sbi->s_flex_groups_allocated = size / sizeof(struct flex_groups);
+       rcu_read_lock();
+       old_groups = rcu_dereference(sbi->s_flex_groups);
+       if (old_groups)
+               memcpy(new_groups, old_groups,
+                      (sbi->s_flex_groups_allocated *
+                       sizeof(struct flex_groups *)));
+       rcu_read_unlock();
+       rcu_assign_pointer(sbi->s_flex_groups, new_groups);
+       sbi->s_flex_groups_allocated = size;
+       if (old_groups)
+               ext4_kvfree_array_rcu(old_groups);
         return 0;
  }
  
@@ -2413,6 +2438,7 @@ static int ext4_fill_flex_info(struct super_block *sb)
  {
         struct ext4_sb_info *sbi = EXT4_SB(sb);
         struct ext4_group_desc *gdp = NULL;
+       struct flex_groups *fg;
         ext4_group_t flex_group;
         int i, err;
  
@@ -2430,12 +2456,11 @@ static int ext4_fill_flex_info(struct super_block *sb)
                 gdp = ext4_get_group_desc(sb, i, NULL);
  
                 flex_group = ext4_flex_group(sbi, i);
-               atomic_add(ext4_free_inodes_count(sb, gdp),
-                          &sbi->s_flex_groups[flex_group].free_inodes);
+               fg = sbi_array_rcu_deref(sbi, s_flex_groups, flex_group);
+               atomic_add(ext4_free_inodes_count(sb, gdp), &fg->free_inodes);
                 atomic64_add(ext4_free_group_clusters(sb, gdp),
-                            &sbi->s_flex_groups[flex_group].free_clusters);
-               atomic_add(ext4_used_dirs_count(sb, gdp),
-                          &sbi->s_flex_groups[flex_group].used_dirs);
+                            &fg->free_clusters);
+               atomic_add(ext4_used_dirs_count(sb, gdp), &fg->used_dirs);
         }
  
         return 1;
@@ -3009,17 +3034,11 @@ static int ext4_feature_set_ok(struct super_block *sb, int readonly)
                 return 0;
         }
  
-#ifndef CONFIG_QUOTA
-       if (ext4_has_feature_quota(sb) && !readonly) {
+#if !IS_ENABLED(CONFIG_QUOTA) || !IS_ENABLED(CONFIG_QFMT_V2)
+       if (!readonly && (ext4_has_feature_quota(sb) ||
+                         ext4_has_feature_project(sb))) {
                 ext4_msg(sb, KERN_ERR,
-                        "Filesystem with quota feature cannot be mounted RDWR "
-                        "without CONFIG_QUOTA");
-               return 0;
-       }
-       if (ext4_has_feature_project(sb) && !readonly) {
-               ext4_msg(sb, KERN_ERR,
-                        "Filesystem with project quota feature cannot be mounted RDWR "
-                        "without CONFIG_QUOTA");
+                        "The kernel was not built with CONFIG_QUOTA and CONFIG_QFMT_V2");
                 return 0;
         }
  #endif  /* CONFIG_QUOTA */
@@ -3640,9 +3659,10 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
  {
         struct dax_device *dax_dev = fs_dax_get_by_bdev(sb->s_bdev);
         char *orig_data = kstrdup(data, GFP_KERNEL);
-       struct buffer_head *bh;
+       struct buffer_head *bh, **group_desc;
         struct ext4_super_block *es = NULL;
         struct ext4_sb_info *sbi = kzalloc(sizeof(*sbi), GFP_KERNEL);
+       struct flex_groups **flex_groups;
         ext4_fsblk_t block;
         ext4_fsblk_t sb_block = get_sb_block(&data);
         ext4_fsblk_t logical_sb_block;
@@ -3814,6 +3834,15 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
          */
         sbi->s_li_wait_mult = EXT4_DEF_LI_WAIT_MULT;
  
+       blocksize = BLOCK_SIZE << le32_to_cpu(es->s_log_block_size);
+       if (blocksize < EXT4_MIN_BLOCK_SIZE ||
+           blocksize > EXT4_MAX_BLOCK_SIZE) {
+               ext4_msg(sb, KERN_ERR,
+                      "Unsupported filesystem blocksize %d (%d log_block_size)",
+                        blocksize, le32_to_cpu(es->s_log_block_size));
+               goto failed_mount;
+       }
+
         if (le32_to_cpu(es->s_rev_level) == EXT4_GOOD_OLD_REV) {
                 sbi->s_inode_size = EXT4_GOOD_OLD_INODE_SIZE;
                 sbi->s_first_ino = EXT4_GOOD_OLD_FIRST_INO;
@@ -3831,6 +3860,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
                         ext4_msg(sb, KERN_ERR,
                                "unsupported inode size: %d",
                                sbi->s_inode_size);
+                       ext4_msg(sb, KERN_ERR, "blocksize: %d", blocksize);
                         goto failed_mount;
                 }
                 /*
@@ -4033,14 +4063,6 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
         if (!ext4_feature_set_ok(sb, (sb_rdonly(sb))))
                 goto failed_mount;
  
-       blocksize = BLOCK_SIZE << le32_to_cpu(es->s_log_block_size);
-       if (blocksize < EXT4_MIN_BLOCK_SIZE ||
-           blocksize > EXT4_MAX_BLOCK_SIZE) {
-               ext4_msg(sb, KERN_ERR,
-                      "Unsupported filesystem blocksize %d (%d log_block_size)",
-                        blocksize, le32_to_cpu(es->s_log_block_size));
-               goto failed_mount;
-       }
         if (le32_to_cpu(es->s_log_block_size) >
             (EXT4_MAX_BLOCK_LOG_SIZE - EXT4_MIN_BLOCK_LOG_SIZE)) {
                 ext4_msg(sb, KERN_ERR,
@@ -4294,9 +4316,10 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
                         goto failed_mount;
                 }
         }
-       sbi->s_group_desc = kvmalloc_array(db_count,
-                                          sizeof(struct buffer_head *),
-                                          GFP_KERNEL);
+       rcu_assign_pointer(sbi->s_group_desc,
+                          kvmalloc_array(db_count,
+                                         sizeof(struct buffer_head *),
+                                         GFP_KERNEL));
         if (sbi->s_group_desc == NULL) {
                 ext4_msg(sb, KERN_ERR, "not enough memory");
                 ret = -ENOMEM;
@@ -4312,14 +4335,19 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
         }
  
         for (i = 0; i < db_count; i++) {
+               struct buffer_head *bh;
+
                 block = descriptor_loc(sb, logical_sb_block, i);
-               sbi->s_group_desc[i] = sb_bread_unmovable(sb, block);
-               if (!sbi->s_group_desc[i]) {
+               bh = sb_bread_unmovable(sb, block);
+               if (!bh) {
                         ext4_msg(sb, KERN_ERR,
                                "can't read group descriptor %d", i);
                         db_count = i;
                         goto failed_mount2;
                 }
+               rcu_read_lock();
+               rcu_dereference(sbi->s_group_desc)[i] = bh;
+               rcu_read_unlock();
         }
         sbi->s_gdb_count = db_count;
         if (!ext4_check_descriptors(sb, logical_sb_block, &first_not_zeroed)) {
@@ -4598,7 +4626,7 @@ no_journal:
                 err = percpu_counter_init(&sbi->s_dirtyclusters_counter, 0,
                                           GFP_KERNEL);
         if (!err)
-               err = percpu_init_rwsem(&sbi->s_journal_flag_rwsem);
+               err = percpu_init_rwsem(&sbi->s_writepages_rwsem);
  
         if (err) {
                 ext4_msg(sb, KERN_ERR, "insufficient memory");
@@ -4686,13 +4714,19 @@ failed_mount7:
         ext4_unregister_li_request(sb);
  failed_mount6:
         ext4_mb_release(sb);
-       if (sbi->s_flex_groups)
-               kvfree(sbi->s_flex_groups);
+       rcu_read_lock();
+       flex_groups = rcu_dereference(sbi->s_flex_groups);
+       if (flex_groups) {
+               for (i = 0; i < sbi->s_flex_groups_allocated; i++)
+                       kvfree(flex_groups[i]);
+               kvfree(flex_groups);
+       }
+       rcu_read_unlock();
         percpu_counter_destroy(&sbi->s_freeclusters_counter);
         percpu_counter_destroy(&sbi->s_freeinodes_counter);
         percpu_counter_destroy(&sbi->s_dirs_counter);
         percpu_counter_destroy(&sbi->s_dirtyclusters_counter);
-       percpu_free_rwsem(&sbi->s_journal_flag_rwsem);
+       percpu_free_rwsem(&sbi->s_writepages_rwsem);
  failed_mount5:
         ext4_ext_release(sb);
         ext4_release_system_zone(sb);
@@ -4721,9 +4755,12 @@ failed_mount3:
         if (sbi->s_mmp_tsk)
                 kthread_stop(sbi->s_mmp_tsk);
  failed_mount2:
+       rcu_read_lock();
+       group_desc = rcu_dereference(sbi->s_group_desc);
         for (i = 0; i < db_count; i++)
-               brelse(sbi->s_group_desc[i]);
-       kvfree(sbi->s_group_desc);
+               brelse(group_desc[i]);
+       kvfree(group_desc);
+       rcu_read_unlock();
  failed_mount:
         if (sbi->s_chksum_driver)
                 crypto_free_shash(sbi->s_chksum_driver);
@@ -5585,10 +5622,7 @@ static int ext4_statfs_project(struct super_block *sb,
                 return PTR_ERR(dquot);
         spin_lock(&dquot->dq_dqb_lock);
  
-       limit = 0;
-       if (dquot->dq_dqb.dqb_bsoftlimit &&
-           (!limit || dquot->dq_dqb.dqb_bsoftlimit < limit))
-               limit = dquot->dq_dqb.dqb_bsoftlimit;
+       limit = dquot->dq_dqb.dqb_bsoftlimit;
         if (dquot->dq_dqb.dqb_bhardlimit &&
             (!limit || dquot->dq_dqb.dqb_bhardlimit < limit))
                 limit = dquot->dq_dqb.dqb_bhardlimit;
@@ -5603,10 +5637,7 @@ static int ext4_statfs_project(struct super_block *sb,
                          (buf->f_blocks - curblock) : 0;
         }
  
-       limit = 0;
-       if (dquot->dq_dqb.dqb_isoftlimit &&
-           (!limit || dquot->dq_dqb.dqb_isoftlimit < limit))
-               limit = dquot->dq_dqb.dqb_isoftlimit;
+       limit = dquot->dq_dqb.dqb_isoftlimit;
         if (dquot->dq_dqb.dqb_ihardlimit &&
             (!limit || dquot->dq_dqb.dqb_ihardlimit < limit))
                 limit = dquot->dq_dqb.dqb_ihardlimit;
diff --git a/fs/io-wq.c b/fs/io-wq.c
index cb60a42b9fdfa5bb1aa1832e3d0277051d821c69..bf8ed1b0b90a01ce356588ed8d9c1184ca2510f7 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -16,6 +16,7 @@
  #include <linux/slab.h>
  #include <linux/kthread.h>
  #include <linux/rculist_nulls.h>
+#include <linux/fs_struct.h>
  
  #include "io-wq.h"
  
@@ -59,6 +60,7 @@ struct io_worker {
         const struct cred *cur_creds;
         const struct cred *saved_creds;
         struct files_struct *restore_files;
+       struct fs_struct *restore_fs;
  };
  
  #if BITS_PER_LONG == 64
@@ -151,6 +153,9 @@ static bool __io_worker_unuse(struct io_wqe *wqe, struct io_worker *worker)
                 task_unlock(current);
         }
  
+       if (current->fs != worker->restore_fs)
+               current->fs = worker->restore_fs;
+
         /*
          * If we have an active mm, we need to drop the wq lock before unusing
          * it. If we do, return true and let the caller retry the idle loop.
@@ -311,6 +316,7 @@ static void io_worker_start(struct io_wqe *wqe, struct io_worker *worker)
  
         worker->flags |= (IO_WORKER_F_UP | IO_WORKER_F_RUNNING);
         worker->restore_files = current->files;
+       worker->restore_fs = current->fs;
         io_wqe_inc_running(wqe, worker);
  }
  
@@ -481,6 +487,8 @@ next:
                         current->files = work->files;
                         task_unlock(current);
                 }
+               if (work->fs && current->fs != work->fs)
+                       current->fs = work->fs;
                 if (work->mm != worker->mm)
                         io_wq_switch_mm(worker, work);
                 if (worker->cur_creds != work->creds)
@@ -527,42 +535,23 @@ next:
         } while (1);
  }
  
-static inline void io_worker_spin_for_work(struct io_wqe *wqe)
-{
-       int i = 0;
-
-       while (++i < 1000) {
-               if (io_wqe_run_queue(wqe))
-                       break;
-               if (need_resched())
-                       break;
-               cpu_relax();
-       }
-}
-
  static int io_wqe_worker(void *data)
  {
         struct io_worker *worker = data;
         struct io_wqe *wqe = worker->wqe;
         struct io_wq *wq = wqe->wq;
-       bool did_work;
  
         io_worker_start(wqe, worker);
  
-       did_work = false;
         while (!test_bit(IO_WQ_BIT_EXIT, &wq->state)) {
                 set_current_state(TASK_INTERRUPTIBLE);
  loop:
-               if (did_work)
-                       io_worker_spin_for_work(wqe);
                 spin_lock_irq(&wqe->lock);
                 if (io_wqe_run_queue(wqe)) {
                         __set_current_state(TASK_RUNNING);
                         io_worker_handle_work(worker);
-                       did_work = true;
                         goto loop;
                 }
-               did_work = false;
                 /* drops the lock on success, retry */
                 if (__io_worker_idle(wqe, worker)) {
                         __release(&wqe->lock);
@@ -691,11 +680,16 @@ static int io_wq_manager(void *data)
         /* create fixed workers */
         refcount_set(&wq->refs, workers_to_create);
         for_each_node(node) {
+               if (!node_online(node))
+                       continue;
                 if (!create_io_worker(wq, wq->wqes[node], IO_WQ_ACCT_BOUND))
                         goto err;
                 workers_to_create--;
         }
  
+       while (workers_to_create--)
+               refcount_dec(&wq->refs);
+
         complete(&wq->done);
  
         while (!kthread_should_stop()) {
@@ -703,6 +697,9 @@ static int io_wq_manager(void *data)
                         struct io_wqe *wqe = wq->wqes[node];
                         bool fork_worker[2] = { false, false };
  
+                       if (!node_online(node))
+                               continue;
+
                         spin_lock_irq(&wqe->lock);
                         if (io_wqe_need_worker(wqe, IO_WQ_ACCT_BOUND))
                                 fork_worker[IO_WQ_ACCT_BOUND] = true;
@@ -821,7 +818,9 @@ static bool io_wq_for_each_worker(struct io_wqe *wqe,
  
         list_for_each_entry_rcu(worker, &wqe->all_list, all_list) {
                 if (io_worker_get(worker)) {
-                       ret = func(worker, data);
+                       /* no task if node is/was offline */
+                       if (worker->task)
+                               ret = func(worker, data);
                         io_worker_release(worker);
                         if (ret)
                                 break;
@@ -929,17 +928,19 @@ enum io_wq_cancel io_wq_cancel_cb(struct io_wq *wq, work_cancel_fn *cancel,
         return ret;
  }
  
+struct work_match {
+       bool (*fn)(struct io_wq_work *, void *data);
+       void *data;
+};
+
  static bool io_wq_worker_cancel(struct io_worker *worker, void *data)
  {
-       struct io_wq_work *work = data;
+       struct work_match *match = data;
         unsigned long flags;
         bool ret = false;
  
-       if (worker->cur_work != work)
-               return false;
-
         spin_lock_irqsave(&worker->lock, flags);
-       if (worker->cur_work == work &&
+       if (match->fn(worker->cur_work, match->data) &&
             !(worker->cur_work->flags & IO_WQ_WORK_NO_CANCEL)) {
                 send_sig(SIGINT, worker->task, 1);
                 ret = true;
@@ -950,15 +951,13 @@ static bool io_wq_worker_cancel(struct io_worker *worker, void *data)
  }
  
  static enum io_wq_cancel io_wqe_cancel_work(struct io_wqe *wqe,
-                                           struct io_wq_work *cwork)
+                                           struct work_match *match)
  {
         struct io_wq_work_node *node, *prev;
         struct io_wq_work *work;
         unsigned long flags;
         bool found = false;
  
-       cwork->flags |= IO_WQ_WORK_CANCEL;
-
         /*
          * First check pending list, if we're lucky we can just remove it
          * from there. CANCEL_OK means that the work is returned as-new,
@@ -968,7 +967,7 @@ static enum io_wq_cancel io_wqe_cancel_work(struct io_wqe *wqe,
         wq_list_for_each(node, prev, &wqe->work_list) {
                 work = container_of(node, struct io_wq_work, list);
  
-               if (work == cwork) {
+               if (match->fn(work, match->data)) {
                         wq_node_del(&wqe->work_list, node, prev);
                         found = true;
                         break;
@@ -989,20 +988,60 @@ static enum io_wq_cancel io_wqe_cancel_work(struct io_wqe *wqe,
          * completion will run normally in this case.
          */
         rcu_read_lock();
-       found = io_wq_for_each_worker(wqe, io_wq_worker_cancel, cwork);
+       found = io_wq_for_each_worker(wqe, io_wq_worker_cancel, match);
         rcu_read_unlock();
         return found ? IO_WQ_CANCEL_RUNNING : IO_WQ_CANCEL_NOTFOUND;
  }
  
+static bool io_wq_work_match(struct io_wq_work *work, void *data)
+{
+       return work == data;
+}
+
  enum io_wq_cancel io_wq_cancel_work(struct io_wq *wq, struct io_wq_work *cwork)
  {
+       struct work_match match = {
+               .fn     = io_wq_work_match,
+               .data   = cwork
+       };
+       enum io_wq_cancel ret = IO_WQ_CANCEL_NOTFOUND;
+       int node;
+
+       cwork->flags |= IO_WQ_WORK_CANCEL;
+
+       for_each_node(node) {
+               struct io_wqe *wqe = wq->wqes[node];
+
+               ret = io_wqe_cancel_work(wqe, &match);
+               if (ret != IO_WQ_CANCEL_NOTFOUND)
+                       break;
+       }
+
+       return ret;
+}
+
+static bool io_wq_pid_match(struct io_wq_work *work, void *data)
+{
+       pid_t pid = (pid_t) (unsigned long) data;
+
+       if (work)
+               return work->task_pid == pid;
+       return false;
+}
+
+enum io_wq_cancel io_wq_cancel_pid(struct io_wq *wq, pid_t pid)
+{
+       struct work_match match = {
+               .fn     = io_wq_pid_match,
+               .data   = (void *) (unsigned long) pid
+       };
         enum io_wq_cancel ret = IO_WQ_CANCEL_NOTFOUND;
         int node;
  
         for_each_node(node) {
                 struct io_wqe *wqe = wq->wqes[node];
  
-               ret = io_wqe_cancel_work(wqe, cwork);
+               ret = io_wqe_cancel_work(wqe, &match);
                 if (ret != IO_WQ_CANCEL_NOTFOUND)
                         break;
         }
@@ -1036,6 +1075,8 @@ void io_wq_flush(struct io_wq *wq)
         for_each_node(node) {
                 struct io_wqe *wqe = wq->wqes[node];
  
+               if (!node_online(node))
+                       continue;
                 init_completion(&data.done);
                 INIT_IO_WORK(&data.work, io_wq_flush_func);
                 data.work.flags |= IO_WQ_WORK_INTERNAL;
@@ -1067,12 +1108,15 @@ struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data)
  
         for_each_node(node) {
                 struct io_wqe *wqe;
+               int alloc_node = node;
  
-               wqe = kzalloc_node(sizeof(struct io_wqe), GFP_KERNEL, node);
+               if (!node_online(alloc_node))
+                       alloc_node = NUMA_NO_NODE;
+               wqe = kzalloc_node(sizeof(struct io_wqe), GFP_KERNEL, alloc_node);
                 if (!wqe)
                         goto err;
                 wq->wqes[node] = wqe;
-               wqe->node = node;
+               wqe->node = alloc_node;
                 wqe->acct[IO_WQ_ACCT_BOUND].max_workers = bounded;
                 atomic_set(&wqe->acct[IO_WQ_ACCT_BOUND].nr_running, 0);
                 if (wq->user) {
@@ -1080,7 +1124,6 @@ struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data)
                                         task_rlimit(current, RLIMIT_NPROC);
                 }
                 atomic_set(&wqe->acct[IO_WQ_ACCT_UNBOUND].nr_running, 0);
-               wqe->node = node;
                 wqe->wq = wq;
                 spin_lock_init(&wqe->lock);
                 INIT_WQ_LIST(&wqe->work_list);
diff --git a/fs/io-wq.h b/fs/io-wq.h
index 50b3378febf2f3bb01d9b9cf0a8f582d0a72a50e..33baba4370c5f26365e9f9ffc75cba71fb364a04 100644
--- a/fs/io-wq.h
+++ b/fs/io-wq.h
@@ -74,18 +74,15 @@ struct io_wq_work {
         struct files_struct *files;
         struct mm_struct *mm;
         const struct cred *creds;
+       struct fs_struct *fs;
         unsigned flags;
+       pid_t task_pid;
  };
  
-#define INIT_IO_WORK(work, _func)                      \
-       do {                                            \
-               (work)->list.next = NULL;               \
-               (work)->func = _func;                   \
-               (work)->flags = 0;                      \
-               (work)->files = NULL;                   \
-               (work)->mm = NULL;                      \
-               (work)->creds = NULL;                   \
-       } while (0)                                     \
+#define INIT_IO_WORK(work, _func)                              \
+       do {                                                    \
+               *(work) = (struct io_wq_work){ .func = _func }; \
+       } while (0)                                             \
  
  typedef void (get_work_fn)(struct io_wq_work *);
  typedef void (put_work_fn)(struct io_wq_work *);
@@ -107,6 +104,7 @@ void io_wq_flush(struct io_wq *wq);
  
  void io_wq_cancel_all(struct io_wq *wq);
  enum io_wq_cancel io_wq_cancel_work(struct io_wq *wq, struct io_wq_work *cwork);
+enum io_wq_cancel io_wq_cancel_pid(struct io_wq *wq, pid_t pid);
  
  typedef bool (work_cancel_fn)(struct io_wq_work *, void *);
  
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 77f22c3da30f59108cb207aca8c78c3f23b6720d..6a595c13e108ce2ad6a564e580fc271626623796 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -75,6 +75,7 @@
  #include <linux/fsnotify.h>
  #include <linux/fadvise.h>
  #include <linux/eventpoll.h>
+#include <linux/fs_struct.h>
  
  #define CREATE_TRACE_POINTS
  #include <trace/events/io_uring.h>
@@ -182,17 +183,12 @@ struct fixed_file_table {
         struct file             **files;
  };
  
-enum {
-       FFD_F_ATOMIC,
-};
-
  struct fixed_file_data {
         struct fixed_file_table         *table;
         struct io_ring_ctx              *ctx;
  
         struct percpu_ref               refs;
         struct llist_head               put_llist;
-       unsigned long                   state;
         struct work_struct              ref_work;
         struct completion               done;
  };
@@ -204,11 +200,11 @@ struct io_ring_ctx {
  
         struct {
                 unsigned int            flags;
-               int                     compat: 1;
-               int                     account_mem: 1;
-               int                     cq_overflow_flushed: 1;
-               int                     drain_next: 1;
-               int                     eventfd_async: 1;
+               unsigned int            compat: 1;
+               unsigned int            account_mem: 1;
+               unsigned int            cq_overflow_flushed: 1;
+               unsigned int            drain_next: 1;
+               unsigned int            eventfd_async: 1;
  
                 /*
                  * Ring buffer of indices into array of io_uring_sqe, which is
@@ -441,6 +437,7 @@ struct io_async_msghdr {
         struct iovec                    *iov;
         struct sockaddr __user          *uaddr;
         struct msghdr                   msg;
+       struct sockaddr_storage         addr;
  };
  
  struct io_async_rw {
@@ -450,17 +447,12 @@ struct io_async_rw {
         ssize_t                         size;
  };
  
-struct io_async_open {
-       struct filename                 *filename;
-};
-
  struct io_async_ctx {
         union {
                 struct io_async_rw      rw;
                 struct io_async_msghdr  msg;
                 struct io_async_connect connect;
                 struct io_timeout_data  timeout;
-               struct io_async_open    open;
         };
  };
  
@@ -483,6 +475,8 @@ enum {
         REQ_F_MUST_PUNT_BIT,
         REQ_F_TIMEOUT_NOSEQ_BIT,
         REQ_F_COMP_LOCKED_BIT,
+       REQ_F_NEED_CLEANUP_BIT,
+       REQ_F_OVERFLOW_BIT,
  };
  
  enum {
@@ -521,6 +515,10 @@ enum {
         REQ_F_TIMEOUT_NOSEQ     = BIT(REQ_F_TIMEOUT_NOSEQ_BIT),
         /* completion under lock */
         REQ_F_COMP_LOCKED       = BIT(REQ_F_COMP_LOCKED_BIT),
+       /* needs cleanup */
+       REQ_F_NEED_CLEANUP      = BIT(REQ_F_NEED_CLEANUP_BIT),
+       /* in overflow list */
+       REQ_F_OVERFLOW          = BIT(REQ_F_OVERFLOW_BIT),
  };
  
  /*
@@ -553,7 +551,6 @@ struct io_kiocb {
          * llist_node is only used for poll deferred completions
          */
         struct llist_node               llist_node;
-       bool                            has_user;
         bool                            in_async;
         bool                            needs_fixed_file;
         u8                              opcode;
@@ -614,6 +611,8 @@ struct io_op_def {
         unsigned                not_supported : 1;
         /* needs file table */
         unsigned                file_table : 1;
+       /* needs ->fs */
+       unsigned                needs_fs : 1;
  };
  
  static const struct io_op_def io_op_defs[] = {
@@ -656,12 +655,14 @@ static const struct io_op_def io_op_defs[] = {
                 .needs_mm               = 1,
                 .needs_file             = 1,
                 .unbound_nonreg_file    = 1,
+               .needs_fs               = 1,
         },
         [IORING_OP_RECVMSG] = {
                 .async_ctx              = 1,
                 .needs_mm               = 1,
                 .needs_file             = 1,
                 .unbound_nonreg_file    = 1,
+               .needs_fs               = 1,
         },
         [IORING_OP_TIMEOUT] = {
                 .async_ctx              = 1,
@@ -692,6 +693,7 @@ static const struct io_op_def io_op_defs[] = {
                 .needs_file             = 1,
                 .fd_non_neg             = 1,
                 .file_table             = 1,
+               .needs_fs               = 1,
         },
         [IORING_OP_CLOSE] = {
                 .needs_file             = 1,
@@ -705,6 +707,7 @@ static const struct io_op_def io_op_defs[] = {
                 .needs_mm               = 1,
                 .needs_file             = 1,
                 .fd_non_neg             = 1,
+               .needs_fs               = 1,
         },
         [IORING_OP_READ] = {
                 .needs_mm               = 1,
@@ -736,6 +739,7 @@ static const struct io_op_def io_op_defs[] = {
                 .needs_file             = 1,
                 .fd_non_neg             = 1,
                 .file_table             = 1,
+               .needs_fs               = 1,
         },
         [IORING_OP_EPOLL_CTL] = {
                 .unbound_nonreg_file    = 1,
@@ -754,6 +758,7 @@ static int __io_sqe_files_update(struct io_ring_ctx *ctx,
                                  unsigned nr_args);
  static int io_grab_files(struct io_kiocb *req);
  static void io_ring_file_ref_flush(struct fixed_file_data *data);
+static void io_cleanup_req(struct io_kiocb *req);
  
  static struct kmem_cache *req_cachep;
  
@@ -909,6 +914,18 @@ static inline void io_req_work_grab_env(struct io_kiocb *req,
         }
         if (!req->work.creds)
                 req->work.creds = get_current_cred();
+       if (!req->work.fs && def->needs_fs) {
+               spin_lock(&current->fs->lock);
+               if (!current->fs->in_exec) {
+                       req->work.fs = current->fs;
+                       req->work.fs->users++;
+               } else {
+                       req->work.flags |= IO_WQ_WORK_CANCEL;
+               }
+               spin_unlock(&current->fs->lock);
+       }
+       if (!req->work.task_pid)
+               req->work.task_pid = task_pid_vnr(current);
  }
  
  static inline void io_req_work_drop_env(struct io_kiocb *req)
@@ -921,6 +938,16 @@ static inline void io_req_work_drop_env(struct io_kiocb *req)
                 put_cred(req->work.creds);
                 req->work.creds = NULL;
         }
+       if (req->work.fs) {
+               struct fs_struct *fs = req->work.fs;
+
+               spin_lock(&req->work.fs->lock);
+               if (--fs->users)
+                       fs = NULL;
+               spin_unlock(&req->work.fs->lock);
+               if (fs)
+                       free_fs_struct(fs);
+       }
  }
  
  static inline bool io_prep_async_work(struct io_kiocb *req,
@@ -1074,6 +1101,7 @@ static bool io_cqring_overflow_flush(struct io_ring_ctx *ctx, bool force)
                 req = list_first_entry(&ctx->cq_overflow_list, struct io_kiocb,
                                                 list);
                 list_move(&req->list, &list);
+               req->flags &= ~REQ_F_OVERFLOW;
                 if (cqe) {
                         WRITE_ONCE(cqe->user_data, req->user_data);
                         WRITE_ONCE(cqe->res, req->result);
@@ -1126,6 +1154,7 @@ static void io_cqring_fill_event(struct io_kiocb *req, long res)
                         set_bit(0, &ctx->sq_check_overflow);
                         set_bit(0, &ctx->cq_check_overflow);
                 }
+               req->flags |= REQ_F_OVERFLOW;
                 refcount_inc(&req->refs);
                 req->result = res;
                 list_add_tail(&req->list, &ctx->cq_overflow_list);
@@ -1226,6 +1255,9 @@ static void __io_req_aux_free(struct io_kiocb *req)
  {
         struct io_ring_ctx *ctx = req->ctx;
  
+       if (req->flags & REQ_F_NEED_CLEANUP)
+               io_cleanup_req(req);
+
         kfree(req->io);
         if (req->file) {
                 if (req->flags & REQ_F_FIXED_FILE)
@@ -1446,10 +1478,10 @@ static void io_free_req(struct io_kiocb *req)
  __attribute__((nonnull))
  static void io_put_req_find_next(struct io_kiocb *req, struct io_kiocb **nxtptr)
  {
-       io_req_find_next(req, nxtptr);
-
-       if (refcount_dec_and_test(&req->refs))
+       if (refcount_dec_and_test(&req->refs)) {
+               io_req_find_next(req, nxtptr);
                 __io_free_req(req);
+       }
  }
  
  static void io_put_req(struct io_kiocb *req)
@@ -1635,11 +1667,17 @@ static void io_iopoll_reap_events(struct io_ring_ctx *ctx)
         mutex_unlock(&ctx->uring_lock);
  }
  
-static int __io_iopoll_check(struct io_ring_ctx *ctx, unsigned *nr_events,
-                           long min)
+static int io_iopoll_check(struct io_ring_ctx *ctx, unsigned *nr_events,
+                          long min)
  {
         int iters = 0, ret = 0;
  
+       /*
+        * We disallow the app entering submit/complete with polling, but we
+        * still need to lock the ring to prevent racing with polled issue
+        * that got punted to a workqueue.
+        */
+       mutex_lock(&ctx->uring_lock);
         do {
                 int tmin = 0;
  
@@ -1675,21 +1713,6 @@ static int __io_iopoll_check(struct io_ring_ctx *ctx, unsigned *nr_events,
                 ret = 0;
         } while (min && !*nr_events && !need_resched());
  
-       return ret;
-}
-
-static int io_iopoll_check(struct io_ring_ctx *ctx, unsigned *nr_events,
-                          long min)
-{
-       int ret;
-
-       /*
-        * We disallow the app entering submit/complete with polling, but we
-        * still need to lock the ring to prevent racing with polled issue
-        * that got punted to a workqueue.
-        */
-       mutex_lock(&ctx->uring_lock);
-       ret = __io_iopoll_check(ctx, nr_events, min);
         mutex_unlock(&ctx->uring_lock);
         return ret;
  }
@@ -1793,6 +1816,10 @@ static void io_iopoll_req_issued(struct io_kiocb *req)
                 list_add(&req->list, &ctx->poll_list);
         else
                 list_add_tail(&req->list, &ctx->poll_list);
+
+       if ((ctx->flags & IORING_SETUP_SQPOLL) &&
+           wq_has_sleeper(&ctx->sqo_wait))
+               wake_up(&ctx->sqo_wait);
  }
  
  static void io_file_put(struct io_submit_state *state)
@@ -2043,7 +2070,7 @@ static ssize_t io_import_iovec(int rw, struct io_kiocb *req,
                 ssize_t ret;
                 ret = import_single_range(rw, buf, sqe_len, *iovec, iter);
                 *iovec = NULL;
-               return ret;
+               return ret < 0 ? ret : sqe_len;
         }
  
         if (req->io) {
@@ -2056,9 +2083,6 @@ static ssize_t io_import_iovec(int rw, struct io_kiocb *req,
                 return iorw->size;
         }
  
-       if (!req->has_user)
-               return -EFAULT;
-
  #ifdef CONFIG_COMPAT
         if (req->ctx->compat)
                 return compat_import_iovec(rw, buf, sqe_len, UIO_FASTIOV,
@@ -2137,6 +2161,8 @@ static void io_req_map_rw(struct io_kiocb *req, ssize_t io_size,
                 req->io->rw.iov = req->io->rw.fast_iov;
                 memcpy(req->io->rw.iov, fast_iov,
                         sizeof(struct iovec) * iter->nr_segs);
+       } else {
+               req->flags |= REQ_F_NEED_CLEANUP;
         }
  }
  
@@ -2148,17 +2174,6 @@ static int io_alloc_async_ctx(struct io_kiocb *req)
         return req->io == NULL;
  }
  
-static void io_rw_async(struct io_wq_work **workptr)
-{
-       struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
-       struct iovec *iov = NULL;
-
-       if (req->io->rw.iov != req->io->rw.fast_iov)
-               iov = req->io->rw.iov;
-       io_wq_submit_work(workptr);
-       kfree(iov);
-}
-
  static int io_setup_async_rw(struct io_kiocb *req, ssize_t io_size,
                              struct iovec *iovec, struct iovec *fast_iov,
                              struct iov_iter *iter)
@@ -2171,7 +2186,6 @@ static int io_setup_async_rw(struct io_kiocb *req, ssize_t io_size,
  
                 io_req_map_rw(req, io_size, iovec, fast_iov, iter);
         }
-       req->work.func = io_rw_async;
         return 0;
  }
  
@@ -2189,7 +2203,8 @@ static int io_read_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe,
         if (unlikely(!(req->file->f_mode & FMODE_READ)))
                 return -EBADF;
  
-       if (!req->io)
+       /* either don't need iovec imported or already have it */
+       if (!req->io || req->flags & REQ_F_NEED_CLEANUP)
                 return 0;
  
         io = req->io;
@@ -2258,8 +2273,8 @@ copy_iov:
                 }
         }
  out_free:
-       if (!io_wq_current_is_worker())
-               kfree(iovec);
+       kfree(iovec);
+       req->flags &= ~REQ_F_NEED_CLEANUP;
         return ret;
  }
  
@@ -2277,7 +2292,8 @@ static int io_write_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe,
         if (unlikely(!(req->file->f_mode & FMODE_WRITE)))
                 return -EBADF;
  
-       if (!req->io)
+       /* either don't need iovec imported or already have it */
+       if (!req->io || req->flags & REQ_F_NEED_CLEANUP)
                 return 0;
  
         io = req->io;
@@ -2352,6 +2368,12 @@ static int io_write(struct io_kiocb *req, struct io_kiocb **nxt,
                         ret2 = call_write_iter(req->file, kiocb, &iter);
                 else
                         ret2 = loop_rw_iter(WRITE, req->file, kiocb, &iter);
+               /*
+                * Raw bdev writes will -EOPNOTSUPP for IOCB_NOWAIT. Just
+                * retry them without IOCB_NOWAIT.
+                */
+               if (ret2 == -EOPNOTSUPP && (kiocb->ki_flags & IOCB_NOWAIT))
+                       ret2 = -EAGAIN;
                 if (!force_nonblock || ret2 != -EAGAIN) {
                         kiocb_done(kiocb, ret2, nxt, req->in_async);
                 } else {
@@ -2364,8 +2386,8 @@ copy_iov:
                 }
         }
  out_free:
-       if (!io_wq_current_is_worker())
-               kfree(iovec);
+       req->flags &= ~REQ_F_NEED_CLEANUP;
+       kfree(iovec);
         return ret;
  }
  
@@ -2485,6 +2507,9 @@ static void io_fallocate_finish(struct io_wq_work **workptr)
         struct io_kiocb *nxt = NULL;
         int ret;
  
+       if (io_req_cancelled(req))
+               return;
+
         ret = vfs_fallocate(req->file, req->sync.mode, req->sync.off,
                                 req->sync.len);
         if (ret < 0)
@@ -2534,6 +2559,10 @@ static int io_openat_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
  
         if (sqe->ioprio || sqe->buf_index)
                 return -EINVAL;
+       if (sqe->flags & IOSQE_FIXED_FILE)
+               return -EBADF;
+       if (req->flags & REQ_F_NEED_CLEANUP)
+               return 0;
  
         req->open.dfd = READ_ONCE(sqe->fd);
         req->open.how.mode = READ_ONCE(sqe->len);
@@ -2547,6 +2576,7 @@ static int io_openat_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
                 return ret;
         }
  
+       req->flags |= REQ_F_NEED_CLEANUP;
         return 0;
  }
  
@@ -2559,6 +2589,10 @@ static int io_openat2_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
  
         if (sqe->ioprio || sqe->buf_index)
                 return -EINVAL;
+       if (sqe->flags & IOSQE_FIXED_FILE)
+               return -EBADF;
+       if (req->flags & REQ_F_NEED_CLEANUP)
+               return 0;
  
         req->open.dfd = READ_ONCE(sqe->fd);
         fname = u64_to_user_ptr(READ_ONCE(sqe->addr));
@@ -2583,6 +2617,7 @@ static int io_openat2_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
                 return ret;
         }
  
+       req->flags |= REQ_F_NEED_CLEANUP;
         return 0;
  }
  
@@ -2614,6 +2649,7 @@ static int io_openat2(struct io_kiocb *req, struct io_kiocb **nxt,
         }
  err:
         putname(req->open.filename);
+       req->flags &= ~REQ_F_NEED_CLEANUP;
         if (ret < 0)
                 req_set_fail_links(req);
         io_cqring_add_event(req, ret);
@@ -2754,6 +2790,10 @@ static int io_statx_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
  
         if (sqe->ioprio || sqe->buf_index)
                 return -EINVAL;
+       if (sqe->flags & IOSQE_FIXED_FILE)
+               return -EBADF;
+       if (req->flags & REQ_F_NEED_CLEANUP)
+               return 0;
  
         req->open.dfd = READ_ONCE(sqe->fd);
         req->open.mask = READ_ONCE(sqe->len);
@@ -2771,6 +2811,7 @@ static int io_statx_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
                 return ret;
         }
  
+       req->flags |= REQ_F_NEED_CLEANUP;
         return 0;
  }
  
@@ -2808,6 +2849,7 @@ retry:
                 ret = cp_statx(&stat, ctx->buffer);
  err:
         putname(ctx->filename);
+       req->flags &= ~REQ_F_NEED_CLEANUP;
         if (ret < 0)
                 req_set_fail_links(req);
         io_cqring_add_event(req, ret);
@@ -2827,7 +2869,7 @@ static int io_close_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
             sqe->rw_flags || sqe->buf_index)
                 return -EINVAL;
         if (sqe->flags & IOSQE_FIXED_FILE)
-               return -EINVAL;
+               return -EBADF;
  
         req->close.fd = READ_ONCE(sqe->fd);
         if (req->file->f_op == &io_uring_fops ||
@@ -2837,24 +2879,26 @@ static int io_close_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
         return 0;
  }
  
+/* only called when __close_fd_get_file() is done */
+static void __io_close_finish(struct io_kiocb *req, struct io_kiocb **nxt)
+{
+       int ret;
+
+       ret = filp_close(req->close.put_file, req->work.files);
+       if (ret < 0)
+               req_set_fail_links(req);
+       io_cqring_add_event(req, ret);
+       fput(req->close.put_file);
+       io_put_req_find_next(req, nxt);
+}
+
  static void io_close_finish(struct io_wq_work **workptr)
  {
         struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
         struct io_kiocb *nxt = NULL;
  
-       /* Invoked with files, we need to do the close */
-       if (req->work.files) {
-               int ret;
-
-               ret = filp_close(req->close.put_file, req->work.files);
-               if (ret < 0)
-                       req_set_fail_links(req);
-               io_cqring_add_event(req, ret);
-       }
-
-       fput(req->close.put_file);
-
-       io_put_req_find_next(req, &nxt);
+       /* not cancellable, don't do io_req_cancelled() */
+       __io_close_finish(req, &nxt);
         if (nxt)
                 io_wq_assign_next(workptr, nxt);
  }
@@ -2877,22 +2921,8 @@ static int io_close(struct io_kiocb *req, struct io_kiocb **nxt,
          * No ->flush(), safely close from here and just punt the
          * fput() to async context.
          */
-       ret = filp_close(req->close.put_file, current->files);
-
-       if (ret < 0)
-               req_set_fail_links(req);
-       io_cqring_add_event(req, ret);
-
-       if (io_wq_current_is_worker()) {
-               struct io_wq_work *old_work, *work;
-
-               old_work = work = &req->work;
-               io_close_finish(&work);
-               if (work && work != old_work)
-                       *nxt = container_of(work, struct io_kiocb, work);
-               return 0;
-       }
-
+       __io_close_finish(req, nxt);
+       return 0;
  eagain:
         req->work.func = io_close_finish;
         /*
@@ -2960,35 +2990,34 @@ static int io_sync_file_range(struct io_kiocb *req, struct io_kiocb **nxt,
         return 0;
  }
  
-#if defined(CONFIG_NET)
-static void io_sendrecv_async(struct io_wq_work **workptr)
-{
-       struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
-       struct iovec *iov = NULL;
-
-       if (req->io->rw.iov != req->io->rw.fast_iov)
-               iov = req->io->msg.iov;
-       io_wq_submit_work(workptr);
-       kfree(iov);
-}
-#endif
-
  static int io_sendmsg_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
  {
  #if defined(CONFIG_NET)
         struct io_sr_msg *sr = &req->sr_msg;
         struct io_async_ctx *io = req->io;
+       int ret;
  
         sr->msg_flags = READ_ONCE(sqe->msg_flags);
         sr->msg = u64_to_user_ptr(READ_ONCE(sqe->addr));
         sr->len = READ_ONCE(sqe->len);
  
+#ifdef CONFIG_COMPAT
+       if (req->ctx->compat)
+               sr->msg_flags |= MSG_CMSG_COMPAT;
+#endif
+
         if (!io || req->opcode == IORING_OP_SEND)
                 return 0;
+       /* iovec is already imported */
+       if (req->flags & REQ_F_NEED_CLEANUP)
+               return 0;
  
         io->msg.iov = io->msg.fast_iov;
-       return sendmsg_copy_msghdr(&io->msg.msg, sr->msg, sr->msg_flags,
+       ret = sendmsg_copy_msghdr(&io->msg.msg, sr->msg, sr->msg_flags,
                                         &io->msg.iov);
+       if (!ret)
+               req->flags |= REQ_F_NEED_CLEANUP;
+       return ret;
  #else
         return -EOPNOTSUPP;
  #endif
@@ -3008,12 +3037,11 @@ static int io_sendmsg(struct io_kiocb *req, struct io_kiocb **nxt,
         sock = sock_from_file(req->file, &ret);
         if (sock) {
                 struct io_async_ctx io;
-               struct sockaddr_storage addr;
                 unsigned flags;
  
                 if (req->io) {
                         kmsg = &req->io->msg;
-                       kmsg->msg.msg_name = &addr;
+                       kmsg->msg.msg_name = &req->io->msg.addr;
                         /* if iov is set, it's allocated already */
                         if (!kmsg->iov)
                                 kmsg->iov = kmsg->fast_iov;
@@ -3022,7 +3050,7 @@ static int io_sendmsg(struct io_kiocb *req, struct io_kiocb **nxt,
                         struct io_sr_msg *sr = &req->sr_msg;
  
                         kmsg = &io.msg;
-                       kmsg->msg.msg_name = &addr;
+                       kmsg->msg.msg_name = &io.msg.addr;
  
                         io.msg.iov = io.msg.fast_iov;
                         ret = sendmsg_copy_msghdr(&io.msg.msg, sr->msg,
@@ -3041,18 +3069,22 @@ static int io_sendmsg(struct io_kiocb *req, struct io_kiocb **nxt,
                 if (force_nonblock && ret == -EAGAIN) {
                         if (req->io)
                                 return -EAGAIN;
-                       if (io_alloc_async_ctx(req))
+                       if (io_alloc_async_ctx(req)) {
+                               if (kmsg->iov != kmsg->fast_iov)
+                                       kfree(kmsg->iov);
                                 return -ENOMEM;
+                       }
+                       req->flags |= REQ_F_NEED_CLEANUP;
                         memcpy(&req->io->msg, &io.msg, sizeof(io.msg));
-                       req->work.func = io_sendrecv_async;
                         return -EAGAIN;
                 }
                 if (ret == -ERESTARTSYS)
                         ret = -EINTR;
         }
  
-       if (!io_wq_current_is_worker() && kmsg && kmsg->iov != kmsg->fast_iov)
+       if (kmsg && kmsg->iov != kmsg->fast_iov)
                 kfree(kmsg->iov);
+       req->flags &= ~REQ_F_NEED_CLEANUP;
         io_cqring_add_event(req, ret);
         if (ret < 0)
                 req_set_fail_links(req);
@@ -3120,17 +3152,29 @@ static int io_recvmsg_prep(struct io_kiocb *req,
  #if defined(CONFIG_NET)
         struct io_sr_msg *sr = &req->sr_msg;
         struct io_async_ctx *io = req->io;
+       int ret;
  
         sr->msg_flags = READ_ONCE(sqe->msg_flags);
         sr->msg = u64_to_user_ptr(READ_ONCE(sqe->addr));
         sr->len = READ_ONCE(sqe->len);
  
+#ifdef CONFIG_COMPAT
+       if (req->ctx->compat)
+               sr->msg_flags |= MSG_CMSG_COMPAT;
+#endif
+
         if (!io || req->opcode == IORING_OP_RECV)
                 return 0;
+       /* iovec is already imported */
+       if (req->flags & REQ_F_NEED_CLEANUP)
+               return 0;
  
         io->msg.iov = io->msg.fast_iov;
-       return recvmsg_copy_msghdr(&io->msg.msg, sr->msg, sr->msg_flags,
+       ret = recvmsg_copy_msghdr(&io->msg.msg, sr->msg, sr->msg_flags,
                                         &io->msg.uaddr, &io->msg.iov);
+       if (!ret)
+               req->flags |= REQ_F_NEED_CLEANUP;
+       return ret;
  #else
         return -EOPNOTSUPP;
  #endif
@@ -3150,12 +3194,11 @@ static int io_recvmsg(struct io_kiocb *req, struct io_kiocb **nxt,
         sock = sock_from_file(req->file, &ret);
         if (sock) {
                 struct io_async_ctx io;
-               struct sockaddr_storage addr;
                 unsigned flags;
  
                 if (req->io) {
                         kmsg = &req->io->msg;
-                       kmsg->msg.msg_name = &addr;
+                       kmsg->msg.msg_name = &req->io->msg.addr;
                         /* if iov is set, it's allocated already */
                         if (!kmsg->iov)
                                 kmsg->iov = kmsg->fast_iov;
@@ -3164,7 +3207,7 @@ static int io_recvmsg(struct io_kiocb *req, struct io_kiocb **nxt,
                         struct io_sr_msg *sr = &req->sr_msg;
  
                         kmsg = &io.msg;
-                       kmsg->msg.msg_name = &addr;
+                       kmsg->msg.msg_name = &io.msg.addr;
  
                         io.msg.iov = io.msg.fast_iov;
                         ret = recvmsg_copy_msghdr(&io.msg.msg, sr->msg,
@@ -3185,18 +3228,22 @@ static int io_recvmsg(struct io_kiocb *req, struct io_kiocb **nxt,
                 if (force_nonblock && ret == -EAGAIN) {
                         if (req->io)
                                 return -EAGAIN;
-                       if (io_alloc_async_ctx(req))
+                       if (io_alloc_async_ctx(req)) {
+                               if (kmsg->iov != kmsg->fast_iov)
+                                       kfree(kmsg->iov);
                                 return -ENOMEM;
+                       }
                         memcpy(&req->io->msg, &io.msg, sizeof(io.msg));
-                       req->work.func = io_sendrecv_async;
+                       req->flags |= REQ_F_NEED_CLEANUP;
                         return -EAGAIN;
                 }
                 if (ret == -ERESTARTSYS)
                         ret = -EINTR;
         }
  
-       if (!io_wq_current_is_worker() && kmsg && kmsg->iov != kmsg->fast_iov)
+       if (kmsg && kmsg->iov != kmsg->fast_iov)
                 kfree(kmsg->iov);
+       req->flags &= ~REQ_F_NEED_CLEANUP;
         io_cqring_add_event(req, ret);
         if (ret < 0)
                 req_set_fail_links(req);
@@ -4207,6 +4254,35 @@ static int io_req_defer(struct io_kiocb *req, const struct io_uring_sqe *sqe)
         return -EIOCBQUEUED;
  }
  
+static void io_cleanup_req(struct io_kiocb *req)
+{
+       struct io_async_ctx *io = req->io;
+
+       switch (req->opcode) {
+       case IORING_OP_READV:
+       case IORING_OP_READ_FIXED:
+       case IORING_OP_READ:
+       case IORING_OP_WRITEV:
+       case IORING_OP_WRITE_FIXED:
+       case IORING_OP_WRITE:
+               if (io->rw.iov != io->rw.fast_iov)
+                       kfree(io->rw.iov);
+               break;
+       case IORING_OP_SENDMSG:
+       case IORING_OP_RECVMSG:
+               if (io->msg.iov != io->msg.fast_iov)
+                       kfree(io->msg.iov);
+               break;
+       case IORING_OP_OPENAT:
+       case IORING_OP_OPENAT2:
+       case IORING_OP_STATX:
+               putname(req->open.filename);
+               break;
+       }
+
+       req->flags &= ~REQ_F_NEED_CLEANUP;
+}
+
  static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
                         struct io_kiocb **nxt, bool force_nonblock)
  {
@@ -4446,7 +4522,6 @@ static void io_wq_submit_work(struct io_wq_work **workptr)
         }
  
         if (!ret) {
-               req->has_user = (work->flags & IO_WQ_WORK_HAS_MM) != 0;
                 req->in_async = true;
                 do {
                         ret = io_issue_sqe(req, NULL, &nxt, false);
@@ -4479,7 +4554,7 @@ static int io_req_needs_file(struct io_kiocb *req, int fd)
  {
         if (!io_op_defs[req->opcode].needs_file)
                 return 0;
-       if (fd == -1 && io_op_defs[req->opcode].fd_non_neg)
+       if ((fd == -1 || fd == AT_FDCWD) && io_op_defs[req->opcode].fd_non_neg)
                 return 0;
         return 1;
  }
@@ -4639,11 +4714,21 @@ static void __io_queue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe)
  {
         struct io_kiocb *linked_timeout;
         struct io_kiocb *nxt = NULL;
+       const struct cred *old_creds = NULL;
         int ret;
  
  again:
         linked_timeout = io_prep_linked_timeout(req);
  
+       if (req->work.creds && req->work.creds != current_cred()) {
+               if (old_creds)
+                       revert_creds(old_creds);
+               if (old_creds == req->work.creds)
+                       old_creds = NULL; /* restored original creds */
+               else
+                       old_creds = override_creds(req->work.creds);
+       }
+
         ret = io_issue_sqe(req, sqe, &nxt, true);
  
         /*
@@ -4669,7 +4754,7 @@ punt:
  
  err:
         /* drop submission reference */
-       io_put_req(req);
+       io_put_req_find_next(req, &nxt);
  
         if (linked_timeout) {
                 if (!ret)
@@ -4693,6 +4778,8 @@ done_req:
                         goto punt;
                 goto again;
         }
+       if (old_creds)
+               revert_creds(old_creds);
  }
  
  static void io_queue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe)
@@ -4737,7 +4824,6 @@ static inline void io_queue_link_head(struct io_kiocb *req)
  static bool io_submit_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
                           struct io_submit_state *state, struct io_kiocb **link)
  {
-       const struct cred *old_creds = NULL;
         struct io_ring_ctx *ctx = req->ctx;
         unsigned int sqe_flags;
         int ret, id;
@@ -4752,14 +4838,12 @@ static bool io_submit_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
  
         id = READ_ONCE(sqe->personality);
         if (id) {
-               const struct cred *personality_creds;
-
-               personality_creds = idr_find(&ctx->personality_idr, id);
-               if (unlikely(!personality_creds)) {
+               req->work.creds = idr_find(&ctx->personality_idr, id);
+               if (unlikely(!req->work.creds)) {
                         ret = -EINVAL;
                         goto err_req;
                 }
-               old_creds = override_creds(personality_creds);
+               get_cred(req->work.creds);
         }
  
         /* same numerical values with corresponding REQ_F_*, safe to copy */
@@ -4771,8 +4855,6 @@ static bool io_submit_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
  err_req:
                 io_cqring_add_event(req, ret);
                 io_double_put_req(req);
-               if (old_creds)
-                       revert_creds(old_creds);
                 return false;
         }
  
@@ -4833,8 +4915,6 @@ err_req:
                 }
         }
  
-       if (old_creds)
-               revert_creds(old_creds);
         return true;
  }
  
@@ -4950,6 +5030,7 @@ static int io_submit_sqes(struct io_ring_ctx *ctx, unsigned int nr,
         for (i = 0; i < nr; i++) {
                 const struct io_uring_sqe *sqe;
                 struct io_kiocb *req;
+               int err;
  
                 req = io_get_req(ctx, statep);
                 if (unlikely(!req)) {
@@ -4966,20 +5047,23 @@ static int io_submit_sqes(struct io_ring_ctx *ctx, unsigned int nr,
                 submitted++;
  
                 if (unlikely(req->opcode >= IORING_OP_LAST)) {
-                       io_cqring_add_event(req, -EINVAL);
+                       err = -EINVAL;
+fail_req:
+                       io_cqring_add_event(req, err);
                         io_double_put_req(req);
                         break;
                 }
  
                 if (io_op_defs[req->opcode].needs_mm && !*mm) {
                         mm_fault = mm_fault || !mmget_not_zero(ctx->sqo_mm);
-                       if (!mm_fault) {
-                               use_mm(ctx->sqo_mm);
-                               *mm = ctx->sqo_mm;
+                       if (unlikely(mm_fault)) {
+                               err = -EFAULT;
+                               goto fail_req;
                         }
+                       use_mm(ctx->sqo_mm);
+                       *mm = ctx->sqo_mm;
                 }
  
-               req->has_user = *mm != NULL;
                 req->in_async = async;
                 req->needs_fixed_file = async;
                 trace_io_uring_submit_sqe(ctx, req->opcode, req->user_data,
@@ -5011,9 +5095,8 @@ static int io_sq_thread(void *data)
         const struct cred *old_cred;
         mm_segment_t old_fs;
         DEFINE_WAIT(wait);
-       unsigned inflight;
         unsigned long timeout;
-       int ret;
+       int ret = 0;
  
         complete(&ctx->completions[1]);
  
@@ -5021,39 +5104,19 @@ static int io_sq_thread(void *data)
         set_fs(USER_DS);
         old_cred = override_creds(ctx->creds);
  
-       ret = timeout = inflight = 0;
+       timeout = jiffies + ctx->sq_thread_idle;
         while (!kthread_should_park()) {
                 unsigned int to_submit;
  
-               if (inflight) {
+               if (!list_empty(&ctx->poll_list)) {
                         unsigned nr_events = 0;
  
-                       if (ctx->flags & IORING_SETUP_IOPOLL) {
-                               /*
-                                * inflight is the count of the maximum possible
-                                * entries we submitted, but it can be smaller
-                                * if we dropped some of them. If we don't have
-                                * poll entries available, then we know that we
-                                * have nothing left to poll for. Reset the
-                                * inflight count to zero in that case.
-                                */
-                               mutex_lock(&ctx->uring_lock);
-                               if (!list_empty(&ctx->poll_list))
-                                       __io_iopoll_check(ctx, &nr_events, 0);
-                               else
-                                       inflight = 0;
-                               mutex_unlock(&ctx->uring_lock);
-                       } else {
-                               /*
-                                * Normal IO, just pretend everything completed.
-                                * We don't have to poll completions for that.
-                                */
-                               nr_events = inflight;
-                       }
-
-                       inflight -= nr_events;
-                       if (!inflight)
+                       mutex_lock(&ctx->uring_lock);
+                       if (!list_empty(&ctx->poll_list))
+                               io_iopoll_getevents(ctx, &nr_events, 0);
+                       else
                                 timeout = jiffies + ctx->sq_thread_idle;
+                       mutex_unlock(&ctx->uring_lock);
                 }
  
                 to_submit = io_sqring_entries(ctx);
@@ -5063,6 +5126,18 @@ static int io_sq_thread(void *data)
                  * to enter the kernel to reap and flush events.
                  */
                 if (!to_submit || ret == -EBUSY) {
+                       /*
+                        * Drop cur_mm before scheduling, we can't hold it for
+                        * long periods (or over schedule()). Do this before
+                        * adding ourselves to the waitqueue, as the unuse/drop
+                        * may sleep.
+                        */
+                       if (cur_mm) {
+                               unuse_mm(cur_mm);
+                               mmput(cur_mm);
+                               cur_mm = NULL;
+                       }
+
                         /*
                          * We're polling. If we're within the defined idle
                          * period, then let us spin without work before going
@@ -5070,28 +5145,29 @@ static int io_sq_thread(void *data)
                          * more IO, we should wait for the application to
                          * reap events and wake us up.
                          */
-                       if (inflight ||
+                       if (!list_empty(&ctx->poll_list) ||
                             (!time_after(jiffies, timeout) && ret != -EBUSY &&
                             !percpu_ref_is_dying(&ctx->refs))) {
                                 cond_resched();
                                 continue;
                         }
  
+                       prepare_to_wait(&ctx->sqo_wait, &wait,
+                                               TASK_INTERRUPTIBLE);
+
                         /*
-                        * Drop cur_mm before scheduling, we can't hold it for
-                        * long periods (or over schedule()). Do this before
-                        * adding ourselves to the waitqueue, as the unuse/drop
-                        * may sleep.
+                        * While doing polled IO, before going to sleep, we need
+                        * to check if there are new reqs added to poll_list, it
+                        * is because reqs may have been punted to io worker and
+                        * will be added to poll_list later, hence check the
+                        * poll_list again.
                          */
-                       if (cur_mm) {
-                               unuse_mm(cur_mm);
-                               mmput(cur_mm);
-                               cur_mm = NULL;
+                       if ((ctx->flags & IORING_SETUP_IOPOLL) &&
+                           !list_empty_careful(&ctx->poll_list)) {
+                               finish_wait(&ctx->sqo_wait, &wait);
+                               continue;
                         }
  
-                       prepare_to_wait(&ctx->sqo_wait, &wait,
-                                               TASK_INTERRUPTIBLE);
-
                         /* Tell userspace we may need a wakeup call */
                         ctx->rings->sq_flags |= IORING_SQ_NEED_WAKEUP;
                         /* make sure to read SQ tail after writing flags */
@@ -5119,8 +5195,7 @@ static int io_sq_thread(void *data)
                 mutex_lock(&ctx->uring_lock);
                 ret = io_submit_sqes(ctx, to_submit, NULL, -1, &cur_mm, true);
                 mutex_unlock(&ctx->uring_lock);
-               if (ret > 0)
-                       inflight += ret;
+               timeout = jiffies + ctx->sq_thread_idle;
         }
  
         set_fs(old_fs);
@@ -5525,7 +5600,6 @@ static void io_ring_file_ref_switch(struct work_struct *work)
  
         data = container_of(work, struct fixed_file_data, ref_work);
         io_ring_file_ref_flush(data);
-       percpu_ref_get(&data->refs);
         percpu_ref_switch_to_percpu(&data->refs);
  }
  
@@ -5701,8 +5775,13 @@ static void io_atomic_switch(struct percpu_ref *ref)
  {
         struct fixed_file_data *data;
  
+       /*
+        * Juggle reference to ensure we hit zero, if needed, so we can
+        * switch back to percpu mode
+        */
         data = container_of(ref, struct fixed_file_data, refs);
-       clear_bit(FFD_F_ATOMIC, &data->state);
+       percpu_ref_put(&data->refs);
+       percpu_ref_get(&data->refs);
  }
  
  static bool io_queue_file_removal(struct fixed_file_data *data,
@@ -5725,11 +5804,7 @@ static bool io_queue_file_removal(struct fixed_file_data *data,
         llist_add(&pfile->llist, &data->put_llist);
  
         if (pfile == &pfile_stack) {
-               if (!test_and_set_bit(FFD_F_ATOMIC, &data->state)) {
-                       percpu_ref_put(&data->refs);
-                       percpu_ref_switch_to_atomic(&data->refs,
-                                                       io_atomic_switch);
-               }
+               percpu_ref_switch_to_atomic(&data->refs, io_atomic_switch);
                 wait_for_completion(&done);
                 flush_work(&data->ref_work);
                 return false;
@@ -5803,10 +5878,8 @@ static int __io_sqe_files_update(struct io_ring_ctx *ctx,
                 up->offset++;
         }
  
-       if (ref_switch && !test_and_set_bit(FFD_F_ATOMIC, &data->state)) {
-               percpu_ref_put(&data->refs);
+       if (ref_switch)
                 percpu_ref_switch_to_atomic(&data->refs, io_atomic_switch);
-       }
  
         return done ? done : err;
  }
@@ -6264,6 +6337,7 @@ static void io_ring_ctx_free(struct io_ring_ctx *ctx)
         io_sqe_buffer_unregister(ctx);
         io_sqe_files_unregister(ctx);
         io_eventfd_unregister(ctx);
+       idr_destroy(&ctx->personality_idr);
  
  #if defined(CONFIG_UNIX)
         if (ctx->ring_sock) {
@@ -6301,7 +6375,7 @@ static __poll_t io_uring_poll(struct file *file, poll_table *wait)
         if (READ_ONCE(ctx->rings->sq.tail) - ctx->cached_sq_head !=
             ctx->rings->sq_ring_entries)
                 mask |= EPOLLOUT | EPOLLWRNORM;
-       if (READ_ONCE(ctx->rings->cq.head) != ctx->cached_cq_tail)
+       if (io_cqring_events(ctx, false))
                 mask |= EPOLLIN | EPOLLRDNORM;
  
         return mask;
@@ -6393,6 +6467,29 @@ static void io_uring_cancel_files(struct io_ring_ctx *ctx,
                 if (!cancel_req)
                         break;
  
+               if (cancel_req->flags & REQ_F_OVERFLOW) {
+                       spin_lock_irq(&ctx->completion_lock);
+                       list_del(&cancel_req->list);
+                       cancel_req->flags &= ~REQ_F_OVERFLOW;
+                       if (list_empty(&ctx->cq_overflow_list)) {
+                               clear_bit(0, &ctx->sq_check_overflow);
+                               clear_bit(0, &ctx->cq_check_overflow);
+                       }
+                       spin_unlock_irq(&ctx->completion_lock);
+
+                       WRITE_ONCE(ctx->rings->cq_overflow,
+                               atomic_inc_return(&ctx->cached_cq_overflow));
+
+                       /*
+                        * Put inflight ref and overflow ref. If that's
+                        * all we had, then we're done with this request.
+                        */
+                       if (refcount_sub_and_test(2, &cancel_req->refs)) {
+                               io_put_req(cancel_req);
+                               continue;
+                       }
+               }
+
                 io_wq_cancel_work(ctx->io_wq, &cancel_req->work);
                 io_put_req(cancel_req);
                 schedule();
@@ -6405,6 +6502,13 @@ static int io_uring_flush(struct file *file, void *data)
         struct io_ring_ctx *ctx = file->private_data;
  
         io_uring_cancel_files(ctx, data);
+
+       /*
+        * If the task is going away, cancel work it may have pending
+        */
+       if (fatal_signal_pending(current) || (current->flags & PF_EXITING))
+               io_wq_cancel_pid(ctx->io_wq, task_pid_vnr(current));
+
         return 0;
  }
  
@@ -6547,6 +6651,7 @@ out_fput:
         return submitted ? submitted : ret;
  }
  
+#ifdef CONFIG_PROC_FS
  static int io_uring_show_cred(int id, void *p, void *data)
  {
         const struct cred *cred = p;
@@ -6620,6 +6725,7 @@ static void io_uring_show_fdinfo(struct seq_file *m, struct file *f)
                 percpu_ref_put(&ctx->refs);
         }
  }
+#endif
  
  static const struct file_operations io_uring_fops = {
         .release        = io_uring_release,
@@ -6631,7 +6737,9 @@ static const struct file_operations io_uring_fops = {
  #endif
         .poll           = io_uring_poll,
         .fasync         = io_uring_fasync,
+#ifdef CONFIG_PROC_FS
         .show_fdinfo    = io_uring_show_fdinfo,
+#endif
  };
  
  static int io_allocate_scq_urings(struct io_ring_ctx *ctx,
diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c

index 2494095e0340b6c30c450a01c575304b3f4fa4f3..27373f5792a4f7a5a846742cb44e624ad56f5377 100644
--- a/fs/jbd2/commit.c
+++ b/fs/jbd2/commit.c
@@ -976,29 +976,33 @@ restart_loop:
                  * it. */
  
                 /*
-               * A buffer which has been freed while still being journaled by
-               * a previous transaction.
-               */
-               if (buffer_freed(bh)) {
+                * A buffer which has been freed while still being journaled
+                * by a previous transaction, refile the buffer to BJ_Forget of
+                * the running transaction. If the just committed transaction
+                * contains "add to orphan" operation, we can completely
+                * invalidate the buffer now. We are rather thorough in that
+                * since the buffer may be still accessible when blocksize <
+                * pagesize and it is attached to the last partial page.
+                */
+               if (buffer_freed(bh) && !jh->b_next_transaction) {
+                       struct address_space *mapping;
+
+                       clear_buffer_freed(bh);
+                       clear_buffer_jbddirty(bh);
+
                         /*
-                        * If the running transaction is the one containing
-                        * "add to orphan" operation (b_next_transaction !=
-                        * NULL), we have to wait for that transaction to
-                        * commit before we can really get rid of the buffer.
-                        * So just clear b_modified to not confuse transaction
-                        * credit accounting and refile the buffer to
-                        * BJ_Forget of the running transaction. If the just
-                        * committed transaction contains "add to orphan"
-                        * operation, we can completely invalidate the buffer
-                        * now. We are rather through in that since the
-                        * buffer may be still accessible when blocksize <
-                        * pagesize and it is attached to the last partial
-                        * page.
+                        * Block device buffers need to stay mapped all the
+                        * time, so it is enough to clear buffer_jbddirty and
+                        * buffer_freed bits. For the file mapping buffers (i.e.
+                        * journalled data) we need to unmap buffer and clear
+                        * more bits. We also need to be careful about the check
+                        * because the data page mapping can get cleared under
+                        * our hands, which also means we need not clear more bits
+                        * because the page and buffers will be freed and can
+                        * never be reused once we are done with them.
                          */
-                       jh->b_modified = 0;
-                       if (!jh->b_next_transaction) {
-                               clear_buffer_freed(bh);
-                               clear_buffer_jbddirty(bh);
+                       mapping = READ_ONCE(bh->b_page->mapping);
+                       if (mapping && !sb_is_blkdev_sb(mapping->host->i_sb)) {
                                 clear_buffer_mapped(bh);
                                 clear_buffer_new(bh);
                                 clear_buffer_req(bh);
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c

index e77a5a0b4e46e567336ab894308e1420499576fa..3dccc23cf0102337398c2c7ec99a31aa4e45b192 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -936,8 +936,6 @@ do_get_write_access(handle_t *handle, struct journal_head *jh,
         char *frozen_buffer = NULL;
         unsigned long start_lock, time_lock;
  
-       if (is_handle_aborted(handle))
-               return -EROFS;
         journal = transaction->t_journal;
  
         jbd_debug(5, "journal_head %p, force_copy %d\n", jh, force_copy);
@@ -1152,8 +1150,8 @@ static bool jbd2_write_access_granted(handle_t *handle, struct buffer_head *bh,
         /* For undo access buffer must have data copied */
         if (undo && !jh->b_committed_data)
                 goto out;
-       if (jh->b_transaction != handle->h_transaction &&
-           jh->b_next_transaction != handle->h_transaction)
+       if (READ_ONCE(jh->b_transaction) != handle->h_transaction &&
+           READ_ONCE(jh->b_next_transaction) != handle->h_transaction)
                 goto out;
         /*
          * There are two reasons for the barrier here:
@@ -1189,6 +1187,9 @@ int jbd2_journal_get_write_access(handle_t *handle, struct buffer_head *bh)
         struct journal_head *jh;
         int rc;
  
+       if (is_handle_aborted(handle))
+               return -EROFS;
+
         if (jbd2_write_access_granted(handle, bh, false))
                 return 0;
  
@@ -1326,6 +1327,9 @@ int jbd2_journal_get_undo_access(handle_t *handle, struct buffer_head *bh)
         struct journal_head *jh;
         char *committed_data = NULL;
  
+       if (is_handle_aborted(handle))
+               return -EROFS;
+
         if (jbd2_write_access_granted(handle, bh, true))
                 return 0;
  
@@ -2329,14 +2333,16 @@ static int journal_unmap_buffer(journal_t *journal, struct buffer_head *bh,
                         return -EBUSY;
                 }
                 /*
-                * OK, buffer won't be reachable after truncate. We just set
-                * j_next_transaction to the running transaction (if there is
-                * one) and mark buffer as freed so that commit code knows it
-                * should clear dirty bits when it is done with the buffer.
+                * OK, buffer won't be reachable after truncate. We just clear
+                * b_modified to not confuse transaction credit accounting, and
+                * set b_next_transaction to the running transaction (if there
+                * is one) and mark buffer as freed so that commit code knows
+                * it should clear dirty bits when it is done with the buffer.
                  */
                 set_buffer_freed(bh);
                 if (journal->j_running_transaction && buffer_jbddirty(bh))
                         jh->b_next_transaction = journal->j_running_transaction;
+               jh->b_modified = 0;
                 spin_unlock(&journal->j_list_lock);
                 spin_unlock(&jh->b_state_lock);
                 write_unlock(&journal->j_state_lock);
@@ -2563,8 +2569,8 @@ bool __jbd2_journal_refile_buffer(struct journal_head *jh)
          * our jh reference and thus __jbd2_journal_file_buffer() must not
          * take a new one.
          */
-       jh->b_transaction = jh->b_next_transaction;
-       jh->b_next_transaction = NULL;
+       WRITE_ONCE(jh->b_transaction, jh->b_next_transaction);
+       WRITE_ONCE(jh->b_next_transaction, NULL);
         if (buffer_freed(bh))
                 jlist = BJ_Forget;
         else if (jh->b_modified)
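
The fs/jbd2/transaction.c hunks above pair WRITE_ONCE() in __jbd2_journal_refile_buffer() with READ_ONCE() in jbd2_write_access_granted(), so the lockless reader sees each pointer through a single, untorn access. A minimal userspace sketch of that pairing follows. Everything with a `_sketch` suffix is hypothetical: the macros are simplified volatile-cast approximations (using the GNU `__typeof__` extension), not the kernel's definitions, and the struct only mirrors the two fields involved.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified approximations of the kernel macros (assumption: GCC/Clang). */
#define READ_ONCE_SKETCH(x)     (*(volatile __typeof__(x) *)&(x))
#define WRITE_ONCE_SKETCH(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical stand-in for the two journal_head fields touched above. */
struct jh_sketch {
	void *b_transaction;
	void *b_next_transaction;
};

/* Writer side: move the buffer to its next transaction. */
static void refile_sketch(struct jh_sketch *jh)
{
	WRITE_ONCE_SKETCH(jh->b_transaction, jh->b_next_transaction);
	WRITE_ONCE_SKETCH(jh->b_next_transaction, NULL);
}

/* Reader side: only the pointer comparison from the lockless check. */
static int access_granted_sketch(struct jh_sketch *jh, void *handle_txn)
{
	return READ_ONCE_SKETCH(jh->b_transaction) == handle_txn ||
	       READ_ONCE_SKETCH(jh->b_next_transaction) == handle_txn;
}

static void once_selfcheck(void)
{
	int running, next;
	struct jh_sketch jh = { &running, &next };

	assert(access_granted_sketch(&jh, &next));
	refile_sketch(&jh);
	assert(jh.b_transaction == (void *)&next);
}
```

The sketch only shows the access pattern; it makes no claim about the memory barriers or locks the real code relies on.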
diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c

index 4a841071d8a71dc3a04c89ee44606e6324787acd..1865322de142d0a656e13ac99106fa554bc3d95a 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -42,13 +42,27 @@ static void nfs_mark_delegation_revoked(struct nfs_delegation *delegation)
         if (!test_and_set_bit(NFS_DELEGATION_REVOKED, &delegation->flags)) {
                 delegation->stateid.type = NFS4_INVALID_STATEID_TYPE;
                 atomic_long_dec(&nfs_active_delegations);
+               if (!test_bit(NFS_DELEGATION_RETURNING, &delegation->flags))
+                       nfs_clear_verifier_delegated(delegation->inode);
         }
  }
  
+static struct nfs_delegation *nfs_get_delegation(struct nfs_delegation *delegation)
+{
+       refcount_inc(&delegation->refcount);
+       return delegation;
+}
+
+static void nfs_put_delegation(struct nfs_delegation *delegation)
+{
+       if (refcount_dec_and_test(&delegation->refcount))
+               __nfs_free_delegation(delegation);
+}
+
  static void nfs_free_delegation(struct nfs_delegation *delegation)
  {
         nfs_mark_delegation_revoked(delegation);
-       __nfs_free_delegation(delegation);
+       nfs_put_delegation(delegation);
  }
  
  /**
@@ -241,13 +255,18 @@ void nfs_inode_reclaim_delegation(struct inode *inode, const struct cred *cred,
  
  static int nfs_do_return_delegation(struct inode *inode, struct nfs_delegation *delegation, int issync)
  {
+       const struct cred *cred;
         int res = 0;
  
-       if (!test_bit(NFS_DELEGATION_REVOKED, &delegation->flags))
-               res = nfs4_proc_delegreturn(inode,
-                               delegation->cred,
+       if (!test_bit(NFS_DELEGATION_REVOKED, &delegation->flags)) {
+               spin_lock(&delegation->lock);
+               cred = get_cred(delegation->cred);
+               spin_unlock(&delegation->lock);
+               res = nfs4_proc_delegreturn(inode, cred,
                                 &delegation->stateid,
                                 issync);
+               put_cred(cred);
+       }
         return res;
  }
  
@@ -273,9 +292,13 @@ nfs_start_delegation_return_locked(struct nfs_inode *nfsi)
         if (delegation == NULL)
                 goto out;
         spin_lock(&delegation->lock);
-       if (!test_and_set_bit(NFS_DELEGATION_RETURNING, &delegation->flags))
-               ret = delegation;
+       if (!test_and_set_bit(NFS_DELEGATION_RETURNING, &delegation->flags)) {
+               /* Refcount matched in nfs_end_delegation_return() */
+               ret = nfs_get_delegation(delegation);
+       }
         spin_unlock(&delegation->lock);
+       if (ret)
+               nfs_clear_verifier_delegated(&nfsi->vfs_inode);
  out:
         return ret;
  }
@@ -393,6 +416,7 @@ int nfs_inode_set_delegation(struct inode *inode, const struct cred *cred,
         if (delegation == NULL)
                 return -ENOMEM;
         nfs4_stateid_copy(&delegation->stateid, stateid);
+       refcount_set(&delegation->refcount, 1);
         delegation->type = type;
         delegation->pagemod_limit = pagemod_limit;
         delegation->change_attr = inode_peek_iversion_raw(inode);
@@ -492,6 +516,8 @@ static int nfs_end_delegation_return(struct inode *inode, struct nfs_delegation
  
         err = nfs_do_return_delegation(inode, delegation, issync);
  out:
+       /* Refcount matched in nfs_start_delegation_return_locked() */
+       nfs_put_delegation(delegation);
         return err;
  }
  
@@ -686,9 +712,12 @@ void nfs4_inode_return_delegation_on_close(struct inode *inode)
                     list_empty(&NFS_I(inode)->open_files) &&
                     !test_and_set_bit(NFS_DELEGATION_RETURNING, &delegation->flags)) {
                         clear_bit(NFS_DELEGATION_RETURN_IF_CLOSED, &delegation->flags);
-                       ret = delegation;
+                       /* Refcount matched in nfs_end_delegation_return() */
+                       ret = nfs_get_delegation(delegation);
                 }
                 spin_unlock(&delegation->lock);
+               if (ret)
+                       nfs_clear_verifier_delegated(inode);
         }
  out:
         rcu_read_unlock();
@@ -1088,10 +1117,11 @@ restart:
                         delegation = nfs_start_delegation_return_locked(NFS_I(inode));
                         rcu_read_unlock();
                         if (delegation != NULL) {
-                               delegation = nfs_detach_delegation(NFS_I(inode),
-                                       delegation, server);
-                               if (delegation != NULL)
+                               if (nfs_detach_delegation(NFS_I(inode), delegation,
+                                                       server) != NULL)
                                         nfs_free_delegation(delegation);
+                               /* Match nfs_start_delegation_return_locked */
+                               nfs_put_delegation(delegation);
                         }
                         iput(inode);
                         nfs_sb_deactive(server->super);
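
The fs/nfs/delegation.c hunks above add a reference count to struct nfs_delegation: it starts at 1, each successful "start return" takes an extra reference, and the matching put drops it, so the structure is freed only when the last holder lets go. A minimal userspace sketch of that get/put pairing, using C11 atomics in place of the kernel's refcount_t. The `deleg_*` names and the `freed` flag are hypothetical stand-ins, not kernel code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the refcounted delegation. */
struct deleg_sketch {
	atomic_uint refcount;
	bool freed;		/* stands in for __nfs_free_delegation() */
};

static void deleg_init(struct deleg_sketch *d)
{
	atomic_init(&d->refcount, 1);	/* mirrors refcount_set(..., 1) */
	d->freed = false;
}

static struct deleg_sketch *deleg_get(struct deleg_sketch *d)
{
	atomic_fetch_add(&d->refcount, 1);
	return d;
}

static void deleg_put(struct deleg_sketch *d)
{
	/* fetch_sub returns the old value; 1 means this was the last ref. */
	if (atomic_fetch_sub(&d->refcount, 1) == 1)
		d->freed = true;
}

static void deleg_selfcheck(void)
{
	struct deleg_sketch d;

	deleg_init(&d);
	deleg_get(&d);		/* "start return" takes a reference */
	deleg_put(&d);		/* the free path drops the initial one */
	assert(!d.freed);	/* the returner still holds the object */
	deleg_put(&d);		/* "end return" drops the last one */
	assert(d.freed);
}
```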
diff --git a/fs/nfs/delegation.h b/fs/nfs/delegation.h

index 31b84604d3836c44bca2377e454efba231eb3506..9b00a0b7f8321fab75eeb6b732e9578b54086010 100644
--- a/fs/nfs/delegation.h
+++ b/fs/nfs/delegation.h
@@ -22,6 +22,7 @@ struct nfs_delegation {
         unsigned long pagemod_limit;
         __u64 change_attr;
         unsigned long flags;
+       refcount_t refcount;
         spinlock_t lock;
         struct rcu_head rcu;
  };
diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c

index 1320288ff9ec9c7d50d207908d77f3899c7def3b..193d6fb363b7434ae629fc48992812a198b47ec3 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -155,6 +155,7 @@ typedef struct {
         loff_t          current_index;
         decode_dirent_t decode;
  
+       unsigned long   dir_verifier;
         unsigned long   timestamp;
         unsigned long   gencount;
         unsigned int    cache_entry_index;
@@ -353,6 +354,7 @@ int nfs_readdir_xdr_filler(struct page **pages, nfs_readdir_descriptor_t *desc,
   again:
         timestamp = jiffies;
         gencount = nfs_inc_attr_generation_counter();
+       desc->dir_verifier = nfs_save_change_attribute(inode);
         error = NFS_PROTO(inode)->readdir(file_dentry(file), cred, entry->cookie, pages,
                                           NFS_SERVER(inode)->dtsize, desc->plus);
         if (error < 0) {
@@ -455,13 +457,13 @@ void nfs_force_use_readdirplus(struct inode *dir)
  }
  
  static
-void nfs_prime_dcache(struct dentry *parent, struct nfs_entry *entry)
+void nfs_prime_dcache(struct dentry *parent, struct nfs_entry *entry,
+               unsigned long dir_verifier)
  {
         struct qstr filename = QSTR_INIT(entry->name, entry->len);
         DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);
         struct dentry *dentry;
         struct dentry *alias;
-       struct inode *dir = d_inode(parent);
         struct inode *inode;
         int status;
  
@@ -500,7 +502,7 @@ again:
                 if (nfs_same_file(dentry, entry)) {
                         if (!entry->fh->size)
                                 goto out;
-                       nfs_set_verifier(dentry, nfs_save_change_attribute(dir));
+                       nfs_set_verifier(dentry, dir_verifier);
                         status = nfs_refresh_inode(d_inode(dentry), entry->fattr);
                         if (!status)
                                 nfs_setsecurity(d_inode(dentry), entry->fattr, entry->label);
@@ -526,7 +528,7 @@ again:
                 dput(dentry);
                 dentry = alias;
         }
-       nfs_set_verifier(dentry, nfs_save_change_attribute(dir));
+       nfs_set_verifier(dentry, dir_verifier);
  out:
         dput(dentry);
  }
@@ -564,7 +566,8 @@ int nfs_readdir_page_filler(nfs_readdir_descriptor_t *desc, struct nfs_entry *en
                 count++;
  
                 if (desc->plus)
-                       nfs_prime_dcache(file_dentry(desc->file), entry);
+                       nfs_prime_dcache(file_dentry(desc->file), entry,
+                                       desc->dir_verifier);
  
                 status = nfs_readdir_add_to_array(entry, page);
                 if (status != 0)
@@ -983,14 +986,113 @@ static int nfs_fsync_dir(struct file *filp, loff_t start, loff_t end,
   * full lookup on all child dentries of 'dir' whenever a change occurs
   * on the server that might have invalidated our dcache.
   *
+ * Note that we reserve bit '0' as a tag to let us know when a dentry
+ * was revalidated while holding a delegation on its inode.
+ *
   * The caller should be holding dir->i_lock
   */
  void nfs_force_lookup_revalidate(struct inode *dir)
  {
-       NFS_I(dir)->cache_change_attribute++;
+       NFS_I(dir)->cache_change_attribute += 2;
  }
  EXPORT_SYMBOL_GPL(nfs_force_lookup_revalidate);
  
+/**
+ * nfs_verify_change_attribute - Detects NFS remote directory changes
+ * @dir: pointer to parent directory inode
+ * @verf: previously saved change attribute
+ *
+ * Return "false" if the verifier doesn't match the change attribute.
+ * This would usually indicate that the directory contents have changed on
+ * the server, and that any dentries need revalidating.
+ */
+static bool nfs_verify_change_attribute(struct inode *dir, unsigned long verf)
+{
+       return (verf & ~1UL) == nfs_save_change_attribute(dir);
+}
+
+static void nfs_set_verifier_delegated(unsigned long *verf)
+{
+       *verf |= 1UL;
+}
+
+#if IS_ENABLED(CONFIG_NFS_V4)
+static void nfs_unset_verifier_delegated(unsigned long *verf)
+{
+       *verf &= ~1UL;
+}
+#endif /* IS_ENABLED(CONFIG_NFS_V4) */
+
+static bool nfs_test_verifier_delegated(unsigned long verf)
+{
+       return verf & 1;
+}
+
+static bool nfs_verifier_is_delegated(struct dentry *dentry)
+{
+       return nfs_test_verifier_delegated(dentry->d_time);
+}
+
+static void nfs_set_verifier_locked(struct dentry *dentry, unsigned long verf)
+{
+       struct inode *inode = d_inode(dentry);
+
+       if (!nfs_verifier_is_delegated(dentry) &&
+           !nfs_verify_change_attribute(d_inode(dentry->d_parent), verf))
+               goto out;
+       if (inode && NFS_PROTO(inode)->have_delegation(inode, FMODE_READ))
+               nfs_set_verifier_delegated(&verf);
+out:
+       dentry->d_time = verf;
+}
+
+/**
+ * nfs_set_verifier - save a parent directory verifier in the dentry
+ * @dentry: pointer to dentry
+ * @verf: verifier to save
+ *
+ * Saves the parent directory verifier in @dentry. If the inode has
+ * a delegation, we also tag the dentry as having been revalidated
+ * while holding a delegation so that we know we don't have to
+ * look it up again after a directory change.
+ */
+void nfs_set_verifier(struct dentry *dentry, unsigned long verf)
+{
+
+       spin_lock(&dentry->d_lock);
+       nfs_set_verifier_locked(dentry, verf);
+       spin_unlock(&dentry->d_lock);
+}
+EXPORT_SYMBOL_GPL(nfs_set_verifier);
+
+#if IS_ENABLED(CONFIG_NFS_V4)
+/**
+ * nfs_clear_verifier_delegated - clear the dir verifier delegation tag
+ * @inode: pointer to inode
+ *
+ * Iterates through the dentries in the inode alias list and clears
+ * the tag used to indicate that the dentry has been revalidated
+ * while holding a delegation.
+ * This function is intended for use when the delegation is being
+ * returned or revoked.
+ */
+void nfs_clear_verifier_delegated(struct inode *inode)
+{
+       struct dentry *alias;
+
+       if (!inode)
+               return;
+       spin_lock(&inode->i_lock);
+       hlist_for_each_entry(alias, &inode->i_dentry, d_u.d_alias) {
+               spin_lock(&alias->d_lock);
+               nfs_unset_verifier_delegated(&alias->d_time);
+               spin_unlock(&alias->d_lock);
+       }
+       spin_unlock(&inode->i_lock);
+}
+EXPORT_SYMBOL_GPL(nfs_clear_verifier_delegated);
+#endif /* IS_ENABLED(CONFIG_NFS_V4) */
+
  /*
   * A check for whether or not the parent directory has changed.
   * In the case it has, we assume that the dentries are untrustworthy
@@ -1159,6 +1261,7 @@ nfs_lookup_revalidate_dentry(struct inode *dir, struct dentry *dentry,
         struct nfs_fh *fhandle;
         struct nfs_fattr *fattr;
         struct nfs4_label *label;
+       unsigned long dir_verifier;
         int ret;
  
         ret = -ENOMEM;
@@ -1168,6 +1271,7 @@ nfs_lookup_revalidate_dentry(struct inode *dir, struct dentry *dentry,
         if (fhandle == NULL || fattr == NULL || IS_ERR(label))
                 goto out;
  
+       dir_verifier = nfs_save_change_attribute(dir);
         ret = NFS_PROTO(dir)->lookup(dir, dentry, fhandle, fattr, label);
         if (ret < 0) {
                 switch (ret) {
@@ -1188,7 +1292,7 @@ nfs_lookup_revalidate_dentry(struct inode *dir, struct dentry *dentry,
                 goto out;
  
         nfs_setsecurity(inode, fattr, label);
-       nfs_set_verifier(dentry, nfs_save_change_attribute(dir));
+       nfs_set_verifier(dentry, dir_verifier);
  
         /* set a readdirplus hint that we had a cache miss */
         nfs_force_use_readdirplus(dir);
@@ -1230,7 +1334,7 @@ nfs_do_lookup_revalidate(struct inode *dir, struct dentry *dentry,
                 goto out_bad;
         }
  
-       if (NFS_PROTO(dir)->have_delegation(inode, FMODE_READ))
+       if (nfs_verifier_is_delegated(dentry))
                 return nfs_lookup_revalidate_delegated(dir, dentry, inode);
  
         /* Force a full look up iff the parent directory has changed */
@@ -1415,6 +1519,7 @@ struct dentry *nfs_lookup(struct inode *dir, struct dentry * dentry, unsigned in
         struct nfs_fh *fhandle = NULL;
         struct nfs_fattr *fattr = NULL;
         struct nfs4_label *label = NULL;
+       unsigned long dir_verifier;
         int error;
  
         dfprintk(VFS, "NFS: lookup(%pd2)\n", dentry);
@@ -1440,6 +1545,7 @@ struct dentry *nfs_lookup(struct inode *dir, struct dentry * dentry, unsigned in
         if (IS_ERR(label))
                 goto out;
  
+       dir_verifier = nfs_save_change_attribute(dir);
         trace_nfs_lookup_enter(dir, dentry, flags);
         error = NFS_PROTO(dir)->lookup(dir, dentry, fhandle, fattr, label);
         if (error == -ENOENT)
@@ -1463,7 +1569,7 @@ no_entry:
                         goto out_label;
                 dentry = res;
         }
-       nfs_set_verifier(dentry, nfs_save_change_attribute(dir));
+       nfs_set_verifier(dentry, dir_verifier);
  out_label:
         trace_nfs_lookup_exit(dir, dentry, flags, error);
         nfs4_label_free(label);
@@ -1668,7 +1774,7 @@ nfs4_do_lookup_revalidate(struct inode *dir, struct dentry *dentry,
         if (inode == NULL)
                 goto full_reval;
  
-       if (NFS_PROTO(dir)->have_delegation(inode, FMODE_READ))
+       if (nfs_verifier_is_delegated(dentry))
                 return nfs_lookup_revalidate_delegated(dir, dentry, inode);
  
         /* NFS only supports OPEN on regular files */
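
The fs/nfs/dir.c hunks above reserve bit 0 of the directory verifier as a "revalidated while holding a delegation" tag, which is why nfs_force_lookup_revalidate() now advances cache_change_attribute by 2 rather than 1. A minimal userspace sketch of that bit arithmetic, assuming the change attribute starts out even. The `verf_*` helpers and `force_revalidate` are hypothetical names that only mirror the masks used in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins mirroring the masks used in the patch. */
static unsigned long verf_set_delegated(unsigned long verf)
{
	return verf | 1UL;
}

static unsigned long verf_clear_delegated(unsigned long verf)
{
	return verf & ~1UL;
}

static bool verf_is_delegated(unsigned long verf)
{
	return (verf & 1UL) != 0;
}

/* Compare with the tag bit masked off, as the patch does. */
static bool verf_matches(unsigned long verf, unsigned long change_attr)
{
	return (verf & ~1UL) == change_attr;
}

/* Stepping by 2 keeps bit 0 free for the tag. */
static unsigned long force_revalidate(unsigned long change_attr)
{
	return change_attr + 2;
}

static void verifier_selfcheck(void)
{
	unsigned long attr = 4;
	unsigned long verf = verf_set_delegated(attr);

	assert(verf_is_delegated(verf));
	assert(verf_matches(verf, attr));	/* the tag does not break the match */
	assert(!verf_matches(verf, force_revalidate(attr)));
}
```

Because the counter only ever takes even values, a tagged verifier can never collide with a later value of the counter.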
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c

index 1309e6f47f3d69e3d8f24d466a15814aa03fff28..11bf15800ac9974204491ab0f7570ca063dbddfb 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -2114,6 +2114,7 @@ static void init_once(void *foo)
         init_rwsem(&nfsi->rmdir_sem);
         mutex_init(&nfsi->commit_mutex);
         nfs4_init_once(nfsi);
+       nfsi->cache_change_attribute = 0;
  }
  
  static int __init nfs_init_inodecache(void)
diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c

index be4eb720d5b69304d969ffb116ceca0a5b0d0b4d..1297919e0fce3173c7dee30d26d68e00a25e4125 100644
--- a/fs/nfs/nfs4file.c
+++ b/fs/nfs/nfs4file.c
@@ -87,7 +87,6 @@ nfs4_file_open(struct inode *inode, struct file *filp)
         if (inode != d_inode(dentry))
                 goto out_drop;
  
-       nfs_set_verifier(dentry, nfs_save_change_attribute(dir));
         nfs_file_set_open_context(filp, ctx);
         nfs_fscache_open_file(inode, filp);
         err = 0;
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c

index 95d07a3dc5d1d8b350c1d017b509e710bd31120b..69b7ab7a58157f4d7b3e7787d4d2b1dc9750ff7d 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2974,10 +2974,13 @@ static int _nfs4_open_and_get_state(struct nfs4_opendata *opendata,
         struct dentry *dentry;
         struct nfs4_state *state;
         fmode_t acc_mode = _nfs4_ctx_to_accessmode(ctx);
+       struct inode *dir = d_inode(opendata->dir);
+       unsigned long dir_verifier;
         unsigned int seq;
         int ret;
  
         seq = raw_seqcount_begin(&sp->so_reclaim_seqcount);
+       dir_verifier = nfs_save_change_attribute(dir);
  
         ret = _nfs4_proc_open(opendata, ctx);
         if (ret != 0)
@@ -3005,8 +3008,19 @@ static int _nfs4_open_and_get_state(struct nfs4_opendata *opendata,
                         dput(ctx->dentry);
                         ctx->dentry = dentry = alias;
                 }
-               nfs_set_verifier(dentry,
-                               nfs_save_change_attribute(d_inode(opendata->dir)));
+       }
+
+       switch(opendata->o_arg.claim) {
+       default:
+               break;
+       case NFS4_OPEN_CLAIM_NULL:
+       case NFS4_OPEN_CLAIM_DELEGATE_CUR:
+       case NFS4_OPEN_CLAIM_DELEGATE_PREV:
+               if (!opendata->rpc_done)
+                       break;
+               if (opendata->o_res.delegation_type != 0)
+                       dir_verifier = nfs_save_change_attribute(dir);
+               nfs_set_verifier(dentry, dir_verifier);
         }
  
         /* Parse layoutget results before we check for access */
@@ -5322,7 +5336,7 @@ static void nfs4_proc_write_setup(struct nfs_pgio_header *hdr,
         hdr->timestamp   = jiffies;
  
         msg->rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_WRITE];
-       nfs4_init_sequence(&hdr->args.seq_args, &hdr->res.seq_res, 1, 0);
+       nfs4_init_sequence(&hdr->args.seq_args, &hdr->res.seq_res, 0, 0);
         nfs4_state_protect_write(server->nfs_client, clnt, msg, hdr);
  }
  
diff --git a/fs/pipe.c b/fs/pipe.c

index 5a34d6c22d4cecd530cdfd1486868d577f427d9f..2144507447c5ae493230b5354bf0a0cdcd5ece2d 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -722,9 +722,10 @@ pipe_release(struct inode *inode, struct file *file)
         if (file->f_mode & FMODE_WRITE)
                 pipe->writers--;
  
-       if (pipe->readers || pipe->writers) {
-               wake_up_interruptible_sync_poll(&pipe->rd_wait, EPOLLIN | EPOLLRDNORM | EPOLLERR | EPOLLHUP);
-               wake_up_interruptible_sync_poll(&pipe->wr_wait, EPOLLOUT | EPOLLWRNORM | EPOLLERR | EPOLLHUP);
+       /* Was that the last reader or writer, but not the other side? */
+       if (!pipe->readers != !pipe->writers) {
+               wake_up_interruptible_all(&pipe->rd_wait);
+               wake_up_interruptible_all(&pipe->wr_wait);
                 kill_fasync(&pipe->fasync_readers, SIGIO, POLL_IN);
                 kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT);
         }
@@ -1026,8 +1027,8 @@ static int wait_for_partner(struct pipe_inode_info *pipe, unsigned int *cnt)
  
  static void wake_up_partner(struct pipe_inode_info *pipe)
  {
-       wake_up_interruptible(&pipe->rd_wait);
-       wake_up_interruptible(&pipe->wr_wait);
+       wake_up_interruptible_all(&pipe->rd_wait);
+       wake_up_interruptible_all(&pipe->wr_wait);
  }
  
  static int fifo_open(struct inode *inode, struct file *filp)
@@ -1144,7 +1145,7 @@ err_rd:
  
  err_wr:
         if (!--pipe->writers)
-               wake_up_interruptible(&pipe->rd_wait);
+               wake_up_interruptible_all(&pipe->rd_wait);
         ret = -ERESTARTSYS;
         goto err;
  
@@ -1271,8 +1272,9 @@ static long pipe_set_size(struct pipe_inode_info *pipe, unsigned long arg)
         pipe->max_usage = nr_slots;
         pipe->tail = tail;
         pipe->head = head;
-       wake_up_interruptible_all(&pipe->rd_wait);
-       wake_up_interruptible_all(&pipe->wr_wait);
+
+       /* This might have made more room for writers */
+       wake_up_interruptible(&pipe->wr_wait);
         return pipe->max_usage * PAGE_SIZE;
  
  out_revert_acct:
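
The pipe_release() hunk above replaces `pipe->readers || pipe->writers` with `!pipe->readers != !pipe->writers`, which is true only when exactly one side of the pipe has dropped to zero. A minimal sketch comparing the two predicates; the function names are hypothetical and the counts are passed in directly instead of being read from a struct pipe_inode_info.

```c
#include <assert.h>
#include <stdbool.h>

/* Old test: any remaining reader or writer triggers the wakeups. */
static bool old_should_wake(unsigned int readers, unsigned int writers)
{
	return readers || writers;
}

/* New test: wake only if one side is gone and the other is still there. */
static bool new_should_wake(unsigned int readers, unsigned int writers)
{
	return !readers != !writers;
}

static void pipe_wake_selfcheck(void)
{
	/* Both sides still open: the old test woke waiters, the new one does not. */
	assert(old_should_wake(2, 1) && !new_should_wake(2, 1));
	/* Last writer gone, readers remain: both tests wake. */
	assert(old_should_wake(1, 0) && new_should_wake(1, 0));
	/* Last reader gone, writers remain: both tests wake. */
	assert(old_should_wake(0, 3) && new_should_wake(0, 3));
	/* Nobody left: neither test wakes. */
	assert(!old_should_wake(0, 0) && !new_should_wake(0, 0));
}
```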
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c

index 3a688eb5c5ae4e07e70b78f58ce6261a22770340..58e937be24cee7ab151df55523ed29499938cd16 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -587,7 +587,7 @@ xfs_dax_writepages(
  
         xfs_iflags_clear(ip, XFS_ITRUNCATED);
         return dax_writeback_mapping_range(mapping,
-                       xfs_inode_buftarg(ip)->bt_bdev, wbc);
+                       xfs_inode_buftarg(ip)->bt_daxdev, wbc);
  }
  
  STATIC sector_t
diff --git a/fs/zonefs/Kconfig b/fs/zonefs/Kconfig
index fb87ad372e297d19b954a7c8187c8657a0f223cb..ef2697b78820d4634f47a258313e7ddca43daa11 100644
--- a/fs/zonefs/Kconfig
+++ b/fs/zonefs/Kconfig
@@ -2,6 +2,7 @@ config ZONEFS_FS
         tristate "zonefs filesystem support"
         depends on BLOCK
         depends on BLK_DEV_ZONED
+       select FS_IOMAP
         help
           zonefs is a simple file system which exposes zones of a zoned block
           device (e.g. host-managed or host-aware SMR disk drives) as files.
diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c
index 8bc6ef82d693e06f0dc0da790db63cda8364f989..69aee3dfb6607814cb24d14ea2cdd08642a4c56a 100644
--- a/fs/zonefs/super.c
+++ b/fs/zonefs/super.c
@@ -601,13 +601,13 @@ static ssize_t zonefs_file_dio_write(struct kiocb *iocb, struct iov_iter *from)
         ssize_t ret;
  
         /*
-        * For async direct IOs to sequential zone files, ignore IOCB_NOWAIT
+        * For async direct IOs to sequential zone files, refuse IOCB_NOWAIT
          * as this can cause write reordering (e.g. the first aio gets EAGAIN
          * on the inode lock but the second goes through but is now unaligned).
          */
-       if (zi->i_ztype == ZONEFS_ZTYPE_SEQ && !is_sync_kiocb(iocb)
-           && (iocb->ki_flags & IOCB_NOWAIT))
-               iocb->ki_flags &= ~IOCB_NOWAIT;
+       if (zi->i_ztype == ZONEFS_ZTYPE_SEQ && !is_sync_kiocb(iocb) &&
+           (iocb->ki_flags & IOCB_NOWAIT))
+               return -EOPNOTSUPP;
  
         if (iocb->ki_flags & IOCB_NOWAIT) {
                 if (!inode_trylock(inode))
diff --git a/include/acpi/acpixf.h b/include/acpi/acpixf.h
index 00994b1b8681a32f25b3dfaa0e9d4b9755b9f445..8e8be989c2a6f56e9d503580493b8ddb8b7593a5 100644
--- a/include/acpi/acpixf.h
+++ b/include/acpi/acpixf.h
@@ -752,6 +752,8 @@ ACPI_HW_DEPENDENT_RETURN_UINT32(u32 acpi_dispatch_gpe(acpi_handle gpe_device, u3
  ACPI_HW_DEPENDENT_RETURN_STATUS(acpi_status acpi_disable_all_gpes(void))
  ACPI_HW_DEPENDENT_RETURN_STATUS(acpi_status acpi_enable_all_runtime_gpes(void))
  ACPI_HW_DEPENDENT_RETURN_STATUS(acpi_status acpi_enable_all_wakeup_gpes(void))
+ACPI_HW_DEPENDENT_RETURN_UINT32(u32 acpi_any_gpe_status_set(void))
+ACPI_HW_DEPENDENT_RETURN_UINT32(u32 acpi_any_fixed_event_status_set(void))
  
  ACPI_HW_DEPENDENT_RETURN_STATUS(acpi_status
                                 acpi_get_gpe_device(u32 gpe_index,
diff --git a/include/acpi/actypes.h b/include/acpi/actypes.h
index a2583c2bc0548c8da7ec53f6c535c636661e8ced..4defed58ea338fd61e789751af92c5a04b04088a 100644
--- a/include/acpi/actypes.h
+++ b/include/acpi/actypes.h
@@ -532,11 +532,12 @@ typedef u64 acpi_integer;
          strnlen (a, ACPI_NAMESEG_SIZE) == ACPI_NAMESEG_SIZE)
  
  /*
- * Algorithm to obtain access bit width.
+ * Algorithm to obtain access bit or byte width.
   * Can be used with access_width of struct acpi_generic_address and access_size of
   * struct acpi_resource_generic_register.
   */
  #define ACPI_ACCESS_BIT_WIDTH(size)     (1 << ((size) + 2))
+#define ACPI_ACCESS_BYTE_WIDTH(size)    (1 << ((size) - 1))
  
  /*******************************************************************************
   *
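Note on the include/acpi/actypes.h hunk above: the two macros map the same access-size encoding to a width in bits and in bytes. The sketch below copies both macros from the hunk and checks that they agree for the encodings 1 (byte) through 4 (qword). The function names are hypothetical. Encoding 0 is deliberately left out, because `ACPI_ACCESS_BYTE_WIDTH(0)` would shift by a negative count, which is undefined behaviour in C.

```c
#include <assert.h>

/* Both macros are copied from the hunk above. */
#define ACPI_ACCESS_BIT_WIDTH(size)     (1 << ((size) + 2))
#define ACPI_ACCESS_BYTE_WIDTH(size)    (1 << ((size) - 1))

/*
 * Hypothetical self-check: for access-size encodings 1..4 the bit width
 * is eight times the byte width. Returns 1 on agreement, 0 otherwise.
 */
static int acpi_access_widths_agree(void)
{
	for (int size = 1; size <= 4; size++) {
		if (ACPI_ACCESS_BIT_WIDTH(size) !=
		    8 * ACPI_ACCESS_BYTE_WIDTH(size))
			return 0;
	}
	return 1;
}

/* Usage example with the two extreme encodings. */
static void acpi_access_width_examples(void)
{
	assert(ACPI_ACCESS_BIT_WIDTH(1) == 8);   /* byte access */
	assert(ACPI_ACCESS_BYTE_WIDTH(1) == 1);
	assert(ACPI_ACCESS_BIT_WIDTH(4) == 64);  /* qword access */
	assert(ACPI_ACCESS_BYTE_WIDTH(4) == 8);
}
```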
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 053ea4b519887eaf7b61baee51657eb0ea2930f1..10455b2bbbb4a18f69e58621964354824b3f1787 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -524,7 +524,7 @@ struct request_queue {
         unsigned int            sg_reserved_size;
         int                     node;
  #ifdef CONFIG_BLK_DEV_IO_TRACE
-       struct blk_trace        *blk_trace;
+       struct blk_trace __rcu  *blk_trace;
         struct mutex            blk_trace_mutex;
  #endif
         /*
diff --git a/include/linux/blktrace_api.h b/include/linux/blktrace_api.h
index 7bb2d8de9f308e367bcda4a5484f67483f8fc8e5..3b6ff5902edce65c9ec1dc1ccb0497b0ac521194 100644
--- a/include/linux/blktrace_api.h
+++ b/include/linux/blktrace_api.h
@@ -51,9 +51,13 @@ void __trace_note_message(struct blk_trace *, struct blkcg *blkcg, const char *f
   **/
  #define blk_add_cgroup_trace_msg(q, cg, fmt, ...)                      \
         do {                                                            \
-               struct blk_trace *bt = (q)->blk_trace;                  \
+               struct blk_trace *bt;                                   \
+                                                                       \
+               rcu_read_lock();                                        \
+               bt = rcu_dereference((q)->blk_trace);                   \
                 if (unlikely(bt))                                       \
                         __trace_note_message(bt, cg, fmt, ##__VA_ARGS__);\
+               rcu_read_unlock();                                      \
         } while (0)
  #define blk_add_trace_msg(q, fmt, ...)                                 \
         blk_add_cgroup_trace_msg(q, NULL, fmt, ##__VA_ARGS__)
@@ -61,10 +65,14 @@ void __trace_note_message(struct blk_trace *, struct blkcg *blkcg, const char *f
  
  static inline bool blk_trace_note_message_enabled(struct request_queue *q)
  {
-       struct blk_trace *bt = q->blk_trace;
-       if (likely(!bt))
-               return false;
-       return bt->act_mask & BLK_TC_NOTIFY;
+       struct blk_trace *bt;
+       bool ret;
+
+       rcu_read_lock();
+       bt = rcu_dereference(q->blk_trace);
+       ret = bt && (bt->act_mask & BLK_TC_NOTIFY);
+       rcu_read_unlock();
+       return ret;
  }
  
  extern void blk_add_driver_data(struct request_queue *q, struct request *rq,
diff --git a/include/linux/bootconfig.h b/include/linux/bootconfig.h
index 7e18c939663e71309ef058b27097658a11cd97c3..d11e183fcb542101417180b12e8cce8a84b8b0df 100644
--- a/include/linux/bootconfig.h
+++ b/include/linux/bootconfig.h
@@ -10,6 +10,9 @@
  #include <linux/kernel.h>
  #include <linux/types.h>
  
+#define BOOTCONFIG_MAGIC       "#BOOTCONFIG\n"
+#define BOOTCONFIG_MAGIC_LEN   12
+
  /* XBC tree node */
  struct xbc_node {
         u16 next;
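Note on the include/linux/bootconfig.h hunk above: `BOOTCONFIG_MAGIC_LEN` has to stay in sync with the string literal by hand. The sketch below copies both defines from the hunk and adds a compile-time check for that. It also adds a hypothetical helper, `buf_ends_with_bootconfig_magic()`, which assumes the magic sits at the very end of a buffer. That placement is an assumption made for illustration and is not shown in this hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Both defines are copied from the hunk above. */
#define BOOTCONFIG_MAGIC       "#BOOTCONFIG\n"
#define BOOTCONFIG_MAGIC_LEN   12

/* The length constant must match the string, excluding the trailing NUL. */
_Static_assert(sizeof(BOOTCONFIG_MAGIC) - 1 == BOOTCONFIG_MAGIC_LEN,
	       "BOOTCONFIG_MAGIC_LEN out of sync with BOOTCONFIG_MAGIC");

/*
 * Hypothetical helper (not a kernel function): report whether the last
 * BOOTCONFIG_MAGIC_LEN bytes of a buffer equal the magic string.
 */
static bool buf_ends_with_bootconfig_magic(const char *buf, size_t len)
{
	if (len < BOOTCONFIG_MAGIC_LEN)
		return false;
	return memcmp(buf + len - BOOTCONFIG_MAGIC_LEN,
		      BOOTCONFIG_MAGIC, BOOTCONFIG_MAGIC_LEN) == 0;
}

/* Usage example: 4 payload bytes followed by the 12-byte magic. */
static void bootconfig_magic_examples(void)
{
	assert(buf_ends_with_bootconfig_magic("data#BOOTCONFIG\n", 16));
	assert(!buf_ends_with_bootconfig_magic("#BOOTCONFIG", 11));
}
```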
diff --git a/include/linux/compat.h b/include/linux/compat.h
index 11083d84eb23ead8fcc62ca6277b9c904fe9bb6f..df2475be134aa5e53c4daa28fd819741282a7c99 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -248,15 +248,6 @@ typedef struct compat_siginfo {
         } _sifields;
  } compat_siginfo_t;
  
-/*
- * These functions operate on 32- or 64-bit specs depending on
- * COMPAT_USE_64BIT_TIME, hence the void user pointer arguments.
- */
-extern int compat_get_timespec(struct timespec *, const void __user *);
-extern int compat_put_timespec(const struct timespec *, void __user *);
-extern int compat_get_timeval(struct timeval *, const void __user *);
-extern int compat_put_timeval(const struct timeval *, void __user *);
-
  struct compat_iovec {
         compat_uptr_t   iov_base;
         compat_size_t   iov_len;
@@ -416,26 +407,6 @@ int copy_siginfo_to_user32(struct compat_siginfo __user *to, const kernel_siginf
  int get_compat_sigevent(struct sigevent *event,
                 const struct compat_sigevent __user *u_event);
  
-static inline int old_timeval32_compare(struct old_timeval32 *lhs,
-                                       struct old_timeval32 *rhs)
-{
-       if (lhs->tv_sec < rhs->tv_sec)
-               return -1;
-       if (lhs->tv_sec > rhs->tv_sec)
-               return 1;
-       return lhs->tv_usec - rhs->tv_usec;
-}
-
-static inline int old_timespec32_compare(struct old_timespec32 *lhs,
-                                       struct old_timespec32 *rhs)
-{
-       if (lhs->tv_sec < rhs->tv_sec)
-               return -1;
-       if (lhs->tv_sec > rhs->tv_sec)
-               return 1;
-       return lhs->tv_nsec - rhs->tv_nsec;
-}
-
  extern int get_compat_sigset(sigset_t *set, const compat_sigset_t __user *compat);
  
  /*
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 018dce868de630f0f6b2168ee8d3d095916ae121..0fb561d1b524eecc489c77e4a673df41c70f19d0 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -201,9 +201,6 @@ static inline bool policy_is_shared(struct cpufreq_policy *policy)
         return cpumask_weight(policy->cpus) > 1;
  }
  
-/* /sys/devices/system/cpu/cpufreq: entry point for global variables */
-extern struct kobject *cpufreq_global_kobject;
-
  #ifdef CONFIG_CPU_FREQ
  unsigned int cpufreq_get(unsigned int cpu);
  unsigned int cpufreq_quick_get(unsigned int cpu);
diff --git a/include/linux/dax.h b/include/linux/dax.h
index 9bd8528bd305f16c388975a5e2ee37df83e662ee..328c2dbb4409ce2fd983bf38597563a11e116990 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -129,11 +129,6 @@ static inline bool generic_fsdax_supported(struct dax_device *dax_dev,
                         sectors);
  }
  
-static inline struct dax_device *fs_dax_get_by_host(const char *host)
-{
-       return dax_get_by_host(host);
-}
-
  static inline void fs_put_dax(struct dax_device *dax_dev)
  {
         put_dax(dax_dev);
@@ -141,7 +136,7 @@ static inline void fs_put_dax(struct dax_device *dax_dev)
  
  struct dax_device *fs_dax_get_by_bdev(struct block_device *bdev);
  int dax_writeback_mapping_range(struct address_space *mapping,
-               struct block_device *bdev, struct writeback_control *wbc);
+               struct dax_device *dax_dev, struct writeback_control *wbc);
  
  struct page *dax_layout_busy_page(struct address_space *mapping);
  dax_entry_t dax_lock_page(struct page *page);
@@ -160,11 +155,6 @@ static inline bool generic_fsdax_supported(struct dax_device *dax_dev,
         return false;
  }
  
-static inline struct dax_device *fs_dax_get_by_host(const char *host)
-{
-       return NULL;
-}
-
  static inline void fs_put_dax(struct dax_device *dax_dev)
  {
  }
@@ -180,7 +170,7 @@ static inline struct page *dax_layout_busy_page(struct address_space *mapping)
  }
  
  static inline int dax_writeback_mapping_range(struct address_space *mapping,
-               struct block_device *bdev, struct writeback_control *wbc)
+               struct dax_device *dax_dev, struct writeback_control *wbc)
  {
         return -EOPNOTSUPP;
  }
diff --git a/include/linux/hid.h b/include/linux/hid.h
index cd41f209043f6dabe779040724883e14b75129c6..875f71132b1425b494ba9eb538f29543cb857753 100644
--- a/include/linux/hid.h
+++ b/include/linux/hid.h
@@ -492,7 +492,7 @@ struct hid_report_enum {
  };
  
  #define HID_MIN_BUFFER_SIZE    64              /* make sure there is at least a packet size of space */
-#define HID_MAX_BUFFER_SIZE    4096            /* 4kb */
+#define HID_MAX_BUFFER_SIZE    8192            /* 8kb */
  #define HID_CONTROL_FIFO_SIZE  256             /* to init devices with >100 reports */
  #define HID_OUTPUT_FIFO_SIZE   64
  
diff --git a/include/linux/icmpv6.h b/include/linux/icmpv6.h
index ef1cbb5f454f7aa105db534ecc1ddbb56df333ce..33d37960231441d63a1d7a3d611da916734fc2cd 100644
--- a/include/linux/icmpv6.h
+++ b/include/linux/icmpv6.h
@@ -22,12 +22,22 @@ extern int inet6_unregister_icmp_sender(ip6_icmp_send_t *fn);
  int ip6_err_gen_icmpv6_unreach(struct sk_buff *skb, int nhs, int type,
                                unsigned int data_len);
  
+#if IS_ENABLED(CONFIG_NF_NAT)
+void icmpv6_ndo_send(struct sk_buff *skb_in, u8 type, u8 code, __u32 info);
+#else
+#define icmpv6_ndo_send icmpv6_send
+#endif
+
  #else
  
  static inline void icmpv6_send(struct sk_buff *skb,
                                u8 type, u8 code, __u32 info)
  {
+}
  
+static inline void icmpv6_ndo_send(struct sk_buff *skb,
+                                  u8 type, u8 code, __u32 info)
+{
  }
  #endif
  
diff --git a/include/linux/intel-svm.h b/include/linux/intel-svm.h
index 94f047a8a845424f5ea34a73f8adb5b2ef5d57b9..d7c403d0dd27d84ecdc8d8b094739e32b85968e2 100644
--- a/include/linux/intel-svm.h
+++ b/include/linux/intel-svm.h
@@ -122,7 +122,7 @@ static inline int intel_svm_unbind_mm(struct device *dev, int pasid)
         BUG();
  }
  
-static int intel_svm_is_pasid_valid(struct device *dev, int pasid)
+static inline int intel_svm_is_pasid_valid(struct device *dev, int pasid)
  {
         return -EINVAL;
  }
diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
index b2d47571ab676fcf81a301f1df091ad33f1eb7b6..8d062e86d954e11ef5fb46a8b00bb12287b6ad8b 100644
--- a/include/linux/irqdomain.h
+++ b/include/linux/irqdomain.h
@@ -192,7 +192,7 @@ enum {
         IRQ_DOMAIN_FLAG_HIERARCHY       = (1 << 0),
  
         /* Irq domain name was allocated in __irq_domain_add() */
-       IRQ_DOMAIN_NAME_ALLOCATED       = (1 << 6),
+       IRQ_DOMAIN_NAME_ALLOCATED       = (1 << 1),
  
         /* Irq domain is an IPI domain with virq per cpu */
         IRQ_DOMAIN_FLAG_IPI_PER_CPU     = (1 << 2),
diff --git a/include/linux/ktime.h b/include/linux/ktime.h
index b2bb44f87f5a3edb6ee6f179c83fcde42a363fe5..d1fb05135665b7df5add8c3b2cb2d690fade6b7e 100644
--- a/include/linux/ktime.h
+++ b/include/linux/ktime.h
@@ -66,33 +66,15 @@ static inline ktime_t ktime_set(const s64 secs, const unsigned long nsecs)
   */
  #define ktime_sub_ns(kt, nsval)                ((kt) - (nsval))
  
-/* convert a timespec to ktime_t format: */
-static inline ktime_t timespec_to_ktime(struct timespec ts)
-{
-       return ktime_set(ts.tv_sec, ts.tv_nsec);
-}
-
  /* convert a timespec64 to ktime_t format: */
  static inline ktime_t timespec64_to_ktime(struct timespec64 ts)
  {
         return ktime_set(ts.tv_sec, ts.tv_nsec);
  }
  
-/* convert a timeval to ktime_t format: */
-static inline ktime_t timeval_to_ktime(struct timeval tv)
-{
-       return ktime_set(tv.tv_sec, tv.tv_usec * NSEC_PER_USEC);
-}
-
-/* Map the ktime_t to timespec conversion to ns_to_timespec function */
-#define ktime_to_timespec(kt)          ns_to_timespec((kt))
-
  /* Map the ktime_t to timespec conversion to ns_to_timespec function */
  #define ktime_to_timespec64(kt)                ns_to_timespec64((kt))
  
-/* Map the ktime_t to timeval conversion to ns_to_timeval function */
-#define ktime_to_timeval(kt)           ns_to_timeval((kt))
-
  /* Convert ktime_t to nanoseconds */
  static inline s64 ktime_to_ns(const ktime_t kt)
  {
@@ -215,25 +197,6 @@ static inline ktime_t ktime_sub_ms(const ktime_t kt, const u64 msec)
  
  extern ktime_t ktime_add_safe(const ktime_t lhs, const ktime_t rhs);
  
-/**
- * ktime_to_timespec_cond - convert a ktime_t variable to timespec
- *                         format only if the variable contains data
- * @kt:                the ktime_t variable to convert
- * @ts:                the timespec variable to store the result in
- *
- * Return: %true if there was a successful conversion, %false if kt was 0.
- */
-static inline __must_check bool ktime_to_timespec_cond(const ktime_t kt,
-                                                      struct timespec *ts)
-{
-       if (kt) {
-               *ts = ktime_to_timespec(kt);
-               return true;
-       } else {
-               return false;
-       }
-}
-
  /**
   * ktime_to_timespec64_cond - convert a ktime_t variable to timespec64
   *                         format only if the variable contains data
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index e89eb67356cb39b225cbfff924c1a41e49197006..bcb9b2ac0791dc341fea7d3623778351bf34ffd8 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -889,6 +889,8 @@ int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu);
  bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu);
  int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu);
  bool kvm_arch_dy_runnable(struct kvm_vcpu *vcpu);
+int kvm_arch_post_init_vm(struct kvm *kvm);
+void kvm_arch_pre_destroy_vm(struct kvm *kvm);
  
  #ifndef __KVM_HAVE_ARCH_VM_ALLOC
  /*
@@ -1342,7 +1344,7 @@ static inline void kvm_vcpu_set_dy_eligible(struct kvm_vcpu *vcpu, bool val)
  #endif /* CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT */
  
  struct kvm_vcpu *kvm_get_running_vcpu(void);
-struct kvm_vcpu __percpu **kvm_get_running_vcpus(void);
+struct kvm_vcpu * __percpu *kvm_get_running_vcpus(void);
  
  #ifdef CONFIG_HAVE_KVM_IRQ_BYPASS
  bool kvm_arch_has_irq_bypass(void);
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index ff8c9d527bb415d23d1fa6556a25e894145434b6..bfdf41537cf1facc8d24a2556e43df994c6832a4 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -688,7 +688,10 @@ struct mlx5_ifc_flow_table_nic_cap_bits {
         u8         nic_rx_multi_path_tirs[0x1];
         u8         nic_rx_multi_path_tirs_fts[0x1];
         u8         allow_sniffer_and_nic_rx_shared_tir[0x1];
-       u8         reserved_at_3[0x1d];
+       u8         reserved_at_3[0x4];
+       u8         sw_owner_reformat_supported[0x1];
+       u8         reserved_at_8[0x18];
+
         u8         encap_general_header[0x1];
         u8         reserved_at_21[0xa];
         u8         log_max_packet_reformat_context[0x5];
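Note on the include/linux/mlx5/mlx5_ifc.h hunk above: in this header each array element stands for one bit, and a `reserved_at_N` name records the bit offset N of the field. Carving `sw_owner_reformat_supported` out of the old reserved run therefore must not move any later field. The sketch below rechecks that arithmetic; the enum and `next_bit_offset()` are hypothetical names used only for this check.

```c
#include <assert.h>

/* Field widths in bits, taken from the hunk above. */
enum {
	OLD_RESERVED_AT_3 = 0x1d,	/* removed: one 29-bit reserved run */
	NEW_RESERVED_AT_3 = 0x4,	/* bits 0x3..0x6 */
	SW_OWNER_REFORMAT = 0x1,	/* bit 0x7 */
	NEW_RESERVED_AT_8 = 0x18,	/* bits 0x8..0x1f */
};

/* The split must cover exactly the bits the old reserved field covered. */
_Static_assert(NEW_RESERVED_AT_3 + SW_OWNER_REFORMAT + NEW_RESERVED_AT_8 ==
	       OLD_RESERVED_AT_3, "reserved split changes the layout");

/* Bit offset of the field that follows one at 'offset' with 'width' bits. */
static unsigned int next_bit_offset(unsigned int offset, unsigned int width)
{
	return offset + width;
}

/* Usage example: walk the three new fields in order. */
static void mlx5_reserved_split_examples(void)
{
	assert(next_bit_offset(0x3, NEW_RESERVED_AT_3) == 0x7);
	assert(next_bit_offset(0x7, SW_OWNER_REFORMAT) == 0x8);
	assert(next_bit_offset(0x8, NEW_RESERVED_AT_8) == 0x20);
}
```

The walk ends at bit 0x20, which matches the unchanged context lines: the one-bit `encap_general_header` is followed by `reserved_at_21`.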
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index a9c6b5c61d2719616fbc5c45cf37a2d967379ba4..6c3f7032e8d9d720aba0022662e72eccf6f1da2f 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -72,6 +72,8 @@ void netdev_set_default_ethtool_ops(struct net_device *dev,
  #define NET_RX_SUCCESS         0       /* keep 'em coming, baby */
  #define NET_RX_DROP            1       /* packet dropped */
  
+#define MAX_NEST_DEV 8
+
  /*
   * Transmit return codes: transmit return codes originate from three different
   * namespaces:
@@ -1616,6 +1618,7 @@ enum netdev_priv_flags {
   *                             and drivers will need to set them appropriately.
   *
   *     @mpls_features: Mask of features inheritable by MPLS
+ *     @gso_partial_features: value(s) from NETIF_F_GSO\*
   *
   *     @ifindex:       interface index
   *     @group:         The group the device belongs to
@@ -1640,8 +1643,11 @@ enum netdev_priv_flags {
   *     @netdev_ops:    Includes several pointers to callbacks,
   *                     if one wants to override the ndo_*() functions
   *     @ethtool_ops:   Management operations
+ *     @l3mdev_ops:    Layer 3 master device operations
   *     @ndisc_ops:     Includes callbacks for different IPv6 neighbour
   *                     discovery handling. Necessary for e.g. 6LoWPAN.
+ *     @xfrmdev_ops:   Transformation offload operations
+ *     @tlsdev_ops:    Transport Layer Security offload operations
   *     @header_ops:    Includes callbacks for creating,parsing,caching,etc
   *                     of Layer 2 headers.
   *
@@ -1680,6 +1686,7 @@ enum netdev_priv_flags {
   *     @dev_port:              Used to differentiate devices that share
   *                             the same function
   *     @addr_list_lock:        XXX: need comments on this one
+ *     @name_assign_type:      network interface name assignment type
   *     @uc_promisc:            Counter that indicates promiscuous mode
   *                             has been enabled due to the need to listen to
   *                             additional unicast addresses in a device that
@@ -1702,6 +1709,9 @@ enum netdev_priv_flags {
   *     @ip6_ptr:       IPv6 specific data
   *     @ax25_ptr:      AX.25 specific data
   *     @ieee80211_ptr: IEEE 802.11 specific data, assign before registering
+ *     @ieee802154_ptr: IEEE 802.15.4 low-rate Wireless Personal Area Network
+ *                      device struct
+ *     @mpls_ptr:      mpls_dev struct pointer
   *
   *     @dev_addr:      Hw address (before bcast,
   *                     because most packets are unicast)
@@ -1710,6 +1720,8 @@ enum netdev_priv_flags {
   *     @num_rx_queues:         Number of RX queues
   *                             allocated at register_netdev() time
   *     @real_num_rx_queues:    Number of RX queues currently active in device
+ *     @xdp_prog:              XDP sockets filter program pointer
+ *     @gro_flush_timeout:     timeout for GRO layer in NAPI
   *
   *     @rx_handler:            handler for received packets
   *     @rx_handler_data:       XXX: need comments on this one
@@ -1731,10 +1743,14 @@ enum netdev_priv_flags {
   *     @qdisc:                 Root qdisc from userspace point of view
   *     @tx_queue_len:          Max frames per queue allowed
   *     @tx_global_lock:        XXX: need comments on this one
+ *     @xdp_bulkq:             XDP device bulk queue
+ *     @xps_cpus_map:          all CPUs map for XPS device
+ *     @xps_rxqs_map:          all RXQs map for XPS device
   *
   *     @xps_maps:      XXX: need comments on this one
   *     @miniq_egress:          clsact qdisc specific data for
   *                             egress processing
+ *     @qdisc_hash:            qdisc hash table
   *     @watchdog_timeo:        Represents the timeout that is used by
   *                             the watchdog (see dev_watchdog())
   *     @watchdog_timer:        List of timers
@@ -3548,7 +3564,7 @@ static inline unsigned int netif_attrmask_next(int n, const unsigned long *srcp,
  }
  
  /**
- *     netif_attrmask_next_and - get the next CPU/Rx queue in *src1p & *src2p
+ *     netif_attrmask_next_and - get the next CPU/Rx queue in \*src1p & \*src2p
   *     @n: CPU/Rx queue index
   *     @src1p: the first CPUs/Rx queues mask pointer
   *     @src2p: the second CPUs/Rx queues mask pointer
@@ -4375,11 +4391,8 @@ void *netdev_lower_get_next(struct net_device *dev,
              ldev; \
              ldev = netdev_lower_get_next(dev, &(iter)))
  
-struct net_device *netdev_all_lower_get_next(struct net_device *dev,
+struct net_device *netdev_next_lower_dev_rcu(struct net_device *dev,
                                              struct list_head **iter);
-struct net_device *netdev_all_lower_get_next_rcu(struct net_device *dev,
-                                                struct list_head **iter);
-
  int netdev_walk_all_lower_dev(struct net_device *dev,
                               int (*fn)(struct net_device *lower_dev,
                                         void *data),
diff --git a/include/linux/netfilter/ipset/ip_set.h b/include/linux/netfilter/ipset/ip_set.h
index 908d38dbcb91f0b1e7040b230c2506fb3f67f14e..5448c8b443dbf52aeac423ee987d19bb63b23e76 100644
--- a/include/linux/netfilter/ipset/ip_set.h
+++ b/include/linux/netfilter/ipset/ip_set.h
@@ -121,6 +121,7 @@ struct ip_set_ext {
         u32 timeout;
         u8 packets_op;
         u8 bytes_op;
+       bool target;
  };
  
  struct ip_set;
@@ -187,6 +188,14 @@ struct ip_set_type_variant {
         /* Return true if "b" set is the same as "a"
          * according to the create set parameters */
         bool (*same_set)(const struct ip_set *a, const struct ip_set *b);
+       /* Region-locking is used */
+       bool region_lock;
+};
+
+struct ip_set_region {
+       spinlock_t lock;        /* Region lock */
+       size_t ext_size;        /* Size of the dynamic extensions */
+       u32 elements;           /* Number of elements vs timeout */
  };
  
  /* The core set type structure */
@@ -501,7 +510,7 @@ ip_set_init_skbinfo(struct ip_set_skbinfo *skbinfo,
  }
  
  #define IP_SET_INIT_KEXT(skb, opt, set)                        \
-       { .bytes = (skb)->len, .packets = 1,            \
+       { .bytes = (skb)->len, .packets = 1, .target = true,\
           .timeout = ip_set_adt_opt_timeout(opt, set) }
  
  #define IP_SET_INIT_UEXT(set)                          \
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index a5f8f03ecd59e1f6dce820247cd18b69a157be11..5d5b91e54f736b33722ffd553985471579e3be76 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -337,35 +337,17 @@ static inline int nfs_server_capable(struct inode *inode, int cap)
         return NFS_SERVER(inode)->caps & cap;
  }
  
-static inline void nfs_set_verifier(struct dentry * dentry, unsigned long verf)
-{
-       dentry->d_time = verf;
-}
-
  /**
   * nfs_save_change_attribute - Returns the inode attribute change cookie
   * @dir - pointer to parent directory inode
- * The "change attribute" is updated every time we finish an operation
- * that will result in a metadata change on the server.
+ * The "cache change attribute" is updated when we need to revalidate
+ * our dentry cache after a directory was seen to change on the server.
   */
  static inline unsigned long nfs_save_change_attribute(struct inode *dir)
  {
         return NFS_I(dir)->cache_change_attribute;
  }
  
-/**
- * nfs_verify_change_attribute - Detects NFS remote directory changes
- * @dir - pointer to parent directory inode
- * @chattr - previously saved change attribute
- * Return "false" if the verifiers doesn't match the change attribute.
- * This would usually indicate that the directory contents have changed on
- * the server, and that any dentries need revalidating.
- */
-static inline int nfs_verify_change_attribute(struct inode *dir, unsigned long chattr)
-{
-       return chattr == NFS_I(dir)->cache_change_attribute;
-}
-
  /*
   * linux/fs/nfs/inode.c
   */
@@ -495,6 +477,10 @@ extern const struct file_operations nfs_dir_operations;
  extern const struct dentry_operations nfs_dentry_operations;
  
  extern void nfs_force_lookup_revalidate(struct inode *dir);
+extern void nfs_set_verifier(struct dentry * dentry, unsigned long verf);
+#if IS_ENABLED(CONFIG_NFS_V4)
+extern void nfs_clear_verifier_delegated(struct inode *inode);
+#endif /* IS_ENABLED(CONFIG_NFS_V4) */
  extern struct dentry *nfs_add_or_obtain(struct dentry *dentry,
                         struct nfs_fh *fh, struct nfs_fattr *fattr,
                         struct nfs4_label *label);
diff --git a/include/linux/pipe_fs_i.h b/include/linux/pipe_fs_i.h
index d5765039652a5c1e94ba7d3a20795473b4711a4d..ae58fad7f1e0d8a96274cd71c77b727259a20971 100644
--- a/include/linux/pipe_fs_i.h
+++ b/include/linux/pipe_fs_i.h
@@ -29,7 +29,8 @@ struct pipe_buffer {
  /**
   *     struct pipe_inode_info - a linux kernel pipe
   *     @mutex: mutex protecting the whole thing
- *     @wait: reader/writer wait point in case of empty/full pipe
+ *     @rd_wait: reader wait point in case of empty pipe
+ *     @wr_wait: writer wait point in case of full pipe
   *     @head: The point of buffer production
   *     @tail: The point of buffer consumption
   *     @max_usage: The maximum number of slots that may be used in the ring
diff --git a/include/linux/rculist_nulls.h b/include/linux/rculist_nulls.h
index e5b752027a031b119ce09a982b93f935c736c93d..9670b54b484a6f7849a15e458a7f63246f425b2b 100644
--- a/include/linux/rculist_nulls.h
+++ b/include/linux/rculist_nulls.h
@@ -145,6 +145,13 @@ static inline void hlist_nulls_add_tail_rcu(struct hlist_nulls_node *n,
         }
  }
  
+/* after that hlist_nulls_del will work */
+static inline void hlist_nulls_add_fake(struct hlist_nulls_node *n)
+{
+       n->pprev = &n->next;
+       n->next = (struct hlist_nulls_node *)NULLS_MARKER(NULL);
+}
+
  /**
   * hlist_nulls_for_each_entry_rcu - iterate over rcu list of given type
   * @tpos:      the type * to use as a loop cursor.
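Note on the include/linux/rculist_nulls.h hunk above: `hlist_nulls_add_fake()` makes a node point at itself so that a later delete is harmless even though the node was never on a real list. The sketch below is a simplified userspace re-creation for illustration only. The node layout and the body of `hlist_nulls_add_fake()` follow the hunk, while `NULLS_MARKER()` is rewritten with `uintptr_t` and `nulls_unlink()` is a hypothetical stand-in for the kernel's unlink step without `WRITE_ONCE()` or pointer poisoning.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace copy of the kernel's nulls-list node. */
struct hlist_nulls_node {
	struct hlist_nulls_node *next, **pprev;
};

/* A "nulls" end marker is an odd pointer value: the low bit is set. */
#define NULLS_MARKER(value) ((uintptr_t)1 | ((uintptr_t)(value) << 1))

static int is_a_nulls(const struct hlist_nulls_node *ptr)
{
	return (int)((uintptr_t)ptr & 1);
}

/* Same two assignments as the function added by the hunk above. */
static void hlist_nulls_add_fake(struct hlist_nulls_node *n)
{
	n->pprev = &n->next;
	n->next = (struct hlist_nulls_node *)NULLS_MARKER(NULL);
}

/* Hypothetical simplified unlink: plain stores, no poisoning. */
static void nulls_unlink(struct hlist_nulls_node *n)
{
	struct hlist_nulls_node *next = n->next;
	struct hlist_nulls_node **pprev = n->pprev;

	*pprev = next;			/* writes back into n->next itself */
	if (!is_a_nulls(next))
		next->pprev = pprev;	/* skipped: next is a nulls marker */
}

/* Usage example: unlinking a fake-added node touches only the node. */
static void hlist_nulls_fake_example(void)
{
	struct hlist_nulls_node n;

	hlist_nulls_add_fake(&n);
	assert(n.pprev == &n.next);
	assert(is_a_nulls(n.next));
	nulls_unlink(&n);
	assert(is_a_nulls(n.next));
}
```

Because `pprev` points at the node's own `next` field and `next` is a nulls marker, the unlink writes only inside the node and never dereferences a neighbour.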
diff --git a/include/linux/sched/nohz.h b/include/linux/sched/nohz.h
index 1abe91ff6e4a2091f97fc903e56c96c9de78fb92..6d67e9a5af6bb48a77c14b85c09496181520914d 100644
--- a/include/linux/sched/nohz.h
+++ b/include/linux/sched/nohz.h
@@ -15,9 +15,11 @@ static inline void nohz_balance_enter_idle(int cpu) { }
  
  #ifdef CONFIG_NO_HZ_COMMON
  void calc_load_nohz_start(void);
+void calc_load_nohz_remote(struct rq *rq);
  void calc_load_nohz_stop(void);
  #else
  static inline void calc_load_nohz_start(void) { }
+static inline void calc_load_nohz_remote(struct rq *rq) { }
  static inline void calc_load_nohz_stop(void) { }
  #endif /* CONFIG_NO_HZ_COMMON */
  
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index ca8806b693882e9ae933d70d0bb6c4cbd3ca81bd..5b50278c4bc852b7659f9af8e392a45630111d0b 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -611,9 +611,15 @@ typedef unsigned char *sk_buff_data_t;
   *     @next: Next buffer in list
   *     @prev: Previous buffer in list
   *     @tstamp: Time we arrived/left
+ *     @skb_mstamp_ns: (aka @tstamp) earliest departure time; start point
+ *             for retransmit timer
   *     @rbnode: RB tree node, alternative to next/prev for netem/tcp
+ *     @list: queue head
   *     @sk: Socket we are owned by
+ *     @ip_defrag_offset: (aka @sk) alternate use of @sk, used in
+ *             fragmentation management
   *     @dev: Device we arrived on/are leaving by
+ *     @dev_scratch: (aka @dev) alternate use of @dev when @dev would be %NULL
   *     @cb: Control buffer. Free for use by every layer. Put private vars here
   *     @_skb_refdst: destination entry (with norefcount bit)
   *     @sp: the security path, used for xfrm
@@ -632,6 +638,9 @@ typedef unsigned char *sk_buff_data_t;
   *     @pkt_type: Packet class
   *     @fclone: skbuff clone status
   *     @ipvs_property: skbuff is owned by ipvs
+ *     @inner_protocol_type: whether the inner protocol is
+ *             ENCAP_TYPE_ETHER or ENCAP_TYPE_IPPROTO
+ *     @remcsum_offload: remote checksum offload is enabled
   *     @offload_fwd_mark: Packet was L2-forwarded in hardware
   *     @offload_l3_fwd_mark: Packet was L3-forwarded in hardware
   *     @tc_skip_classify: do not classify packet. set by IFB device
@@ -650,6 +659,8 @@ typedef unsigned char *sk_buff_data_t;
   *     @tc_index: Traffic control index
   *     @hash: the packet hash
   *     @queue_mapping: Queue mapping for multiqueue devices
+ *     @head_frag: skb was allocated from page fragments,
+ *             not allocated by kmalloc() or vmalloc().
   *     @pfmemalloc: skbuff was allocated from PFMEMALLOC reserves
   *     @active_extensions: active extensions (skb_ext_id types)
   *     @ndisc_nodetype: router type (from link layer)
@@ -660,15 +671,28 @@ typedef unsigned char *sk_buff_data_t;
   *     @wifi_acked_valid: wifi_acked was set
   *     @wifi_acked: whether frame was acked on wifi or not
   *     @no_fcs:  Request NIC to treat last 4 bytes as Ethernet FCS
+ *     @encapsulation: indicates the inner headers in the skbuff are valid
+ *     @encap_hdr_csum: software checksum is needed
+ *     @csum_valid: checksum is already valid
   *     @csum_not_inet: use CRC32c to resolve CHECKSUM_PARTIAL
+ *     @csum_complete_sw: checksum was completed by software
+ *     @csum_level: indicates the number of consecutive checksums found in
+ *             the packet minus one that have been verified as
+ *             CHECKSUM_UNNECESSARY (max 3)
   *     @dst_pending_confirm: need to confirm neighbour
   *     @decrypted: Decrypted SKB
   *     @napi_id: id of the NAPI struct this skb came from
+ *     @sender_cpu: (aka @napi_id) source CPU in XPS
   *     @secmark: security marking
   *     @mark: Generic packet mark
+ *     @reserved_tailroom: (aka @mark) number of bytes of free space available
+ *             at the tail of an sk_buff
+ *     @vlan_present: VLAN tag is present
   *     @vlan_proto: vlan encapsulation protocol
   *     @vlan_tci: vlan tag control information
   *     @inner_protocol: Protocol (encapsulation)
+ *     @inner_ipproto: (aka @inner_protocol) stores ipproto when
+ *             skb->inner_protocol_type == ENCAP_TYPE_IPPROTO;
   *     @inner_transport_header: Inner transport layer header (encapsulation)
   *     @inner_network_header: Network layer header (encapsulation)
   *     @inner_mac_header: Link layer header (encapsulation)
@@ -750,7 +774,9 @@ struct sk_buff {
  #endif
  #define CLONED_OFFSET()                offsetof(struct sk_buff, __cloned_offset)
  
+       /* private: */
         __u8                    __cloned_offset[0];
+       /* public: */
         __u8                    cloned:1,
                                 nohdr:1,
                                 fclone:2,
@@ -775,7 +801,9 @@ struct sk_buff {
  #endif
  #define PKT_TYPE_OFFSET()      offsetof(struct sk_buff, __pkt_type_offset)
  
+       /* private: */
         __u8                    __pkt_type_offset[0];
+       /* public: */
         __u8                    pkt_type:3;
         __u8                    ignore_df:1;
         __u8                    nf_trace:1;
@@ -798,7 +826,9 @@ struct sk_buff {
  #define PKT_VLAN_PRESENT_BIT   0
  #endif
  #define PKT_VLAN_PRESENT_OFFSET()      offsetof(struct sk_buff, __pkt_vlan_present_offset)
+       /* private: */
         __u8                    __pkt_vlan_present_offset[0];
+       /* public: */
         __u8                    vlan_present:1;
         __u8                    csum_complete_sw:1;
         __u8                    csum_level:2;
diff --git a/include/linux/suspend.h b/include/linux/suspend.h
index 4a230c2f1c317ab87c725afee60cb2a3e275a308..2b2055b035eee13b44b8021f540424758f3a52eb 100644
--- a/include/linux/suspend.h
+++ b/include/linux/suspend.h
@@ -191,7 +191,7 @@ struct platform_s2idle_ops {
         int (*begin)(void);
         int (*prepare)(void);
         int (*prepare_late)(void);
-       void (*wake)(void);
+       bool (*wake)(void);
         void (*restore_early)(void);
         void (*restore)(void);
         void (*end)(void);
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index cde3dc18e21a2cdee5ef36183d180946e45be0ea..046bb94bd4d61da998a809a82eebdc22691c0ee7 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -64,6 +64,9 @@ extern void swiotlb_tbl_sync_single(struct device *hwdev,
                                     size_t size, enum dma_data_direction dir,
                                     enum dma_sync_target target);
  
+dma_addr_t swiotlb_map(struct device *dev, phys_addr_t phys,
+               size_t size, enum dma_data_direction dir, unsigned long attrs);
+
  #ifdef CONFIG_SWIOTLB
  extern enum swiotlb_force swiotlb_force;
  extern phys_addr_t io_tlb_start, io_tlb_end;
@@ -73,8 +76,6 @@ static inline bool is_swiotlb_buffer(phys_addr_t paddr)
         return paddr >= io_tlb_start && paddr < io_tlb_end;
  }
  
-bool swiotlb_map(struct device *dev, phys_addr_t *phys, dma_addr_t *dma_addr,
-               size_t size, enum dma_data_direction dir, unsigned long attrs);
  void __init swiotlb_exit(void);
  unsigned int swiotlb_max_segment(void);
  size_t swiotlb_max_mapping_size(struct device *dev);
@@ -85,12 +86,6 @@ static inline bool is_swiotlb_buffer(phys_addr_t paddr)
  {
         return false;
  }
-static inline bool swiotlb_map(struct device *dev, phys_addr_t *phys,
-               dma_addr_t *dma_addr, size_t size, enum dma_data_direction dir,
-               unsigned long attrs)
-{
-       return false;
-}
  static inline void swiotlb_exit(void)
  {
  }
diff --git a/include/linux/time32.h b/include/linux/time32.h
index cad4c318600213430212947cbec7a63859654f3c..cf9320cd2d0bda2ab247e234dbdd53a778eeec24 100644
--- a/include/linux/time32.h
+++ b/include/linux/time32.h
@@ -12,8 +12,6 @@
  #include <linux/time64.h>
  #include <linux/timex.h>
  
-#define TIME_T_MAX     (__kernel_old_time_t)((1UL << ((sizeof(__kernel_old_time_t) << 3) - 1)) - 1)
-
  typedef s32            old_time32_t;
  
  struct old_timespec32 {
@@ -73,162 +71,12 @@ struct __kernel_timex;
  int get_old_timex32(struct __kernel_timex *, const struct old_timex32 __user *);
  int put_old_timex32(struct old_timex32 __user *, const struct __kernel_timex *);
  
-#if __BITS_PER_LONG == 64
-
-/* timespec64 is defined as timespec here */
-static inline struct timespec timespec64_to_timespec(const struct timespec64 ts64)
-{
-       return *(const struct timespec *)&ts64;
-}
-
-static inline struct timespec64 timespec_to_timespec64(const struct timespec ts)
-{
-       return *(const struct timespec64 *)&ts;
-}
-
-#else
-static inline struct timespec timespec64_to_timespec(const struct timespec64 ts64)
-{
-       struct timespec ret;
-
-       ret.tv_sec = (time_t)ts64.tv_sec;
-       ret.tv_nsec = ts64.tv_nsec;
-       return ret;
-}
-
-static inline struct timespec64 timespec_to_timespec64(const struct timespec ts)
-{
-       struct timespec64 ret;
-
-       ret.tv_sec = ts.tv_sec;
-       ret.tv_nsec = ts.tv_nsec;
-       return ret;
-}
-#endif
-
-static inline int timespec_equal(const struct timespec *a,
-                                const struct timespec *b)
-{
-       return (a->tv_sec == b->tv_sec) && (a->tv_nsec == b->tv_nsec);
-}
-
-/*
- * lhs < rhs:  return <0
- * lhs == rhs: return 0
- * lhs > rhs:  return >0
- */
-static inline int timespec_compare(const struct timespec *lhs, const struct timespec *rhs)
-{
-       if (lhs->tv_sec < rhs->tv_sec)
-               return -1;
-       if (lhs->tv_sec > rhs->tv_sec)
-               return 1;
-       return lhs->tv_nsec - rhs->tv_nsec;
-}
-
-/*
- * Returns true if the timespec is norm, false if denorm:
- */
-static inline bool timespec_valid(const struct timespec *ts)
-{
-       /* Dates before 1970 are bogus */
-       if (ts->tv_sec < 0)
-               return false;
-       /* Can't have more nanoseconds then a second */
-       if ((unsigned long)ts->tv_nsec >= NSEC_PER_SEC)
-               return false;
-       return true;
-}
-
-/**
- * timespec_to_ns - Convert timespec to nanoseconds
- * @ts:                pointer to the timespec variable to be converted
- *
- * Returns the scalar nanosecond representation of the timespec
- * parameter.
- */
-static inline s64 timespec_to_ns(const struct timespec *ts)
-{
-       return ((s64) ts->tv_sec * NSEC_PER_SEC) + ts->tv_nsec;
-}
-
  /**
- * ns_to_timespec - Convert nanoseconds to timespec
- * @nsec:      the nanoseconds value to be converted
- *
- * Returns the timespec representation of the nsec parameter.
- */
-extern struct timespec ns_to_timespec(const s64 nsec);
-
-/**
- * timespec_add_ns - Adds nanoseconds to a timespec
- * @a:         pointer to timespec to be incremented
- * @ns:                unsigned nanoseconds value to be added
- *
- * This must always be inlined because its used from the x86-64 vdso,
- * which cannot call other kernel functions.
- */
-static __always_inline void timespec_add_ns(struct timespec *a, u64 ns)
-{
-       a->tv_sec += __iter_div_u64_rem(a->tv_nsec + ns, NSEC_PER_SEC, &ns);
-       a->tv_nsec = ns;
-}
-
-static inline unsigned long mktime(const unsigned int year,
-                       const unsigned int mon, const unsigned int day,
-                       const unsigned int hour, const unsigned int min,
-                       const unsigned int sec)
-{
-       return mktime64(year, mon, day, hour, min, sec);
-}
-
-static inline bool timeval_valid(const struct timeval *tv)
-{
-       /* Dates before 1970 are bogus */
-       if (tv->tv_sec < 0)
-               return false;
-
-       /* Can't have more microseconds then a second */
-       if (tv->tv_usec < 0 || tv->tv_usec >= USEC_PER_SEC)
-               return false;
-
-       return true;
-}
-
-/**
- * timeval_to_ns - Convert timeval to nanoseconds
- * @ts:                pointer to the timeval variable to be converted
- *
- * Returns the scalar nanosecond representation of the timeval
- * parameter.
- */
-static inline s64 timeval_to_ns(const struct timeval *tv)
-{
-       return ((s64) tv->tv_sec * NSEC_PER_SEC) +
-               tv->tv_usec * NSEC_PER_USEC;
-}
-
-/**
- * ns_to_timeval - Convert nanoseconds to timeval
+ * ns_to_kernel_old_timeval - Convert nanoseconds to timeval
   * @nsec:      the nanoseconds value to be converted
   *
   * Returns the timeval representation of the nsec parameter.
   */
-extern struct timeval ns_to_timeval(const s64 nsec);
  extern struct __kernel_old_timeval ns_to_kernel_old_timeval(s64 nsec);
  
-/*
- * Old names for the 32-bit time_t interfaces, these will be removed
- * when everything uses the new names.
- */
-#define compat_time_t          old_time32_t
-#define compat_timeval         old_timeval32
-#define compat_timespec                old_timespec32
-#define compat_itimerspec      old_itimerspec32
-#define ns_to_compat_timeval   ns_to_old_timeval32
-#define get_compat_itimerspec64        get_old_itimerspec32
-#define put_compat_itimerspec64        put_old_itimerspec32
-#define compat_get_timespec64  get_old_timespec32
-#define compat_put_timespec64  put_old_timespec32
-
  #endif
diff --git a/include/linux/timekeeping32.h b/include/linux/timekeeping32.h
index cc59cc9e0e841dc8980882837afb03a5ae10a51e..266017fc9ee9c194bd5013161a442007e23cd9cf 100644
--- a/include/linux/timekeeping32.h
+++ b/include/linux/timekeeping32.h
@@ -11,36 +11,4 @@ static inline unsigned long get_seconds(void)
         return ktime_get_real_seconds();
  }
  
-static inline void getnstimeofday(struct timespec *ts)
-{
-       struct timespec64 ts64;
-
-       ktime_get_real_ts64(&ts64);
-       *ts = timespec64_to_timespec(ts64);
-}
-
-static inline void ktime_get_ts(struct timespec *ts)
-{
-       struct timespec64 ts64;
-
-       ktime_get_ts64(&ts64);
-       *ts = timespec64_to_timespec(ts64);
-}
-
-static inline void getrawmonotonic(struct timespec *ts)
-{
-       struct timespec64 ts64;
-
-       ktime_get_raw_ts64(&ts64);
-       *ts = timespec64_to_timespec(ts64);
-}
-
-static inline void getboottime(struct timespec *ts)
-{
-       struct timespec64 ts64;
-
-       getboottime64(&ts64);
-       *ts = timespec64_to_timespec(ts64);
-}
-
  #endif
diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index af2c85d3a1dde028e7fad140da14ef62f56a6127..6c7a10a6d71e59ead3e12003399b4f97993ebc6e 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -440,7 +440,7 @@ struct synth_event_trace_state {
         struct synth_event *event;
         unsigned int cur_field;
         unsigned int n_u64;
-       bool enabled;
+       bool disabled;
         bool add_next;
         bool add_name;
  };
diff --git a/include/linux/tty.h b/include/linux/tty.h
index bfa4e2ee94a9de340321ecdd4141272643ca9115..bd5fe0e907e8c1c219a496120d49a0fa42b8c736 100644
--- a/include/linux/tty.h
+++ b/include/linux/tty.h
@@ -225,6 +225,8 @@ struct tty_port_client_operations {
         void (*write_wakeup)(struct tty_port *port);
  };
  
+extern const struct tty_port_client_operations tty_port_default_client_ops;
+
  struct tty_port {
         struct tty_bufhead      buf;            /* Locked internally */
         struct tty_struct       *tty;           /* Back pointer */
diff --git a/include/linux/types.h b/include/linux/types.h
index eb870ad42919de7fb1e6a41ded184460b0cb0b81..d3021c87917953be758958fdf0fef4731702224c 100644
--- a/include/linux/types.h
+++ b/include/linux/types.h
@@ -65,11 +65,6 @@ typedef __kernel_ssize_t     ssize_t;
  typedef __kernel_ptrdiff_t     ptrdiff_t;
  #endif
  
-#ifndef _TIME_T
-#define _TIME_T
-typedef __kernel_old_time_t    time_t;
-#endif
-
  #ifndef _CLOCK_T
  #define _CLOCK_T
  typedef __kernel_clock_t       clock_t;
diff --git a/include/linux/usb/quirks.h b/include/linux/usb/quirks.h
index a1be64c9940fb4ad4e2365f226419f050d6addba..22c1f579afe3020b22ed36a40523ab34c08a3292 100644
--- a/include/linux/usb/quirks.h
+++ b/include/linux/usb/quirks.h
@@ -69,4 +69,7 @@
  /* Hub needs extra delay after resetting its port. */
  #define USB_QUIRK_HUB_SLOW_RESET               BIT(14)
  
+/* device has blacklisted endpoints */
+#define USB_QUIRK_ENDPOINT_BLACKLIST           BIT(15)
+
  #endif /* __LINUX_USB_QUIRKS_H */
diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
index d93017a7ce5c147a3efc3083ba37f50a619ab73a..62838391582760a6e4101a2c9dde0b3dd9922218 100644
--- a/include/net/flow_dissector.h
+++ b/include/net/flow_dissector.h
@@ -5,6 +5,7 @@
  #include <linux/types.h>
  #include <linux/in6.h>
  #include <linux/siphash.h>
+#include <linux/string.h>
  #include <uapi/linux/if_ether.h>
  
  struct sk_buff;
@@ -33,7 +34,6 @@ enum flow_dissect_ret {
  
  /**
   * struct flow_dissector_key_basic:
- * @thoff: Transport header offset
   * @n_proto: Network header protocol (eg. IPv4/IPv6)
   * @ip_proto: Transport header protocol (eg. TCP/UDP)
   */
@@ -349,4 +349,12 @@ struct bpf_flow_dissector {
         void                    *data_end;
  };
  
+static inline void
+flow_dissector_init_keys(struct flow_dissector_key_control *key_control,
+                        struct flow_dissector_key_basic *key_basic)
+{
+       memset(key_control, 0, sizeof(*key_control));
+       memset(key_basic, 0, sizeof(*key_basic));
+}
+
  #endif
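Note on the flow_dissector.h hunk above: the added helper clears both key structures with memset(). The following is a minimal sketch of that effect, not the kernel code. The two structs are simplified, hypothetical stand-ins for the real key types, and all_zero() is a hypothetical checker added only for illustration. The diff itself does not state why the helper was introduced; the sketch only shows that every byte of both objects ends up zero.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the two flow dissector key structs. */
struct key_control_sketch {
	uint16_t thoff;
	uint16_t addr_type;
	uint32_t flags;
};

struct key_basic_sketch {
	uint16_t n_proto;
	uint8_t  ip_proto;
	uint8_t  padding;
};

/* Same shape as the helper in the hunk: zero both objects in full. */
static inline void init_keys_sketch(struct key_control_sketch *kc,
				    struct key_basic_sketch *kb)
{
	memset(kc, 0, sizeof(*kc));
	memset(kb, 0, sizeof(*kb));
}

/* Returns 1 if all n bytes starting at p are zero, 0 otherwise. */
static int all_zero(const void *p, size_t n)
{
	const unsigned char *b = p;
	size_t i;

	for (i = 0; i < n; i++)
		if (b[i] != 0)
			return 0;
	return 1;
}
```

A caller that keeps both keys on the stack would call init_keys_sketch() once before handing them to any code that reads fields it did not set itself.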
diff --git a/include/net/icmp.h b/include/net/icmp.h
index 5d4bfdba9adf03759affb01617245a290e339a8f..9ac2d2672a938626b28ccbd7e8d5095bcf2910f5 100644
--- a/include/net/icmp.h
+++ b/include/net/icmp.h
@@ -43,6 +43,12 @@ static inline void icmp_send(struct sk_buff *skb_in, int type, int code, __be32
         __icmp_send(skb_in, type, code, info, &IPCB(skb_in)->opt);
  }
  
+#if IS_ENABLED(CONFIG_NF_NAT)
+void icmp_ndo_send(struct sk_buff *skb_in, int type, int code, __be32 info);
+#else
+#define icmp_ndo_send icmp_send
+#endif
+
  int icmp_rcv(struct sk_buff *skb);
  int icmp_err(struct sk_buff *skb, u32 info);
  int icmp_init(void);
diff --git a/include/net/mac80211.h b/include/net/mac80211.h
index aa145808e57a2377994ec4b9b0c858dbd98fbdd5..77e6b5a83b065fff4e80cec8dd86227d248e4a0b 100644
--- a/include/net/mac80211.h
+++ b/include/net/mac80211.h
@@ -1004,12 +1004,11 @@ ieee80211_rate_get_vht_nss(const struct ieee80211_tx_rate *rate)
  struct ieee80211_tx_info {
         /* common information */
         u32 flags;
-       u8 band;
-
-       u8 hw_queue;
-
-       u16 ack_frame_id:6;
-       u16 tx_time_est:10;
+       u32 band:3,
+           ack_frame_id:13,
+           hw_queue:4,
+           tx_time_est:10;
+       /* 2 free bits */
  
         union {
                 struct {
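Note on the mac80211.h hunk above: the four fields are repacked into one 32-bit bitfield group of 3 + 13 + 4 + 10 = 30 bits, which leaves the 2 free bits mentioned in the comment. The sketch below reproduces only that packing in a hypothetical stand-in struct (struct tx_info_head_sketch is not the kernel type) and derives the largest value each width can hold. Bitfield layout is implementation-defined in C, so the size noted afterwards is an assumption about common GCC/Clang ABIs rather than a guarantee.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the first members of struct ieee80211_tx_info. */
struct tx_info_head_sketch {
	uint32_t flags;
	uint32_t band:3,
		 ack_frame_id:13,
		 hw_queue:4,
		 tx_time_est:10;
	/* 2 free bits */
};

/* Largest value representable in a field of the given bit width. */
#define FIELD_MAX(bits)	((1u << (bits)) - 1u)

/* Fill every bitfield with its maximum value. */
static inline struct tx_info_head_sketch tx_info_head_fill_max(void)
{
	struct tx_info_head_sketch h = { 0 };

	h.band = FIELD_MAX(3);
	h.ack_frame_id = FIELD_MAX(13);
	h.hw_queue = FIELD_MAX(4);
	h.tx_time_est = FIELD_MAX(10);
	return h;
}
```

With the old layout ack_frame_id was 6 bits wide (maximum 63); with 13 bits the maximum becomes 8191, while band shrinks to 3 bits and hw_queue to 4. On the ABIs assumed here the sketch struct occupies two 32-bit words.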
diff --git a/include/net/sock.h b/include/net/sock.h
index 02162b0378f73f9221aec78e7adedd7124ef652b..328564525526914f9d7aed9216a2fd5f8c255af4 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -117,19 +117,26 @@ typedef __u64 __bitwise __addrpair;
   *     struct sock_common - minimal network layer representation of sockets
   *     @skc_daddr: Foreign IPv4 addr
   *     @skc_rcv_saddr: Bound local IPv4 addr
+ *     @skc_addrpair: 8-byte-aligned __u64 union of @skc_daddr & @skc_rcv_saddr
   *     @skc_hash: hash value used with various protocol lookup tables
   *     @skc_u16hashes: two u16 hash values used by UDP lookup tables
   *     @skc_dport: placeholder for inet_dport/tw_dport
   *     @skc_num: placeholder for inet_num/tw_num
+ *     @skc_portpair: __u32 union of @skc_dport & @skc_num
   *     @skc_family: network address family
   *     @skc_state: Connection state
   *     @skc_reuse: %SO_REUSEADDR setting
   *     @skc_reuseport: %SO_REUSEPORT setting
+ *     @skc_ipv6only: socket is IPV6 only
+ *     @skc_net_refcnt: socket is using net ref counting
   *     @skc_bound_dev_if: bound device index if != 0
   *     @skc_bind_node: bind hash linkage for various protocol lookup tables
   *     @skc_portaddr_node: second hash linkage for UDP/UDP-Lite protocol
   *     @skc_prot: protocol handlers inside a network family
   *     @skc_net: reference to the network namespace of this socket
+ *     @skc_v6_daddr: IPV6 destination address
+ *     @skc_v6_rcv_saddr: IPV6 source address
+ *     @skc_cookie: socket's cookie value
   *     @skc_node: main hash linkage for various protocol lookup tables
   *     @skc_nulls_node: main hash linkage for TCP/UDP/UDP-Lite protocol
   *     @skc_tx_queue_mapping: tx queue number for this connection
@@ -137,7 +144,15 @@ typedef __u64 __bitwise __addrpair;
   *     @skc_flags: place holder for sk_flags
   *             %SO_LINGER (l_onoff), %SO_BROADCAST, %SO_KEEPALIVE,
   *             %SO_OOBINLINE settings, %SO_TIMESTAMPING settings
+ *     @skc_listener: connection request listener socket (aka rsk_listener)
+ *             [union with @skc_flags]
+ *     @skc_tw_dr: (aka tw_dr) ptr to &struct inet_timewait_death_row
+ *             [union with @skc_flags]
   *     @skc_incoming_cpu: record/match cpu processing incoming packets
+ *     @skc_rcv_wnd: (aka rsk_rcv_wnd) TCP receive window size (possibly scaled)
+ *             [union with @skc_incoming_cpu]
+ *     @skc_tw_rcv_nxt: (aka tw_rcv_nxt) TCP window next expected seq number
+ *             [union with @skc_incoming_cpu]
   *     @skc_refcnt: reference count
   *
   *     This is the minimal network layer representation of sockets, the header
@@ -245,6 +260,7 @@ struct bpf_sk_storage;
    *    @sk_dst_cache: destination cache
    *    @sk_dst_pending_confirm: need to confirm neighbour
    *    @sk_policy: flow policy
+  *    @sk_rx_skb_cache: cache copy of recently accessed RX skb
    *    @sk_receive_queue: incoming packets
    *    @sk_wmem_alloc: transmit queue bytes committed
    *    @sk_tsq_flags: TCP Small Queues flags
@@ -265,6 +281,8 @@ struct bpf_sk_storage;
    *    @sk_no_check_rx: allow zero checksum in RX packets
    *    @sk_route_caps: route capabilities (e.g. %NETIF_F_TSO)
    *    @sk_route_nocaps: forbidden route capabilities (e.g NETIF_F_GSO_MASK)
+  *    @sk_route_forced_caps: static, forced route capabilities
+  *            (set in tcp_init_sock())
    *    @sk_gso_type: GSO type (e.g. %SKB_GSO_TCPV4)
    *    @sk_gso_max_size: Maximum GSO segment size to build
    *    @sk_gso_max_segs: Maximum number of GSO segments
@@ -303,6 +321,8 @@ struct bpf_sk_storage;
    *    @sk_frag: cached page frag
    *    @sk_peek_off: current peek_offset value
    *    @sk_send_head: front of stuff to transmit
+  *    @tcp_rtx_queue: TCP re-transmit queue [union with @sk_send_head]
+  *    @sk_tx_skb_cache: cache copy of recently accessed TX skb
    *    @sk_security: used by security modules
    *    @sk_mark: generic packet mark
    *    @sk_cgrp_data: cgroup data for this cgroup
@@ -313,11 +333,14 @@ struct bpf_sk_storage;
    *    @sk_write_space: callback to indicate there is bf sending space available
    *    @sk_error_report: callback to indicate errors (e.g. %MSG_ERRQUEUE)
    *    @sk_backlog_rcv: callback to process the backlog
+  *    @sk_validate_xmit_skb: ptr to an optional validate function
    *    @sk_destruct: called at sock freeing time, i.e. when all refcnt == 0
    *    @sk_reuseport_cb: reuseport group container
+  *    @sk_bpf_storage: ptr to cache and control for bpf_sk_storage
    *    @sk_rcu: used during RCU grace period
    *    @sk_clockid: clockid used by time-based scheduling (SO_TXTIME)
    *    @sk_txtime_deadline_mode: set deadline mode for SO_TXTIME
+  *    @sk_txtime_report_errors: set report errors mode for SO_TXTIME
    *    @sk_txtime_unused: unused txtime flags
    */
  struct sock {
@@ -393,7 +416,9 @@ struct sock {
         struct sk_filter __rcu  *sk_filter;
         union {
                 struct socket_wq __rcu  *sk_wq;
+               /* private: */
                 struct socket_wq        *sk_wq_raw;
+               /* public: */
         };
  #ifdef CONFIG_XFRM
         struct xfrm_policy __rcu *sk_policy[2];
@@ -2017,7 +2042,7 @@ static inline int skb_copy_to_page_nocache(struct sock *sk, struct iov_iter *fro
   * sk_wmem_alloc_get - returns write allocations
   * @sk: socket
   *
- * Returns sk_wmem_alloc minus initial offset of one
+ * Return: sk_wmem_alloc minus initial offset of one
   */
  static inline int sk_wmem_alloc_get(const struct sock *sk)
  {
@@ -2028,7 +2053,7 @@ static inline int sk_wmem_alloc_get(const struct sock *sk)
   * sk_rmem_alloc_get - returns read allocations
   * @sk: socket
   *
- * Returns sk_rmem_alloc
+ * Return: sk_rmem_alloc
   */
  static inline int sk_rmem_alloc_get(const struct sock *sk)
  {
@@ -2039,7 +2064,7 @@ static inline int sk_rmem_alloc_get(const struct sock *sk)
   * sk_has_allocations - check if allocations are outstanding
   * @sk: socket
   *
- * Returns true if socket has write or read allocations
+ * Return: true if socket has write or read allocations
   */
  static inline bool sk_has_allocations(const struct sock *sk)
  {
@@ -2050,7 +2075,7 @@ static inline bool sk_has_allocations(const struct sock *sk)
   * skwq_has_sleeper - check if there are any waiting processes
   * @wq: struct socket_wq
   *
- * Returns true if socket_wq has waiting processes
+ * Return: true if socket_wq has waiting processes
   *
   * The purpose of the skwq_has_sleeper and sock_poll_wait is to wrap the memory
   * barrier call. They were added due to the race found within the tcp code.
@@ -2238,6 +2263,9 @@ struct sk_buff *sk_stream_alloc_skb(struct sock *sk, int size, gfp_t gfp,
   * gfpflags_allow_blocking() isn't enough here as direct reclaim may nest
   * inside other socket operations and end up recursing into sk_page_frag()
   * while it's already in use.
+ *
+ * Return: a per task page_frag if context allows that,
+ * otherwise a per socket one.
   */
  static inline struct page_frag *sk_page_frag(struct sock *sk)
  {
@@ -2432,6 +2460,7 @@ static inline void skb_setup_tx_timestamp(struct sk_buff *skb, __u16 tsflags)
                            &skb_shinfo(skb)->tskey);
  }
  
+DECLARE_STATIC_KEY_FALSE(tcp_rx_skb_cache_key);
  /**
   * sk_eat_skb - Release a skb if it is no longer needed
   * @sk: socket to eat this skb from
@@ -2440,7 +2469,6 @@ static inline void skb_setup_tx_timestamp(struct sk_buff *skb, __u16 tsflags)
   * This routine must be called with interrupts disabled or with the socket
   * locked so that the sk_buff queue operation is ok.
  */
-DECLARE_STATIC_KEY_FALSE(tcp_rx_skb_cache_key);
  static inline void sk_eat_skb(struct sock *sk, struct sk_buff *skb)
  {
         __skb_unlink(skb, &sk->sk_receive_queue);
diff --git a/include/scsi/iscsi_proto.h b/include/scsi/iscsi_proto.h
index 533f56733ba84053e1ecb6f30d2fa344b900dda2..b71b5c4f418c5e5e5460bcbcbdb9742226ce70c3 100644
--- a/include/scsi/iscsi_proto.h
+++ b/include/scsi/iscsi_proto.h
@@ -627,7 +627,6 @@ struct iscsi_reject {
  #define ISCSI_REASON_BOOKMARK_INVALID  9
  #define ISCSI_REASON_BOOKMARK_NO_RESOURCES     10
  #define ISCSI_REASON_NEGOTIATION_RESET 11
-#define ISCSI_REASON_WAITING_FOR_LOGOUT        12
  
  /* Max. number of Key=Value pairs in a text message */
  #define MAX_KEY_VALUE_PAIRS    8192
diff --git a/include/sound/rawmidi.h b/include/sound/rawmidi.h
index 40ab20439fee21d650fdb9d72beb817ceb4b71aa..a36b7227a15ad5dee698a60d3c15d5fae63c04ee 100644
--- a/include/sound/rawmidi.h
+++ b/include/sound/rawmidi.h
@@ -77,9 +77,9 @@ struct snd_rawmidi_substream {
         struct list_head list;          /* list of all substream for given stream */
         int stream;                     /* direction */
         int number;                     /* substream number */
-       unsigned int opened: 1,         /* open flag */
-                    append: 1,         /* append flag (merge more streams) */
-                    active_sensing: 1; /* send active sensing when close */
+       bool opened;                    /* open flag */
+       bool append;                    /* append flag (merge more streams) */
+       bool active_sensing;            /* send active sensing when close */
         int use_count;                  /* use counter (for output) */
         size_t bytes;
         struct snd_rawmidi *rmidi;
diff --git a/include/sound/soc-dapm.h b/include/sound/soc-dapm.h
index 2a306c6f3fbcb7595302d9a30f70d9f62a05ca65..1b6afbc1a4ed10676c0bc188b1065d1936c2427c 100644
--- a/include/sound/soc-dapm.h
+++ b/include/sound/soc-dapm.h
@@ -392,8 +392,6 @@ int snd_soc_dapm_get_enum_double(struct snd_kcontrol *kcontrol,
         struct snd_ctl_elem_value *ucontrol);
  int snd_soc_dapm_put_enum_double(struct snd_kcontrol *kcontrol,
         struct snd_ctl_elem_value *ucontrol);
-int snd_soc_dapm_put_enum_double_locked(struct snd_kcontrol *kcontrol,
-       struct snd_ctl_elem_value *ucontrol);
  int snd_soc_dapm_info_pin_switch(struct snd_kcontrol *kcontrol,
         struct snd_ctl_elem_info *uinfo);
  int snd_soc_dapm_get_pin_switch(struct snd_kcontrol *kcontrol,
diff --git a/include/uapi/asm-generic/posix_types.h b/include/uapi/asm-generic/posix_types.h
index 2f9c80595ba771e41767beb5d814b3204a39d332..b5f7594eee7ab26ced0d3041ffa04a2662a39818 100644
--- a/include/uapi/asm-generic/posix_types.h
+++ b/include/uapi/asm-generic/posix_types.h
@@ -87,7 +87,9 @@ typedef struct {
  typedef __kernel_long_t        __kernel_off_t;
  typedef long long      __kernel_loff_t;
  typedef __kernel_long_t        __kernel_old_time_t;
+#ifndef __KERNEL__
  typedef __kernel_long_t        __kernel_time_t;
+#endif
  typedef long long __kernel_time64_t;
  typedef __kernel_long_t        __kernel_clock_t;
  typedef int            __kernel_timer_t;
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index f1d74a2bd23493635afcbd6a7336c2fd2806f471..22f235260a3a352cf4223249fc1d26f2fce62f34 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1045,9 +1045,9 @@ union bpf_attr {
   *             supports redirection to the egress interface, and accepts no
   *             flag at all.
   *
- *             The same effect can be attained with the more generic
- *             **bpf_redirect_map**\ (), which requires specific maps to be
- *             used but offers better performance.
+ *             The same effect can also be attained with the more generic
+ *             **bpf_redirect_map**\ (), which uses a BPF map to store the
+ *             redirect target instead of providing it directly to the helper.
   *     Return
   *             For XDP, the helper returns **XDP_REDIRECT** on success or
   *             **XDP_ABORTED** on error. For other program types, the values
@@ -1611,13 +1611,11 @@ union bpf_attr {
   *             the caller. Any higher bits in the *flags* argument must be
   *             unset.
   *
- *             When used to redirect packets to net devices, this helper
- *             provides a high performance increase over **bpf_redirect**\ ().
- *             This is due to various implementation details of the underlying
- *             mechanisms, one of which is the fact that **bpf_redirect_map**\
- *             () tries to send packet as a "bulk" to the device.
+ *             See also bpf_redirect(), which only supports redirecting to an
+ *             ifindex, but doesn't require a map to do so.
   *     Return
- *             **XDP_REDIRECT** on success, or **XDP_ABORTED** on error.
+ *             **XDP_REDIRECT** on success, or the value of the two lower bits
+ *             of the *flags* argument on error.
   *
   * int bpf_sk_redirect_map(struct sk_buff *skb, struct bpf_map *map, u32 key, u64 flags)
   *     Description
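Note on the bpf_redirect_map() hunk above: the revised Return text says that on error the helper yields the value of the two lower bits of the flags argument. The sketch below is a hypothetical model of that rule only, not the kernel implementation. The enumerator values mirror enum xdp_action from the UAPI header, but the names are prefixed with SK_ to mark them as local stand-ins, and redirect_map_result_sketch() with its lookup_ok parameter is invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins with the same numeric values as enum xdp_action. */
enum xdp_action_sketch {
	SK_XDP_ABORTED = 0,
	SK_XDP_DROP,
	SK_XDP_PASS,
	SK_XDP_TX,
	SK_XDP_REDIRECT,
};

/*
 * Hypothetical model of the documented return rule:
 * success yields XDP_REDIRECT, failure yields flags & 3.
 */
static inline int redirect_map_result_sketch(int lookup_ok, uint64_t flags)
{
	if (lookup_ok)
		return SK_XDP_REDIRECT;
	return (int)(flags & 0x3);
}
```

Under this reading a program that passes SK_XDP_PASS as flags would see the packet passed up the stack when the redirect fails, while flags of 0 keeps the previously documented XDP_ABORTED result.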
diff --git a/include/uapi/linux/netfilter/nf_conntrack_common.h b/include/uapi/linux/netfilter/nf_conntrack_common.h
index 336014bf8868c3f04b92cdbf31bdf2ccafc68a71..b6f0bb1dc7998e67add1a97a62b69a07f68147e2 100644
--- a/include/uapi/linux/netfilter/nf_conntrack_common.h
+++ b/include/uapi/linux/netfilter/nf_conntrack_common.h
@@ -97,6 +97,15 @@ enum ip_conntrack_status {
         IPS_UNTRACKED_BIT = 12,
         IPS_UNTRACKED = (1 << IPS_UNTRACKED_BIT),
  
+#ifdef __KERNEL__
+       /* Re-purposed for in-kernel use:
+        * Tags a conntrack entry that clashed with an existing entry
+        * on insert.
+        */
+       IPS_NAT_CLASH_BIT = IPS_UNTRACKED_BIT,
+       IPS_NAT_CLASH = IPS_UNTRACKED,
+#endif
+
         /* Conntrack got a helper explicitly attached via CT target. */
         IPS_HELPER_BIT = 13,
         IPS_HELPER = (1 << IPS_HELPER_BIT),
@@ -110,7 +119,8 @@ enum ip_conntrack_status {
          */
         IPS_UNCHANGEABLE_MASK = (IPS_NAT_DONE_MASK | IPS_NAT_MASK |
                                  IPS_EXPECTED | IPS_CONFIRMED | IPS_DYING |
-                                IPS_SEQ_ADJUST | IPS_TEMPLATE | IPS_OFFLOAD),
+                                IPS_SEQ_ADJUST | IPS_TEMPLATE | IPS_UNTRACKED |
+                                IPS_OFFLOAD),
  
         __IPS_MAX_BIT = 15,
  };
diff --git a/include/uapi/linux/swab.h b/include/uapi/linux/swab.h
index fa7f97da5b7687125c5ca5eb1527419639a1b8e8..7272f85d6d6ab5e24dd806e7f88760eaafcd7326 100644
--- a/include/uapi/linux/swab.h
+++ b/include/uapi/linux/swab.h
@@ -135,9 +135,9 @@ static inline __attribute_const__ __u32 __fswahb32(__u32 val)
  
  static __always_inline unsigned long __swab(const unsigned long y)
  {
-#if BITS_PER_LONG == 64
+#if __BITS_PER_LONG == 64
         return __swab64(y);
-#else /* BITS_PER_LONG == 32 */
+#else /* __BITS_PER_LONG == 32 */
         return __swab32(y);
  #endif
  }
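Note on the swab.h hunk above: in a #if expression, an identifier that is not defined as a macro evaluates to 0, so a comparison against an undefined name silently selects the #else branch. The hunk switches the test to __BITS_PER_LONG. The sketch below assumes (the diff does not say so) that the unprefixed name may be undefined where this UAPI header is included, and demonstrates the fall-through with a deliberately undefined, hypothetical macro name. The swab helpers are plain C stand-ins, and the width of unsigned long is taken from ULONG_MAX instead of any kernel macro.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Plain C byte swaps standing in for __swab32()/__swab64(). */
static inline uint32_t swab32_sketch(uint32_t x)
{
	return ((x & 0x000000ffu) << 24) | ((x & 0x0000ff00u) << 8) |
	       ((x >> 8) & 0x0000ff00u) | (x >> 24);
}

static inline uint64_t swab64_sketch(uint64_t x)
{
	return ((uint64_t)swab32_sketch((uint32_t)x) << 32) |
	       swab32_sketch((uint32_t)(x >> 32));
}

/* Pick the swap width from the real width of unsigned long. */
static inline unsigned long swab_long_sketch(unsigned long y)
{
#if ULONG_MAX > 0xffffffffUL
	return (unsigned long)swab64_sketch(y);
#else
	return (unsigned long)swab32_sketch((uint32_t)y);
#endif
}

/* SKETCH_UNDEFINED_BITS is never defined, so the #if sees 0 == 64. */
static inline int branch_taken_for_undefined_macro(void)
{
#if SKETCH_UNDEFINED_BITS == 64
	return 64;
#else
	return 32;
#endif
}
```

branch_taken_for_undefined_macro() reports the 32-bit branch on every platform, including those where unsigned long is 64 bits wide, which is the failure mode a width test on an undefined macro would produce.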
diff --git a/include/uapi/linux/time.h b/include/uapi/linux/time.h
index a655aa28dc6efcbf0da9a845bc1cbd09e47545e3..4f4b6e48e01c426560483c6c4277228334cbfddf 100644
--- a/include/uapi/linux/time.h
+++ b/include/uapi/linux/time.h
@@ -5,6 +5,7 @@
  #include <linux/types.h>
  #include <linux/time_types.h>
  
+#ifndef __KERNEL__
  #ifndef _STRUCT_TIMESPEC
  #define _STRUCT_TIMESPEC
  struct timespec {
@@ -18,6 +19,17 @@ struct timeval {
         __kernel_suseconds_t    tv_usec;        /* microseconds */
  };
  
+struct itimerspec {
+       struct timespec it_interval;/* timer period */
+       struct timespec it_value;       /* timer expiration */
+};
+
+struct itimerval {
+       struct timeval it_interval;/* timer interval */
+       struct timeval it_value;        /* current value */
+};
+#endif
+
  struct timezone {
         int     tz_minuteswest; /* minutes west of Greenwich */
         int     tz_dsttime;     /* type of dst correction */
@@ -31,16 +43,6 @@ struct timezone {
  #define        ITIMER_VIRTUAL          1
  #define        ITIMER_PROF             2
  
-struct itimerspec {
-       struct timespec it_interval;    /* timer period */
-       struct timespec it_value;       /* timer expiration */
-};
-
-struct itimerval {
-       struct timeval it_interval;     /* timer interval */
-       struct timeval it_value;        /* current value */
-};
-
  /*
   * The IDs of the various system clocks (for POSIX.1b interval timers):
   */
diff --git a/include/uapi/linux/usb/charger.h b/include/uapi/linux/usb/charger.h
index 5f72af35b3ed768fd3bc1e2d8ca3f4f6ecaea47f..ad22079125bff2ba34f03b74e6cfd5e2eb460479 100644
--- a/include/uapi/linux/usb/charger.h
+++ b/include/uapi/linux/usb/charger.h
@@ -14,18 +14,18 @@
   * ACA (Accessory Charger Adapters)
   */
  enum usb_charger_type {
-       UNKNOWN_TYPE,
-       SDP_TYPE,
-       DCP_TYPE,
-       CDP_TYPE,
-       ACA_TYPE,
+       UNKNOWN_TYPE = 0,
+       SDP_TYPE = 1,
+       DCP_TYPE = 2,
+       CDP_TYPE = 3,
+       ACA_TYPE = 4,
  };
  
  /* USB charger state */
  enum usb_charger_state {
-       USB_CHARGER_DEFAULT,
-       USB_CHARGER_PRESENT,
-       USB_CHARGER_ABSENT,
+       USB_CHARGER_DEFAULT = 0,
+       USB_CHARGER_PRESENT = 1,
+       USB_CHARGER_ABSENT = 2,
  };
  
  #endif /* _UAPI__LINUX_USB_CHARGER_H */
diff --git a/init/Kconfig b/init/Kconfig
index cfee56c151f14fe390a5d4780b840313526c04bb..20a6ac33761c98a6bb7466d5ec1a5e476b1eddda 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1226,14 +1226,12 @@ endif
  
  config BOOT_CONFIG
         bool "Boot config support"
-       depends on BLK_DEV_INITRD
-       select LIBXBC
-       default y
+       select BLK_DEV_INITRD
         help
           Extra boot config allows system admin to pass a config file as
           complemental extension of kernel cmdline when booting.
           The boot config file must be attached at the end of initramfs
-         with checksum and size.
+         with checksum, size and magic word.
           See <file:Documentation/admin-guide/bootconfig.rst> for details.
  
           If unsure, say Y.
diff --git a/init/main.c b/init/main.c
index cc0ee4873419ce36558a3ec8d2561f78d6407e1c..ee4947af823f3bb81ce5502781d24c4de6d3eb7f 100644
--- a/init/main.c
+++ b/init/main.c
@@ -142,6 +142,15 @@ static char *extra_command_line;
  /* Extra init arguments */
  static char *extra_init_args;
  
+#ifdef CONFIG_BOOT_CONFIG
+/* Is bootconfig on command line? */
+static bool bootconfig_found;
+static bool initargs_found;
+#else
+# define bootconfig_found false
+# define initargs_found false
+#endif
+
  static char *execute_command;
  static char *ramdisk_execute_command;
  
@@ -259,7 +268,6 @@ static int __init xbc_snprint_cmdline(char *buf, size_t size,
  {
         struct xbc_node *knode, *vnode;
         char *end = buf + size;
-       char c = '\"';
         const char *val;
         int ret;
  
@@ -270,25 +278,20 @@ static int __init xbc_snprint_cmdline(char *buf, size_t size,
                         return ret;
  
                 vnode = xbc_node_get_child(knode);
-               ret = snprintf(buf, rest(buf, end), "%s%c", xbc_namebuf,
-                               vnode ? '=' : ' ');
-               if (ret < 0)
-                       return ret;
-               buf += ret;
-               if (!vnode)
+               if (!vnode) {
+                       ret = snprintf(buf, rest(buf, end), "%s ", xbc_namebuf);
+                       if (ret < 0)
+                               return ret;
+                       buf += ret;
                         continue;
-
-               c = '\"';
+               }
                 xbc_array_for_each_value(vnode, val) {
-                       ret = snprintf(buf, rest(buf, end), "%c%s", c, val);
+                       ret = snprintf(buf, rest(buf, end), "%s=\"%s\" ",
+                                      xbc_namebuf, val);
                         if (ret < 0)
                                 return ret;
                         buf += ret;
-                       c = ',';
                 }
-               if (rest(buf, end) > 2)
-                       strcpy(buf, "\" ");
-               buf += 2;
         }
  
         return buf - (end - size);
@@ -326,7 +329,7 @@ static char * __init xbc_make_cmdline(const char *key)
         return new_cmdline;
  }
  
-u32 boot_config_checksum(unsigned char *p, u32 size)
+static u32 boot_config_checksum(unsigned char *p, u32 size)
  {
         u32 ret = 0;
  
@@ -336,23 +339,40 @@ u32 boot_config_checksum(unsigned char *p, u32 size)
         return ret;
  }
  
+static int __init bootconfig_params(char *param, char *val,
+                                   const char *unused, void *arg)
+{
+       if (strcmp(param, "bootconfig") == 0) {
+               bootconfig_found = true;
+       } else if (strcmp(param, "--") == 0) {
+               initargs_found = true;
+       }
+       return 0;
+}
+
  static void __init setup_boot_config(const char *cmdline)
  {
+       static char tmp_cmdline[COMMAND_LINE_SIZE] __initdata;
         u32 size, csum;
         char *data, *copy;
-       const char *p;
         u32 *hdr;
         int ret;
  
-       p = strstr(cmdline, "bootconfig");
-       if (!p || (p != cmdline && !isspace(*(p-1))) ||
-           (p[10] && !isspace(p[10])))
+       strlcpy(tmp_cmdline, boot_command_line, COMMAND_LINE_SIZE);
+       parse_args("bootconfig", tmp_cmdline, NULL, 0, 0, 0, NULL,
+                  bootconfig_params);
+
+       if (!bootconfig_found)
                 return;
  
         if (!initrd_end)
                 goto not_found;
  
-       hdr = (u32 *)(initrd_end - 8);
+       data = (char *)initrd_end - BOOTCONFIG_MAGIC_LEN;
+       if (memcmp(data, BOOTCONFIG_MAGIC, BOOTCONFIG_MAGIC_LEN))
+               goto not_found;
+
+       hdr = (u32 *)(data - 8);
         size = hdr[0];
         csum = hdr[1];
  
@@ -396,6 +416,14 @@ not_found:
  }
  #else
  #define setup_boot_config(cmdline)     do { } while (0)
+
+static int __init warn_bootconfig(char *str)
+{
+       pr_warn("WARNING: 'bootconfig' found on the kernel command line but CONFIG_BOOT_CONFIG is not set.\n");
+       return 0;
+}
+early_param("bootconfig", warn_bootconfig);
+
  #endif
  
  /* Change NUL term back to "=", to make "param" the whole string. */
@@ -562,11 +590,12 @@ static void __init setup_command_line(char *command_line)
                  * to init.
                  */
                 len = strlen(saved_command_line);
-               if (!strstr(boot_command_line, " -- ")) {
+               if (initargs_found) {
+                       saved_command_line[len++] = ' ';
+               } else {
                         strcpy(saved_command_line + len, " -- ");
                         len += 4;
-               } else
-                       saved_command_line[len++] = ' ';
+               }
  
                 strcpy(saved_command_line + len, extra_init_args);
         }
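
A minimal userspace sketch of the footer check that the setup_boot_config() hunk above performs, assuming the layout [payload][u32 size][u32 checksum][magic] at the end of the initrd. The magic string and its length are assumptions (the diff shows only the macro names BOOTCONFIG_MAGIC and BOOTCONFIG_MAGIC_LEN), the byte-wise sum is assumed to mirror boot_config_checksum(), and byte_sum() and find_bootconfig() are hypothetical helper names, not kernel functions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed values; the diff above only shows the macro names. */
#define MAGIC     "#BOOTCONFIG\n"
#define MAGIC_LEN 12

/* Byte-wise additive checksum (assumed to mirror boot_config_checksum()). */
static uint32_t byte_sum(const unsigned char *p, uint32_t size)
{
	uint32_t ret = 0;

	while (size--)
		ret += *p++;
	return ret;
}

/*
 * Locate the config payload at the tail of buf[0..len).
 * Returns a pointer to the payload and stores its size in *size_out,
 * or returns NULL if the magic, size, or checksum does not match.
 */
static const unsigned char *find_bootconfig(const unsigned char *buf,
					    size_t len, uint32_t *size_out)
{
	const unsigned char *magic, *hdr;
	uint32_t size, csum;

	if (len < MAGIC_LEN + 8)
		return NULL;

	/* The magic word is the very last thing in the image. */
	magic = buf + len - MAGIC_LEN;
	if (memcmp(magic, MAGIC, MAGIC_LEN))
		return NULL;

	/* Size and checksum sit in the 8 bytes just before the magic. */
	hdr = magic - 8;
	memcpy(&size, hdr, 4);
	memcpy(&csum, hdr + 4, 4);

	if (size > (size_t)(hdr - buf))
		return NULL;
	if (byte_sum(hdr - size, size) != csum)
		return NULL;

	*size_out = size;
	return hdr - size;
}
```

The sketch reads the two u32 fields with memcpy() in native byte order to avoid unaligned access; whether the kernel treats them as native or fixed endian is not visible in this hunk.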
diff --git a/ipc/sem.c b/ipc/sem.c
index 4f4303f320776cfb0b7f8de192d9f43e72a34087..3687b71151b3921860613dcc7089fa2831f6237e 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -2384,11 +2384,9 @@ void exit_sem(struct task_struct *tsk)
                 ipc_assert_locked_object(&sma->sem_perm);
                 list_del(&un->list_id);
  
-               /* we are the last process using this ulp, acquiring ulp->lock
-                * isn't required. Besides that, we are also protected against
-                * IPC_RMID as we hold sma->sem_perm lock now
-                */
+               spin_lock(&ulp->lock);
                 list_del_rcu(&un->list_proc);
+               spin_unlock(&ulp->lock);
  
                 /* perform adjustments registered in un */
                 for (i = 0; i < sma->sem_nsems; i++) {
diff --git a/kernel/audit.c b/kernel/audit.c
index 17b0d523afb35cea50ee0ea8c5063c08b4cd209e..9ddfe2aa6671ff3d6558ee86c97cb64c5a9b3189 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -1101,13 +1101,11 @@ static void audit_log_feature_change(int which, u32 old_feature, u32 new_feature
         audit_log_end(ab);
  }
  
-static int audit_set_feature(struct sk_buff *skb)
+static int audit_set_feature(struct audit_features *uaf)
  {
-       struct audit_features *uaf;
         int i;
  
         BUILD_BUG_ON(AUDIT_LAST_FEATURE + 1 > ARRAY_SIZE(audit_feature_names));
-       uaf = nlmsg_data(nlmsg_hdr(skb));
  
         /* if there is ever a version 2 we should handle that here */
  
@@ -1175,6 +1173,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
  {
         u32                     seq;
         void                    *data;
+       int                     data_len;
         int                     err;
         struct audit_buffer     *ab;
         u16                     msg_type = nlh->nlmsg_type;
@@ -1188,6 +1187,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
  
         seq  = nlh->nlmsg_seq;
         data = nlmsg_data(nlh);
+       data_len = nlmsg_len(nlh);
  
         switch (msg_type) {
         case AUDIT_GET: {
@@ -1211,7 +1211,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
                 struct audit_status     s;
                 memset(&s, 0, sizeof(s));
                 /* guard against past and future API changes */
-               memcpy(&s, data, min_t(size_t, sizeof(s), nlmsg_len(nlh)));
+               memcpy(&s, data, min_t(size_t, sizeof(s), data_len));
                 if (s.mask & AUDIT_STATUS_ENABLED) {
                         err = audit_set_enabled(s.enabled);
                         if (err < 0)
@@ -1315,7 +1315,9 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
                         return err;
                 break;
         case AUDIT_SET_FEATURE:
-               err = audit_set_feature(skb);
+               if (data_len < sizeof(struct audit_features))
+                       return -EINVAL;
+               err = audit_set_feature(data);
                 if (err)
                         return err;
                 break;
@@ -1327,6 +1329,8 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
  
                 err = audit_filter(msg_type, AUDIT_FILTER_USER);
                 if (err == 1) { /* match or error */
+                       char *str = data;
+
                         err = 0;
                         if (msg_type == AUDIT_USER_TTY) {
                                 err = tty_audit_push();
@@ -1334,26 +1338,24 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
                                         break;
                         }
                         audit_log_user_recv_msg(&ab, msg_type);
-                       if (msg_type != AUDIT_USER_TTY)
+                       if (msg_type != AUDIT_USER_TTY) {
+                               /* ensure NUL termination */
+                               str[data_len - 1] = '\0';
                                 audit_log_format(ab, " msg='%.*s'",
                                                  AUDIT_MESSAGE_TEXT_MAX,
-                                                (char *)data);
-                       else {
-                               int size;
-
+                                                str);
+                       } else {
                                 audit_log_format(ab, " data=");
-                               size = nlmsg_len(nlh);
-                               if (size > 0 &&
-                                   ((unsigned char *)data)[size - 1] == '\0')
-                                       size--;
-                               audit_log_n_untrustedstring(ab, data, size);
+                               if (data_len > 0 && str[data_len - 1] == '\0')
+                                       data_len--;
+                               audit_log_n_untrustedstring(ab, str, data_len);
                         }
                         audit_log_end(ab);
                 }
                 break;
         case AUDIT_ADD_RULE:
         case AUDIT_DEL_RULE:
-               if (nlmsg_len(nlh) < sizeof(struct audit_rule_data))
+               if (data_len < sizeof(struct audit_rule_data))
                         return -EINVAL;
                 if (audit_enabled == AUDIT_LOCKED) {
                         audit_log_common_recv_msg(audit_context(), &ab,
@@ -1365,7 +1367,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
                         audit_log_end(ab);
                         return -EPERM;
                 }
-               err = audit_rule_change(msg_type, seq, data, nlmsg_len(nlh));
+               err = audit_rule_change(msg_type, seq, data, data_len);
                 break;
         case AUDIT_LIST_RULES:
                 err = audit_list_rules_send(skb, seq);
@@ -1380,7 +1382,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
         case AUDIT_MAKE_EQUIV: {
                 void *bufp = data;
                 u32 sizes[2];
-               size_t msglen = nlmsg_len(nlh);
+               size_t msglen = data_len;
                 char *old, *new;
  
                 err = -EINVAL;
@@ -1456,7 +1458,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
  
                 memset(&s, 0, sizeof(s));
                 /* guard against past and future API changes */
-               memcpy(&s, data, min_t(size_t, sizeof(s), nlmsg_len(nlh)));
+               memcpy(&s, data, min_t(size_t, sizeof(s), data_len));
                 /* check if new data is valid */
                 if ((s.enabled != 0 && s.enabled != 1) ||
                     (s.log_passwd != 0 && s.log_passwd != 1))
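
The "guard against past and future API changes" pattern used in the audit_receive_msg() hunks above can be sketched in userspace as follows: zero the destination structure, then copy no more than the smaller of the structure size and the received payload length. The struct layout and the name load_status() are hypothetical and only illustrate the idea.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical status record; "extra" stands for a field added later. */
struct status {
	uint32_t mask;
	uint32_t enabled;
	uint32_t extra;
};

/*
 * Copy a variable-length payload into a fixed-size struct.
 * A shorter (older) payload leaves the trailing fields zeroed;
 * a longer (newer) payload is truncated to what this side knows.
 */
static void load_status(struct status *s, const void *data, size_t data_len)
{
	size_t n = data_len < sizeof(*s) ? data_len : sizeof(*s);

	memset(s, 0, sizeof(*s));
	memcpy(s, data, n);
}
```

Caching the payload length once, as the patch does with data_len, keeps every later bounds check working from the same value.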
diff --git a/kernel/auditfilter.c b/kernel/auditfilter.c
index b0126e9c0743e8d8adf07bc12404c852b2eab29b..026e34da4ace994e75408990b564d52e25352eea 100644
--- a/kernel/auditfilter.c
+++ b/kernel/auditfilter.c
@@ -456,6 +456,7 @@ static struct audit_entry *audit_data_to_entry(struct audit_rule_data *data,
         bufp = data->buf;
         for (i = 0; i < data->field_count; i++) {
                 struct audit_field *f = &entry->rule.fields[i];
+               u32 f_val;
  
                 err = -EINVAL;
  
@@ -464,12 +465,12 @@ static struct audit_entry *audit_data_to_entry(struct audit_rule_data *data,
                         goto exit_free;
  
                 f->type = data->fields[i];
-               f->val = data->values[i];
+               f_val = data->values[i];
  
                 /* Support legacy tests for a valid loginuid */
-               if ((f->type == AUDIT_LOGINUID) && (f->val == AUDIT_UID_UNSET)) {
+               if ((f->type == AUDIT_LOGINUID) && (f_val == AUDIT_UID_UNSET)) {
                         f->type = AUDIT_LOGINUID_SET;
-                       f->val = 0;
+                       f_val = 0;
                         entry->rule.pflags |= AUDIT_LOGINUID_LEGACY;
                 }
  
@@ -485,7 +486,7 @@ static struct audit_entry *audit_data_to_entry(struct audit_rule_data *data,
                 case AUDIT_SUID:
                 case AUDIT_FSUID:
                 case AUDIT_OBJ_UID:
-                       f->uid = make_kuid(current_user_ns(), f->val);
+                       f->uid = make_kuid(current_user_ns(), f_val);
                         if (!uid_valid(f->uid))
                                 goto exit_free;
                         break;
@@ -494,11 +495,12 @@ static struct audit_entry *audit_data_to_entry(struct audit_rule_data *data,
                 case AUDIT_SGID:
                 case AUDIT_FSGID:
                 case AUDIT_OBJ_GID:
-                       f->gid = make_kgid(current_user_ns(), f->val);
+                       f->gid = make_kgid(current_user_ns(), f_val);
                         if (!gid_valid(f->gid))
                                 goto exit_free;
                         break;
                 case AUDIT_ARCH:
+                       f->val = f_val;
                         entry->rule.arch_f = f;
                         break;
                 case AUDIT_SUBJ_USER:
@@ -511,11 +513,13 @@ static struct audit_entry *audit_data_to_entry(struct audit_rule_data *data,
                 case AUDIT_OBJ_TYPE:
                 case AUDIT_OBJ_LEV_LOW:
                 case AUDIT_OBJ_LEV_HIGH:
-                       str = audit_unpack_string(&bufp, &remain, f->val);
-                       if (IS_ERR(str))
+                       str = audit_unpack_string(&bufp, &remain, f_val);
+                       if (IS_ERR(str)) {
+                               err = PTR_ERR(str);
                                 goto exit_free;
-                       entry->rule.buflen += f->val;
-
+                       }
+                       entry->rule.buflen += f_val;
+                       f->lsm_str = str;
                         err = security_audit_rule_init(f->type, f->op, str,
                                                        (void **)&f->lsm_rule);
                         /* Keep currently invalid fields around in case they
@@ -524,68 +528,71 @@ static struct audit_entry *audit_data_to_entry(struct audit_rule_data *data,
                                 pr_warn("audit rule for LSM \'%s\' is invalid\n",
                                         str);
                                 err = 0;
-                       }
-                       if (err) {
-                               kfree(str);
+                       } else if (err)
                                 goto exit_free;
-                       } else
-                               f->lsm_str = str;
                         break;
                 case AUDIT_WATCH:
-                       str = audit_unpack_string(&bufp, &remain, f->val);
-                       if (IS_ERR(str))
+                       str = audit_unpack_string(&bufp, &remain, f_val);
+                       if (IS_ERR(str)) {
+                               err = PTR_ERR(str);
                                 goto exit_free;
-                       entry->rule.buflen += f->val;
-
-                       err = audit_to_watch(&entry->rule, str, f->val, f->op);
+                       }
+                       err = audit_to_watch(&entry->rule, str, f_val, f->op);
                         if (err) {
                                 kfree(str);
                                 goto exit_free;
                         }
+                       entry->rule.buflen += f_val;
                         break;
                 case AUDIT_DIR:
-                       str = audit_unpack_string(&bufp, &remain, f->val);
-                       if (IS_ERR(str))
+                       str = audit_unpack_string(&bufp, &remain, f_val);
+                       if (IS_ERR(str)) {
+                               err = PTR_ERR(str);
                                 goto exit_free;
-                       entry->rule.buflen += f->val;
-
+                       }
                         err = audit_make_tree(&entry->rule, str, f->op);
                         kfree(str);
                         if (err)
                                 goto exit_free;
+                       entry->rule.buflen += f_val;
                         break;
                 case AUDIT_INODE:
+                       f->val = f_val;
                         err = audit_to_inode(&entry->rule, f);
                         if (err)
                                 goto exit_free;
                         break;
                 case AUDIT_FILTERKEY:
-                       if (entry->rule.filterkey || f->val > AUDIT_MAX_KEY_LEN)
+                       if (entry->rule.filterkey || f_val > AUDIT_MAX_KEY_LEN)
                                 goto exit_free;
-                       str = audit_unpack_string(&bufp, &remain, f->val);
-                       if (IS_ERR(str))
+                       str = audit_unpack_string(&bufp, &remain, f_val);
+                       if (IS_ERR(str)) {
+                               err = PTR_ERR(str);
                                 goto exit_free;
-                       entry->rule.buflen += f->val;
+                       }
+                       entry->rule.buflen += f_val;
                         entry->rule.filterkey = str;
                         break;
                 case AUDIT_EXE:
-                       if (entry->rule.exe || f->val > PATH_MAX)
+                       if (entry->rule.exe || f_val > PATH_MAX)
                                 goto exit_free;
-                       str = audit_unpack_string(&bufp, &remain, f->val);
+                       str = audit_unpack_string(&bufp, &remain, f_val);
                         if (IS_ERR(str)) {
                                 err = PTR_ERR(str);
                                 goto exit_free;
                         }
-                       entry->rule.buflen += f->val;
-
-                       audit_mark = audit_alloc_mark(&entry->rule, str, f->val);
+                       audit_mark = audit_alloc_mark(&entry->rule, str, f_val);
                         if (IS_ERR(audit_mark)) {
                                 kfree(str);
                                 err = PTR_ERR(audit_mark);
                                 goto exit_free;
                         }
+                       entry->rule.buflen += f_val;
                         entry->rule.exe = audit_mark;
                         break;
+               default:
+                       f->val = f_val;
+                       break;
                 }
         }
  
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 805c43b083e956940b990cede2b3cad0f0e3692a..787140095e58d5d0882a50b32ad30d64dcb89a75 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -4142,9 +4142,9 @@ int btf_distill_func_proto(struct bpf_verifier_log *log,
   * EFAULT - verifier bug
   * 0 - 99% match. The last 1% is validated by the verifier.
   */
-int btf_check_func_type_match(struct bpf_verifier_log *log,
-                             struct btf *btf1, const struct btf_type *t1,
-                             struct btf *btf2, const struct btf_type *t2)
+static int btf_check_func_type_match(struct bpf_verifier_log *log,
+                                    struct btf *btf1, const struct btf_type *t1,
+                                    struct btf *btf2, const struct btf_type *t2)
  {
         const struct btf_param *args1, *args2;
         const char *fn1, *fn2, *s1, *s2;
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 2d182c4ee9d9964a6ec55ea102a5de9c7b6fc811..a1468e3f5af24e33acbfaa392db9770cea249668 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -56,6 +56,7 @@ struct htab_elem {
                         union {
                                 struct bpf_htab *htab;
                                 struct pcpu_freelist_node fnode;
+                               struct htab_elem *batch_flink;
                         };
                 };
         };
@@ -126,6 +127,17 @@ free_elems:
         bpf_map_area_free(htab->elems);
  }
  
+/* The LRU list has a lock (lru_lock). Each htab bucket has a lock
+ * (bucket_lock). If both locks need to be acquired together, the lock
+ * order is always lru_lock -> bucket_lock and this only happens in
+ * bpf_lru_list.c logic. For example, certain code paths of
+ * bpf_lru_pop_free(), which is called by function prealloc_lru_pop(),
+ * will acquire lru_lock first followed by acquiring bucket_lock.
+ *
+ * In hashtab.c, to avoid deadlock, lock acquisition of
+ * bucket_lock followed by lru_lock is not allowed. In such cases,
+ * bucket_lock needs to be released first before acquiring lru_lock.
+ */
  static struct htab_elem *prealloc_lru_pop(struct bpf_htab *htab, void *key,
                                           u32 hash)
  {
@@ -1256,10 +1268,12 @@ __htab_map_lookup_and_delete_batch(struct bpf_map *map,
         void __user *ukeys = u64_to_user_ptr(attr->batch.keys);
         void *ubatch = u64_to_user_ptr(attr->batch.in_batch);
         u32 batch, max_count, size, bucket_size;
+       struct htab_elem *node_to_free = NULL;
         u64 elem_map_flags, map_flags;
         struct hlist_nulls_head *head;
         struct hlist_nulls_node *n;
-       unsigned long flags;
+       unsigned long flags = 0;
+       bool locked = false;
         struct htab_elem *l;
         struct bucket *b;
         int ret = 0;
@@ -1319,15 +1333,25 @@ again_nocopy:
         dst_val = values;
         b = &htab->buckets[batch];
         head = &b->head;
-       raw_spin_lock_irqsave(&b->lock, flags);
+       /* Do not grab the lock unless we need it (bucket_cnt > 0). */
+       if (locked)
+               raw_spin_lock_irqsave(&b->lock, flags);
  
         bucket_cnt = 0;
         hlist_nulls_for_each_entry_rcu(l, n, head, hash_node)
                 bucket_cnt++;
  
+       if (bucket_cnt && !locked) {
+               locked = true;
+               goto again_nocopy;
+       }
+
         if (bucket_cnt > (max_count - total)) {
                 if (total == 0)
                         ret = -ENOSPC;
+               /* Note that since bucket_cnt > 0 here, it is implicit
+                * that the lock was grabbed, so release it.
+                */
                 raw_spin_unlock_irqrestore(&b->lock, flags);
                 rcu_read_unlock();
                 this_cpu_dec(bpf_prog_active);
@@ -1337,6 +1361,9 @@ again_nocopy:
  
         if (bucket_cnt > bucket_size) {
                 bucket_size = bucket_cnt;
+               /* Note that since bucket_cnt > 0 here, it is implicit
+                * that the lock was grabbed, so release it.
+                */
                 raw_spin_unlock_irqrestore(&b->lock, flags);
                 rcu_read_unlock();
                 this_cpu_dec(bpf_prog_active);
@@ -1346,6 +1373,10 @@ again_nocopy:
                 goto alloc;
         }
  
+       /* Next block is only safe to run if you have grabbed the lock */
+       if (!locked)
+               goto next_batch;
+
         hlist_nulls_for_each_entry_safe(l, n, head, hash_node) {
                 memcpy(dst_key, l->key, key_size);
  
@@ -1370,16 +1401,33 @@ again_nocopy:
                 }
                 if (do_delete) {
                         hlist_nulls_del_rcu(&l->hash_node);
-                       if (is_lru_map)
-                               bpf_lru_push_free(&htab->lru, &l->lru_node);
-                       else
+
+                       /* bpf_lru_push_free() will acquire lru_lock, which
+                        * may cause deadlock. See comments in function
+                        * prealloc_lru_pop(). Let us do bpf_lru_push_free()
+                        * after releasing the bucket lock.
+                        */
+                       if (is_lru_map) {
+                               l->batch_flink = node_to_free;
+                               node_to_free = l;
+                       } else {
                                 free_htab_elem(htab, l);
+                       }
                 }
                 dst_key += key_size;
                 dst_val += value_size;
         }
  
         raw_spin_unlock_irqrestore(&b->lock, flags);
+       locked = false;
+
+       while (node_to_free) {
+               l = node_to_free;
+               node_to_free = node_to_free->batch_flink;
+               bpf_lru_push_free(&htab->lru, &l->lru_node);
+       }
+
+next_batch:
         /* If we are not copying data, we can go to next bucket and avoid
          * unlocking the rcu.
          */
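
The deferred-free scheme introduced in the hashtab.c hunks above can be illustrated with a small userspace sketch: elements unlinked while the bucket lock is held are chained through an intrusive link, and the LRU push happens only after the bucket lock is released. The boolean "lock", the counters, and the names lru_push_free() and drain_bucket() are stand-ins chosen for illustration; the real code uses raw spinlocks and bpf_lru_push_free().

```c
#include <stdbool.h>
#include <stddef.h>

struct elem {
	struct elem *next;        /* bucket chain */
	struct elem *batch_flink; /* deferred-free chain, like the new union member */
};

static bool bucket_locked;    /* stand-in for the per-bucket spinlock */
static int lru_freed;         /* elements handed back to the LRU */
static int order_violations;  /* times lru_lock was taken under bucket_lock */

/* Stand-in for bpf_lru_push_free(), which takes lru_lock internally. */
static void lru_push_free(struct elem *e)
{
	(void)e;
	if (bucket_locked)
		order_violations++;
	lru_freed++;
}

/* Unlink every element of a bucket, deferring the LRU work past the unlock. */
static int drain_bucket(struct elem **head)
{
	struct elem *node_to_free = NULL, *l;
	int n = 0;

	bucket_locked = true;
	while ((l = *head) != NULL) {
		*head = l->next;
		/* Only remember the element; do not touch the LRU yet. */
		l->batch_flink = node_to_free;
		node_to_free = l;
		n++;
	}
	bucket_locked = false;

	/* Safe now: bucket_lock is no longer held when lru_lock is taken. */
	while (node_to_free) {
		l = node_to_free;
		node_to_free = node_to_free->batch_flink;
		lru_push_free(l);
	}
	return n;
}
```

Reusing a link field inside the element avoids allocating a temporary list while the lock is held, which is presumably why the patch adds batch_flink to an existing union.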
diff --git a/kernel/bpf/offload.c b/kernel/bpf/offload.c
index 2c5dc6541eceed031acd4b0953d29b8f2a48870a..bd09290e364844a35a3c80bc56a63037356c9966 100644
--- a/kernel/bpf/offload.c
+++ b/kernel/bpf/offload.c
@@ -321,7 +321,7 @@ int bpf_prog_offload_info_fill(struct bpf_prog_info *info,
  
         ulen = info->jited_prog_len;
         info->jited_prog_len = aux->offload->jited_len;
-       if (info->jited_prog_len & ulen) {
+       if (info->jited_prog_len && ulen) {
                 uinsns = u64_to_user_ptr(info->jited_prog_insns);
                 ulen = min_t(u32, info->jited_prog_len, ulen);
                 if (copy_to_user(uinsns, aux->offload->jited_image, ulen)) {
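
Why the one-character change from `&` to `&&` in the offload.c hunk above matters can be shown with two lengths that are both non-zero but share no set bits. The function names below are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Old test: bitwise AND, true only if the two lengths share a set bit. */
static bool should_copy_bitwise(uint32_t jited_len, uint32_t ulen)
{
	return (jited_len & ulen) != 0;
}

/* New test: logical AND, true whenever both lengths are non-zero. */
static bool should_copy_logical(uint32_t jited_len, uint32_t ulen)
{
	return jited_len && ulen;
}
```

With a JITed image of 4 bytes and a user buffer of 8 bytes, the bitwise form evaluates 4 & 8 = 0 and skips the copy, while the logical form correctly proceeds.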
diff --git a/kernel/compat.c b/kernel/compat.c
index 95005f849c68f372e0e9b92d7926ea711d5600fa..843dd17e6078b6d530ecc46260703b390f9abbe4 100644
--- a/kernel/compat.c
+++ b/kernel/compat.c
@@ -26,70 +26,6 @@
  
  #include <linux/uaccess.h>
  
-static int __compat_get_timeval(struct timeval *tv, const struct old_timeval32 __user *ctv)
-{
-       return (!access_ok(ctv, sizeof(*ctv)) ||
-                       __get_user(tv->tv_sec, &ctv->tv_sec) ||
-                       __get_user(tv->tv_usec, &ctv->tv_usec)) ? -EFAULT : 0;
-}
-
-static int __compat_put_timeval(const struct timeval *tv, struct old_timeval32 __user *ctv)
-{
-       return (!access_ok(ctv, sizeof(*ctv)) ||
-                       __put_user(tv->tv_sec, &ctv->tv_sec) ||
-                       __put_user(tv->tv_usec, &ctv->tv_usec)) ? -EFAULT : 0;
-}
-
-static int __compat_get_timespec(struct timespec *ts, const struct old_timespec32 __user *cts)
-{
-       return (!access_ok(cts, sizeof(*cts)) ||
-                       __get_user(ts->tv_sec, &cts->tv_sec) ||
-                       __get_user(ts->tv_nsec, &cts->tv_nsec)) ? -EFAULT : 0;
-}
-
-static int __compat_put_timespec(const struct timespec *ts, struct old_timespec32 __user *cts)
-{
-       return (!access_ok(cts, sizeof(*cts)) ||
-                       __put_user(ts->tv_sec, &cts->tv_sec) ||
-                       __put_user(ts->tv_nsec, &cts->tv_nsec)) ? -EFAULT : 0;
-}
-
-int compat_get_timeval(struct timeval *tv, const void __user *utv)
-{
-       if (COMPAT_USE_64BIT_TIME)
-               return copy_from_user(tv, utv, sizeof(*tv)) ? -EFAULT : 0;
-       else
-               return __compat_get_timeval(tv, utv);
-}
-EXPORT_SYMBOL_GPL(compat_get_timeval);
-
-int compat_put_timeval(const struct timeval *tv, void __user *utv)
-{
-       if (COMPAT_USE_64BIT_TIME)
-               return copy_to_user(utv, tv, sizeof(*tv)) ? -EFAULT : 0;
-       else
-               return __compat_put_timeval(tv, utv);
-}
-EXPORT_SYMBOL_GPL(compat_put_timeval);
-
-int compat_get_timespec(struct timespec *ts, const void __user *uts)
-{
-       if (COMPAT_USE_64BIT_TIME)
-               return copy_from_user(ts, uts, sizeof(*ts)) ? -EFAULT : 0;
-       else
-               return __compat_get_timespec(ts, uts);
-}
-EXPORT_SYMBOL_GPL(compat_get_timespec);
-
-int compat_put_timespec(const struct timespec *ts, void __user *uts)
-{
-       if (COMPAT_USE_64BIT_TIME)
-               return copy_to_user(uts, ts, sizeof(*ts)) ? -EFAULT : 0;
-       else
-               return __compat_put_timespec(ts, uts);
-}
-EXPORT_SYMBOL_GPL(compat_put_timespec);
-
  #ifdef __ARCH_WANT_SYS_SIGPROCMASK
  
  /*
diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
index daa4e6eefdde86e898668b3c9ebdb55b2c6b59f3..8bc6f2d670f956f24f4786b7dafb1d49b69beb85 100644
--- a/kernel/dma/contiguous.c
+++ b/kernel/dma/contiguous.c
@@ -302,9 +302,16 @@ static int __init rmem_cma_setup(struct reserved_mem *rmem)
         phys_addr_t align = PAGE_SIZE << max(MAX_ORDER - 1, pageblock_order);
         phys_addr_t mask = align - 1;
         unsigned long node = rmem->fdt_node;
+       bool default_cma = of_get_flat_dt_prop(node, "linux,cma-default", NULL);
         struct cma *cma;
         int err;
  
+       if (size_cmdline != -1 && default_cma) {
+               pr_info("Reserved memory: bypass %s node, using cmdline CMA params instead\n",
+                       rmem->name);
+               return -EBUSY;
+       }
+
         if (!of_get_flat_dt_prop(node, "reusable", NULL) ||
             of_get_flat_dt_prop(node, "no-map", NULL))
                 return -EINVAL;
@@ -322,7 +329,7 @@ static int __init rmem_cma_setup(struct reserved_mem *rmem)
         /* Architecture specific contiguous memory fixup. */
         dma_contiguous_early_fixup(rmem->base, rmem->size);
  
-       if (of_get_flat_dt_prop(node, "linux,cma-default", NULL))
+       if (default_cma)
                 dma_contiguous_set_default(cma);
  
         rmem->ops = &rmem_cma_ops;
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 6af7ae83c4ada1534bcf7dc203de907515e24fea..ac7956c38f693f2bc6346411ff2b097a003967dd 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -23,18 +23,6 @@
   */
  unsigned int zone_dma_bits __ro_after_init = 24;
  
-static void report_addr(struct device *dev, dma_addr_t dma_addr, size_t size)
-{
-       if (!dev->dma_mask) {
-               dev_err_once(dev, "DMA map on device without dma_mask\n");
-       } else if (*dev->dma_mask >= DMA_BIT_MASK(32) || dev->bus_dma_limit) {
-               dev_err_once(dev,
-                       "overflow %pad+%zu of DMA mask %llx bus limit %llx\n",
-                       &dma_addr, size, *dev->dma_mask, dev->bus_dma_limit);
-       }
-       WARN_ON_ONCE(1);
-}
-
  static inline dma_addr_t phys_to_dma_direct(struct device *dev,
                 phys_addr_t phys)
  {
@@ -357,13 +345,6 @@ void dma_direct_unmap_sg(struct device *dev, struct scatterlist *sgl,
  EXPORT_SYMBOL(dma_direct_unmap_sg);
  #endif
  
-static inline bool dma_direct_possible(struct device *dev, dma_addr_t dma_addr,
-               size_t size)
-{
-       return swiotlb_force != SWIOTLB_FORCE &&
-               dma_capable(dev, dma_addr, size, true);
-}
-
  dma_addr_t dma_direct_map_page(struct device *dev, struct page *page,
                 unsigned long offset, size_t size, enum dma_data_direction dir,
                 unsigned long attrs)
@@ -371,9 +352,16 @@ dma_addr_t dma_direct_map_page(struct device *dev, struct page *page,
         phys_addr_t phys = page_to_phys(page) + offset;
         dma_addr_t dma_addr = phys_to_dma(dev, phys);
  
-       if (unlikely(!dma_direct_possible(dev, dma_addr, size)) &&
-           !swiotlb_map(dev, &phys, &dma_addr, size, dir, attrs)) {
-               report_addr(dev, dma_addr, size);
+       if (unlikely(swiotlb_force == SWIOTLB_FORCE))
+               return swiotlb_map(dev, phys, size, dir, attrs);
+
+       if (unlikely(!dma_capable(dev, dma_addr, size, true))) {
+               if (swiotlb_force != SWIOTLB_NO_FORCE)
+                       return swiotlb_map(dev, phys, size, dir, attrs);
+
+               dev_WARN_ONCE(dev, 1,
+                            "DMA addr %pad+%zu overflow (mask %llx, bus limit %llx).\n",
+                            &dma_addr, size, *dev->dma_mask, dev->bus_dma_limit);
                 return DMA_MAPPING_ERROR;
         }
  
@@ -411,7 +399,10 @@ dma_addr_t dma_direct_map_resource(struct device *dev, phys_addr_t paddr,
         dma_addr_t dma_addr = paddr;
  
         if (unlikely(!dma_capable(dev, dma_addr, size, false))) {
-               report_addr(dev, dma_addr, size);
+               dev_err_once(dev,
+                            "DMA addr %pad+%zu overflow (mask %llx, bus limit %llx).\n",
+                            &dma_addr, size, *dev->dma_mask, dev->bus_dma_limit);
+               WARN_ON_ONCE(1);
                 return DMA_MAPPING_ERROR;
         }
  
@@ -472,28 +463,26 @@ int dma_direct_mmap(struct device *dev, struct vm_area_struct *vma,
  }
  #endif /* CONFIG_MMU */
  
-/*
- * Because 32-bit DMA masks are so common we expect every architecture to be
- * able to satisfy them - either by not supporting more physical memory, or by
- * providing a ZONE_DMA32.  If neither is the case, the architecture needs to
- * use an IOMMU instead of the direct mapping.
- */
  int dma_direct_supported(struct device *dev, u64 mask)
  {
-       u64 min_mask;
+       u64 min_mask = (max_pfn - 1) << PAGE_SHIFT;
  
-       if (IS_ENABLED(CONFIG_ZONE_DMA))
-               min_mask = DMA_BIT_MASK(zone_dma_bits);
-       else
-               min_mask = DMA_BIT_MASK(32);
-
-       min_mask = min_t(u64, min_mask, (max_pfn - 1) << PAGE_SHIFT);
+       /*
+        * Because 32-bit DMA masks are so common we expect every architecture
+        * to be able to satisfy them - either by not supporting more physical
+        * memory, or by providing a ZONE_DMA32.  If neither is the case, the
+        * architecture needs to use an IOMMU instead of the direct mapping.
+        */
+       if (mask >= DMA_BIT_MASK(32))
+               return 1;
  
         /*
          * This check needs to be against the actual bit mask value, so
          * use __phys_to_dma() here so that the SME encryption mask isn't
          * part of the check.
          */
+       if (IS_ENABLED(CONFIG_ZONE_DMA))
+               min_mask = min_t(u64, min_mask, DMA_BIT_MASK(zone_dma_bits));
         return mask >= __phys_to_dma(dev, min_mask);
  }
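
Note on the `dma_direct_map_page()` hunk above: the reworked function reduces to a three-way choice between mapping directly, bouncing through swiotlb, and failing. The following is a minimal user-space sketch of that choice, not kernel code. The names `force_mode`, `map_path` and `choose_map_path` are hypothetical, and the `capable` flag is assumed to stand in for the result of `dma_capable()`.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-ins for the swiotlb_force setting: the default mode,
 * SWIOTLB_FORCE and SWIOTLB_NO_FORCE respectively.
 */
enum force_mode { FORCE_DEFAULT, FORCE_ALWAYS, FORCE_NEVER };

/* Hypothetical outcome labels; the kernel returns a dma_addr_t instead. */
enum map_path { PATH_DIRECT, PATH_BOUNCE, PATH_ERROR };

/* Model of the branch order used by the new dma_direct_map_page(). */
enum map_path choose_map_path(enum force_mode mode, bool capable)
{
	/* Forced bouncing wins regardless of whether the device can reach the address. */
	if (mode == FORCE_ALWAYS)
		return PATH_BOUNCE;

	if (!capable) {
		/* Bounce unless bouncing has been disabled outright. */
		if (mode != FORCE_NEVER)
			return PATH_BOUNCE;
		/* The kernel warns once here and returns DMA_MAPPING_ERROR. */
		return PATH_ERROR;
	}

	return PATH_DIRECT;
}
```

In this model only the combination of an unreachable address and disabled bouncing yields an error, which matches the single `DMA_MAPPING_ERROR` return left in the hunk.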
  
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c

index 9280d6f8271ed4360bab41cfb2d8dc66d75ca203..c19379fabd200ebb13c5a134b4f366291751982b 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -22,6 +22,7 @@
  
  #include <linux/cache.h>
  #include <linux/dma-direct.h>
+#include <linux/dma-noncoherent.h>
  #include <linux/mm.h>
  #include <linux/export.h>
  #include <linux/spinlock.h>
@@ -656,35 +657,38 @@ void swiotlb_tbl_sync_single(struct device *hwdev, phys_addr_t tlb_addr,
  }
  
  /*
- * Create a swiotlb mapping for the buffer at @phys, and in case of DMAing
+ * Create a swiotlb mapping for the buffer at @paddr, and in case of DMAing
   * to the device copy the data into it as well.
   */
-bool swiotlb_map(struct device *dev, phys_addr_t *phys, dma_addr_t *dma_addr,
-               size_t size, enum dma_data_direction dir, unsigned long attrs)
+dma_addr_t swiotlb_map(struct device *dev, phys_addr_t paddr, size_t size,
+               enum dma_data_direction dir, unsigned long attrs)
  {
-       trace_swiotlb_bounced(dev, *dma_addr, size, swiotlb_force);
+       phys_addr_t swiotlb_addr;
+       dma_addr_t dma_addr;
  
-       if (unlikely(swiotlb_force == SWIOTLB_NO_FORCE)) {
-               dev_warn_ratelimited(dev,
-                       "Cannot do DMA to address %pa\n", phys);
-               return false;
-       }
+       trace_swiotlb_bounced(dev, phys_to_dma(dev, paddr), size,
+                             swiotlb_force);
  
-       /* Oh well, have to allocate and map a bounce buffer. */
-       *phys = swiotlb_tbl_map_single(dev, __phys_to_dma(dev, io_tlb_start),
-                       *phys, size, size, dir, attrs);
-       if (*phys == (phys_addr_t)DMA_MAPPING_ERROR)
-               return false;
+       swiotlb_addr = swiotlb_tbl_map_single(dev,
+                       __phys_to_dma(dev, io_tlb_start),
+                       paddr, size, size, dir, attrs);
+       if (swiotlb_addr == (phys_addr_t)DMA_MAPPING_ERROR)
+               return DMA_MAPPING_ERROR;
  
         /* Ensure that the address returned is DMA'ble */
-       *dma_addr = __phys_to_dma(dev, *phys);
-       if (unlikely(!dma_capable(dev, *dma_addr, size, true))) {
-               swiotlb_tbl_unmap_single(dev, *phys, size, size, dir,
+       dma_addr = __phys_to_dma(dev, swiotlb_addr);
+       if (unlikely(!dma_capable(dev, dma_addr, size, true))) {
+               swiotlb_tbl_unmap_single(dev, swiotlb_addr, size, size, dir,
                         attrs | DMA_ATTR_SKIP_CPU_SYNC);
-               return false;
+               dev_WARN_ONCE(dev, 1,
+                       "swiotlb addr %pad+%zu overflow (mask %llx, bus limit %llx).\n",
+                       &dma_addr, size, *dev->dma_mask, dev->bus_dma_limit);
+               return DMA_MAPPING_ERROR;
         }
  
-       return true;
+       if (!dev_is_dma_coherent(dev) && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+               arch_sync_dma_for_device(swiotlb_addr, size, dir);
+       return dma_addr;
  }
  
  size_t swiotlb_max_mapping_size(struct device *dev)
diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h

index 3924fbe829d4a8aeaa32750580212b0f008d7f39..c9d8eb7f5c029244b4db5f68cae4463412064077 100644
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -128,8 +128,6 @@ static inline void unregister_handler_proc(unsigned int irq,
  
  extern bool irq_can_set_affinity_usr(unsigned int irq);
  
-extern int irq_select_affinity_usr(unsigned int irq);
-
  extern void irq_set_thread_affinity(struct irq_desc *desc);
  
  extern int irq_do_set_affinity(struct irq_data *data,
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c

index 3089a60ea8f98ffeb58c0d497d1172cc15718e1e..7eee98c38f25ca20047ed75c268fae4550949bdd 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -481,23 +481,9 @@ int irq_setup_affinity(struct irq_desc *desc)
  {
         return irq_select_affinity(irq_desc_get_irq(desc));
  }
-#endif
+#endif /* CONFIG_AUTO_IRQ_AFFINITY */
+#endif /* CONFIG_SMP */
  
-/*
- * Called when a bogus affinity is set via /proc/irq
- */
-int irq_select_affinity_usr(unsigned int irq)
-{
-       struct irq_desc *desc = irq_to_desc(irq);
-       unsigned long flags;
-       int ret;
-
-       raw_spin_lock_irqsave(&desc->lock, flags);
-       ret = irq_setup_affinity(desc);
-       raw_spin_unlock_irqrestore(&desc->lock, flags);
-       return ret;
-}
-#endif
  
  /**
   *     irq_set_vcpu_affinity - Set vcpu affinity for the interrupt
diff --git a/kernel/irq/proc.c b/kernel/irq/proc.c

index 9e5783d98033ee183e389ca9efa3681e0b642743..32c071d7bc0338ac73253c586e04de7fc816a873 100644
--- a/kernel/irq/proc.c
+++ b/kernel/irq/proc.c
@@ -111,6 +111,28 @@ static int irq_affinity_list_proc_show(struct seq_file *m, void *v)
         return show_irq_affinity(AFFINITY_LIST, m);
  }
  
+#ifndef CONFIG_AUTO_IRQ_AFFINITY
+static inline int irq_select_affinity_usr(unsigned int irq)
+{
+       /*
+        * If the interrupt is started up already then this fails. The
+        * interrupt is assigned to an online CPU already. There is no
+        * point to move it around randomly. Tell user space that the
+        * selected mask is bogus.
+        *
+        * If not then any change to the affinity is pointless because the
+        * startup code invokes irq_setup_affinity() which will select
+        * a online CPU anyway.
+        */
+       return -EINVAL;
+}
+#else
+/* ALPHA magic affinity auto selector. Keep it for historical reasons. */
+static inline int irq_select_affinity_usr(unsigned int irq)
+{
+       return irq_select_affinity(irq);
+}
+#endif
  
  static ssize_t write_irq_affinity(int type, struct file *file,
                 const char __user *buffer, size_t count, loff_t *pos)
diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c

index ddade80ad27670c33f84bbbedd1328edf42a3ace..d82b7b88d616ee4ea792d6ec6340dd46f00a42ad 100644
--- a/kernel/power/snapshot.c
+++ b/kernel/power/snapshot.c
@@ -1681,7 +1681,7 @@ static unsigned long minimum_image_size(unsigned long saveable)
   * hibernation for allocations made while saving the image and for device
   * drivers, in case they need to allocate memory from their hibernation
   * callbacks (these two numbers are given by PAGES_FOR_IO (which is a rough
- * estimate) and reserverd_size divided by PAGE_SIZE (which is tunable through
+ * estimate) and reserved_size divided by PAGE_SIZE (which is tunable through
   * /sys/power/reserved_size, respectively).  To make this happen, we compute the
   * total number of available page frames and allocate at least
   *
diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c

index 2c47280fbfc7a4a92f89769cf7e30c7cb60784e9..8b1bb5ee7e5d668992067daacf8c5af6bc73285f 100644
--- a/kernel/power/suspend.c
+++ b/kernel/power/suspend.c
@@ -131,11 +131,12 @@ static void s2idle_loop(void)
          * to avoid them upfront.
          */
         for (;;) {
-               if (s2idle_ops && s2idle_ops->wake)
-                       s2idle_ops->wake();
-
-               if (pm_wakeup_pending())
+               if (s2idle_ops && s2idle_ops->wake) {
+                       if (s2idle_ops->wake())
+                               break;
+               } else if (pm_wakeup_pending()) {
                         break;
+               }
  
                 pm_wakeup_clear(false);
  
diff --git a/kernel/sched/core.c b/kernel/sched/core.c

index fc1dfc0076045dc2bfcb3f0bd9b60e1ec2e7ac09..1a9983da4408deb36315639228e84a0131a36e00 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -552,27 +552,32 @@ void resched_cpu(int cpu)
   */
  int get_nohz_timer_target(void)
  {
-       int i, cpu = smp_processor_id();
+       int i, cpu = smp_processor_id(), default_cpu = -1;
         struct sched_domain *sd;
  
-       if (!idle_cpu(cpu) && housekeeping_cpu(cpu, HK_FLAG_TIMER))
-               return cpu;
+       if (housekeeping_cpu(cpu, HK_FLAG_TIMER)) {
+               if (!idle_cpu(cpu))
+                       return cpu;
+               default_cpu = cpu;
+       }
  
         rcu_read_lock();
         for_each_domain(cpu, sd) {
-               for_each_cpu(i, sched_domain_span(sd)) {
+               for_each_cpu_and(i, sched_domain_span(sd),
+                       housekeeping_cpumask(HK_FLAG_TIMER)) {
                         if (cpu == i)
                                 continue;
  
-                       if (!idle_cpu(i) && housekeeping_cpu(i, HK_FLAG_TIMER)) {
+                       if (!idle_cpu(i)) {
                                 cpu = i;
                                 goto unlock;
                         }
                 }
         }
  
-       if (!housekeeping_cpu(cpu, HK_FLAG_TIMER))
-               cpu = housekeeping_any_cpu(HK_FLAG_TIMER);
+       if (default_cpu == -1)
+               default_cpu = housekeeping_any_cpu(HK_FLAG_TIMER);
+       cpu = default_cpu;
  unlock:
         rcu_read_unlock();
         return cpu;
@@ -1442,17 +1447,6 @@ void check_preempt_curr(struct rq *rq, struct task_struct *p, int flags)
  
  #ifdef CONFIG_SMP
  
-static inline bool is_per_cpu_kthread(struct task_struct *p)
-{
-       if (!(p->flags & PF_KTHREAD))
-               return false;
-
-       if (p->nr_cpus_allowed != 1)
-               return false;
-
-       return true;
-}
-
  /*
   * Per-CPU kthreads are allowed to run on !active && online CPUs, see
   * __set_cpus_allowed_ptr() and select_fallback_rq().
@@ -3669,28 +3663,32 @@ static void sched_tick_remote(struct work_struct *work)
          * statistics and checks timeslices in a time-independent way, regardless
          * of when exactly it is running.
          */
-       if (idle_cpu(cpu) || !tick_nohz_tick_stopped_cpu(cpu))
+       if (!tick_nohz_tick_stopped_cpu(cpu))
                 goto out_requeue;
  
         rq_lock_irq(rq, &rf);
         curr = rq->curr;
-       if (is_idle_task(curr) || cpu_is_offline(cpu))
+       if (cpu_is_offline(cpu))
                 goto out_unlock;
  
+       curr = rq->curr;
         update_rq_clock(rq);
-       delta = rq_clock_task(rq) - curr->se.exec_start;
  
-       /*
-        * Make sure the next tick runs within a reasonable
-        * amount of time.
-        */
-       WARN_ON_ONCE(delta > (u64)NSEC_PER_SEC * 3);
+       if (!is_idle_task(curr)) {
+               /*
+                * Make sure the next tick runs within a reasonable
+                * amount of time.
+                */
+               delta = rq_clock_task(rq) - curr->se.exec_start;
+               WARN_ON_ONCE(delta > (u64)NSEC_PER_SEC * 3);
+       }
         curr->sched_class->task_tick(rq, curr, 0);
  
+       calc_load_nohz_remote(rq);
  out_unlock:
         rq_unlock_irq(rq, &rf);
-
  out_requeue:
+
         /*
          * Run the remote tick once per second (1Hz). This arbitrary
          * frequency is large enough to avoid overload but short enough
@@ -7063,8 +7061,15 @@ void sched_move_task(struct task_struct *tsk)
  
         if (queued)
                 enqueue_task(rq, tsk, queue_flags);
-       if (running)
+       if (running) {
                 set_next_task(rq, tsk);
+               /*
+                * After changing group, the running task may have joined a
+                * throttled one but it's still the running task. Trigger a
+                * resched to make sure that task can still run.
+                */
+               resched_curr(rq);
+       }
  
         task_rq_unlock(rq, tsk, &rf);
  }
@@ -7260,7 +7265,7 @@ capacity_from_percent(char *buf)
                                              &req.percent);
                 if (req.ret)
                         return req;
-               if (req.percent > UCLAMP_PERCENT_SCALE) {
+               if ((u64)req.percent > UCLAMP_PERCENT_SCALE) {
                         req.ret = -ERANGE;
                         return req;
                 }
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c

index fe4e0d775375680504d1bc2167996bb975f028a8..c1217bfe5e819083ba7bbfd99fd8e3453d34f4a7 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3516,7 +3516,6 @@ update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq)
   * attach_entity_load_avg - attach this entity to its cfs_rq load avg
   * @cfs_rq: cfs_rq to attach to
   * @se: sched_entity to attach
- * @flags: migration hints
   *
   * Must call update_cfs_rq_load_avg() before this, since we rely on
   * cfs_rq->avg.last_update_time being current.
@@ -5912,6 +5911,20 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
             (available_idle_cpu(prev) || sched_idle_cpu(prev)))
                 return prev;
  
+       /*
+        * Allow a per-cpu kthread to stack with the wakee if the
+        * kworker thread and the tasks previous CPUs are the same.
+        * The assumption is that the wakee queued work for the
+        * per-cpu kthread that is now complete and the wakeup is
+        * essentially a sync wakeup. An obvious example of this
+        * pattern is IO completions.
+        */
+       if (is_per_cpu_kthread(current) &&
+           prev == smp_processor_id() &&
+           this_rq()->nr_running <= 1) {
+               return prev;
+       }
+
         /* Check a recently used CPU as a potential idle candidate: */
         recent_used_cpu = p->recent_used_cpu;
         if (recent_used_cpu != prev &&
@@ -8324,6 +8337,8 @@ static inline void update_sg_wakeup_stats(struct sched_domain *sd,
  
         sgs->group_capacity = group->sgc->capacity;
  
+       sgs->group_weight = group->group_weight;
+
         sgs->group_type = group_classify(sd->imbalance_pct, group, sgs);
  
         /*
@@ -8658,10 +8673,6 @@ static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *s
         /*
          * Try to use spare capacity of local group without overloading it or
          * emptying busiest.
-        * XXX Spreading tasks across NUMA nodes is not always the best policy
-        * and special care should be taken for SD_NUMA domain level before
-        * spreading the tasks. For now, load_balance() fully relies on
-        * NUMA_BALANCING and fbq_classify_group/rq to override the decision.
          */
         if (local->group_type == group_has_spare) {
                 if (busiest->group_type > group_fully_busy) {
@@ -8701,16 +8712,37 @@ static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *s
                         env->migration_type = migrate_task;
                         lsub_positive(&nr_diff, local->sum_nr_running);
                         env->imbalance = nr_diff >> 1;
-                       return;
-               }
+               } else {
  
-               /*
-                * If there is no overload, we just want to even the number of
-                * idle cpus.
-                */
-               env->migration_type = migrate_task;
-               env->imbalance = max_t(long, 0, (local->idle_cpus -
+                       /*
+                        * If there is no overload, we just want to even the number of
+                        * idle cpus.
+                        */
+                       env->migration_type = migrate_task;
+                       env->imbalance = max_t(long, 0, (local->idle_cpus -
                                                  busiest->idle_cpus) >> 1);
+               }
+
+               /* Consider allowing a small imbalance between NUMA groups */
+               if (env->sd->flags & SD_NUMA) {
+                       unsigned int imbalance_min;
+
+                       /*
+                        * Compute an allowed imbalance based on a simple
+                        * pair of communicating tasks that should remain
+                        * local and ignore them.
+                        *
+                        * NOTE: Generally this would have been based on
+                        * the domain size and this was evaluated. However,
+                        * the benefit is similar across a range of workloads
+                        * and machines but scaling by the domain size adds
+                        * the risk that lower domains have to be rebalanced.
+                        */
+                       imbalance_min = 2;
+                       if (busiest->sum_nr_running <= imbalance_min)
+                               env->imbalance = 0;
+               }
+
                 return;
         }
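
Note on the `SD_NUMA` block added above: once an imbalance has been computed, it is discarded when the busiest group runs no more than `imbalance_min` (2) tasks, so a pair of communicating tasks is left in place. A minimal sketch of that final adjustment follows. `numa_adjust_imbalance` is a hypothetical helper name, and the sketch assumes the caller has already established that the domain has `SD_NUMA` set.

```c
/*
 * Hypothetical helper modelling the SD_NUMA adjustment in
 * calculate_imbalance(): returns the imbalance to act on.
 */
long numa_adjust_imbalance(long imbalance, unsigned int busiest_nr_running)
{
	/* Allowed imbalance: one pair of communicating tasks. */
	const unsigned int imbalance_min = 2;

	/* A lightly loaded busiest group is not worth spreading across nodes. */
	if (busiest_nr_running <= imbalance_min)
		return 0;

	return imbalance;
}
```

With these assumptions, a busiest group running two tasks reports no imbalance whatever value was computed earlier, while a group running three or more keeps the computed value.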
  
diff --git a/kernel/sched/loadavg.c b/kernel/sched/loadavg.c

index 28a516575c181535d21eebc0fd9625983a9eafd9..de22da666ac7392038bdbf18d4c162d867d6f4b6 100644
--- a/kernel/sched/loadavg.c
+++ b/kernel/sched/loadavg.c
@@ -231,16 +231,11 @@ static inline int calc_load_read_idx(void)
         return calc_load_idx & 1;
  }
  
-void calc_load_nohz_start(void)
+static void calc_load_nohz_fold(struct rq *rq)
  {
-       struct rq *this_rq = this_rq();
         long delta;
  
-       /*
-        * We're going into NO_HZ mode, if there's any pending delta, fold it
-        * into the pending NO_HZ delta.
-        */
-       delta = calc_load_fold_active(this_rq, 0);
+       delta = calc_load_fold_active(rq, 0);
         if (delta) {
                 int idx = calc_load_write_idx();
  
@@ -248,6 +243,24 @@ void calc_load_nohz_start(void)
         }
  }
  
+void calc_load_nohz_start(void)
+{
+       /*
+        * We're going into NO_HZ mode, if there's any pending delta, fold it
+        * into the pending NO_HZ delta.
+        */
+       calc_load_nohz_fold(this_rq());
+}
+
+/*
+ * Keep track of the load for NOHZ_FULL, must be called between
+ * calc_load_nohz_{start,stop}().
+ */
+void calc_load_nohz_remote(struct rq *rq)
+{
+       calc_load_nohz_fold(rq);
+}
+
  void calc_load_nohz_stop(void)
  {
         struct rq *this_rq = this_rq();
@@ -268,7 +281,7 @@ void calc_load_nohz_stop(void)
                 this_rq->calc_load_update += LOAD_FREQ;
  }
  
-static long calc_load_nohz_fold(void)
+static long calc_load_nohz_read(void)
  {
         int idx = calc_load_read_idx();
         long delta = 0;
@@ -323,7 +336,7 @@ static void calc_global_nohz(void)
  }
  #else /* !CONFIG_NO_HZ_COMMON */
  
-static inline long calc_load_nohz_fold(void) { return 0; }
+static inline long calc_load_nohz_read(void) { return 0; }
  static inline void calc_global_nohz(void) { }
  
  #endif /* CONFIG_NO_HZ_COMMON */
@@ -346,7 +359,7 @@ void calc_global_load(unsigned long ticks)
         /*
          * Fold the 'old' NO_HZ-delta to include all NO_HZ CPUs.
          */
-       delta = calc_load_nohz_fold();
+       delta = calc_load_nohz_read();
         if (delta)
                 atomic_long_add(delta, &calc_load_tasks);
  
diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c

index ac4bd0ca11cce3149d8e5a76020e339c725edad3..028520702717713286874d34147bf41f21207be2 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -1199,6 +1199,9 @@ static ssize_t psi_write(struct file *file, const char __user *user_buf,
         if (static_branch_likely(&psi_disabled))
                 return -EOPNOTSUPP;
  
+       if (!nbytes)
+               return -EINVAL;
+
         buf_size = min(nbytes, sizeof(buf));
         if (copy_from_user(buf, user_buf, buf_size))
                 return -EFAULT;
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h

index 1a88dc8ad11b71266480a1ef260cc522fee099ee..9ea647835fd6f37abb88197b25f2c5cf7dcf8424 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -896,7 +896,7 @@ struct rq {
          */
         unsigned long           nr_uninterruptible;
  
-       struct task_struct      *curr;
+       struct task_struct __rcu        *curr;
         struct task_struct      *idle;
         struct task_struct      *stop;
         unsigned long           next_balance;
@@ -2479,3 +2479,16 @@ static inline void membarrier_switch_mm(struct rq *rq,
  {
  }
  #endif
+
+#ifdef CONFIG_SMP
+static inline bool is_per_cpu_kthread(struct task_struct *p)
+{
+       if (!(p->flags & PF_KTHREAD))
+               return false;
+
+       if (p->nr_cpus_allowed != 1)
+               return false;
+
+       return true;
+}
+#endif
diff --git a/kernel/signal.c b/kernel/signal.c

index 9ad8dea93dbb23482d18b24c3be220cb775f80eb..5b2396350dd183cc3ce04d115763cc590d5e3cc0 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -413,27 +413,32 @@ __sigqueue_alloc(int sig, struct task_struct *t, gfp_t flags, int override_rlimi
  {
         struct sigqueue *q = NULL;
         struct user_struct *user;
+       int sigpending;
  
         /*
          * Protect access to @t credentials. This can go away when all
          * callers hold rcu read lock.
+        *
+        * NOTE! A pending signal will hold on to the user refcount,
+        * and we get/put the refcount only when the sigpending count
+        * changes from/to zero.
          */
         rcu_read_lock();
-       user = get_uid(__task_cred(t)->user);
-       atomic_inc(&user->sigpending);
+       user = __task_cred(t)->user;
+       sigpending = atomic_inc_return(&user->sigpending);
+       if (sigpending == 1)
+               get_uid(user);
         rcu_read_unlock();
  
-       if (override_rlimit ||
-           atomic_read(&user->sigpending) <=
-                       task_rlimit(t, RLIMIT_SIGPENDING)) {
+       if (override_rlimit || likely(sigpending <= task_rlimit(t, RLIMIT_SIGPENDING))) {
                 q = kmem_cache_alloc(sigqueue_cachep, flags);
         } else {
                 print_dropped_signal(sig);
         }
  
         if (unlikely(q == NULL)) {
-               atomic_dec(&user->sigpending);
-               free_uid(user);
+               if (atomic_dec_and_test(&user->sigpending))
+                       free_uid(user);
         } else {
                 INIT_LIST_HEAD(&q->list);
                 q->flags = 0;
@@ -447,8 +452,8 @@ static void __sigqueue_free(struct sigqueue *q)
  {
         if (q->flags & SIGQUEUE_PREALLOC)
                 return;
-       atomic_dec(&q->user->sigpending);
-       free_uid(q->user);
+       if (atomic_dec_and_test(&q->user->sigpending))
+               free_uid(q->user);
         kmem_cache_free(sigqueue_cachep, q);
  }
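
Note on the `sigpending` change above: the user reference is now taken only when the pending-signal count goes from zero to one and dropped only when it returns to zero, rather than once per queued signal. The following is a minimal user-space sketch of that counting scheme using C11 atomics. `struct user_model`, `charge_signal` and `uncharge_signal` are hypothetical names, and the sketch leaves out the RCU protection and the rlimit check that the kernel code performs.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for struct user_struct: only the two counters. */
struct user_model {
	atomic_int refcount;    /* references held on the user object */
	atomic_int sigpending;  /* queued signals charged to the user */
};

/* Charge one pending signal; take a reference only on the 0 -> 1 transition. */
int charge_signal(struct user_model *u)
{
	/* atomic_fetch_add returns the old value, so add 1 for the new count. */
	int pending = atomic_fetch_add(&u->sigpending, 1) + 1;

	if (pending == 1)
		atomic_fetch_add(&u->refcount, 1);
	return pending;
}

/* Uncharge one pending signal; drop the reference on the 1 -> 0 transition. */
void uncharge_signal(struct user_model *u)
{
	/* Old value 1 means the count has just reached zero. */
	if (atomic_fetch_sub(&u->sigpending, 1) == 1)
		atomic_fetch_sub(&u->refcount, 1);
}
```

Under these assumptions, any number of queued signals costs a single reference, so the reference counter is touched only on the first charge and the last uncharge.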
  
diff --git a/kernel/sysctl.c b/kernel/sysctl.c

index d396aaaf19a329203e4e03dfaa3f33fffa93e0f1..ad5b88a53c5a87528c77bb1f2e8cac25de2e75d3 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -805,15 +805,6 @@ static struct ctl_table kern_table[] = {
                 .extra2         = &maxolduid,
         },
  #ifdef CONFIG_S390
-#ifdef CONFIG_MATHEMU
-       {
-               .procname       = "ieee_emulation_warnings",
-               .data           = &sysctl_ieee_emulation_warnings,
-               .maxlen         = sizeof(int),
-               .mode           = 0644,
-               .proc_handler   = proc_dointvec,
-       },
-#endif
         {
                 .procname       = "userprocess_debug",
                 .data           = &show_unhandled_signals,
diff --git a/kernel/time/time.c b/kernel/time/time.c

index cdd7386115ff9de912b4ad405640ee391d9178d9..3985b2b32d083e06acfee3c83f952426660dbe9d 100644
--- a/kernel/time/time.c
+++ b/kernel/time/time.c
@@ -449,49 +449,6 @@ time64_t mktime64(const unsigned int year0, const unsigned int mon0,
  }
  EXPORT_SYMBOL(mktime64);
  
-/**
- * ns_to_timespec - Convert nanoseconds to timespec
- * @nsec:       the nanoseconds value to be converted
- *
- * Returns the timespec representation of the nsec parameter.
- */
-struct timespec ns_to_timespec(const s64 nsec)
-{
-       struct timespec ts;
-       s32 rem;
-
-       if (!nsec)
-               return (struct timespec) {0, 0};
-
-       ts.tv_sec = div_s64_rem(nsec, NSEC_PER_SEC, &rem);
-       if (unlikely(rem < 0)) {
-               ts.tv_sec--;
-               rem += NSEC_PER_SEC;
-       }
-       ts.tv_nsec = rem;
-
-       return ts;
-}
-EXPORT_SYMBOL(ns_to_timespec);
-
-/**
- * ns_to_timeval - Convert nanoseconds to timeval
- * @nsec:       the nanoseconds value to be converted
- *
- * Returns the timeval representation of the nsec parameter.
- */
-struct timeval ns_to_timeval(const s64 nsec)
-{
-       struct timespec ts = ns_to_timespec(nsec);
-       struct timeval tv;
-
-       tv.tv_sec = ts.tv_sec;
-       tv.tv_usec = (suseconds_t) ts.tv_nsec / 1000;
-
-       return tv;
-}
-EXPORT_SYMBOL(ns_to_timeval);
-
  struct __kernel_old_timeval ns_to_kernel_old_timeval(const s64 nsec)
  {
         struct timespec64 ts = ns_to_timespec64(nsec);
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig

index 91e885194dbce47f7f75bbda244aadcca8672986..402eef84c859ac0b7356ca89f22446b00e0b757e 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -143,8 +143,8 @@ if FTRACE
  
  config BOOTTIME_TRACING
         bool "Boot-time Tracing support"
-       depends on BOOT_CONFIG && TRACING
-       default y
+       depends on TRACING
+       select BOOT_CONFIG
         help
           Enable developer to setup ftrace subsystem via supplemental
           kernel cmdline at boot time for debugging (tracing) driver
diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c

index 0735ae8545d86a21d5595392e5b7805b4d75ad13..4560878f0bac01f78a038c34158c2d91d8eba37c 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -335,6 +335,7 @@ static void put_probe_ref(void)
  
  static void blk_trace_cleanup(struct blk_trace *bt)
  {
+       synchronize_rcu();
         blk_trace_free(bt);
         put_probe_ref();
  }
@@ -629,8 +630,10 @@ static int compat_blk_trace_setup(struct request_queue *q, char *name,
  static int __blk_trace_startstop(struct request_queue *q, int start)
  {
         int ret;
-       struct blk_trace *bt = q->blk_trace;
+       struct blk_trace *bt;
  
+       bt = rcu_dereference_protected(q->blk_trace,
+                                      lockdep_is_held(&q->blk_trace_mutex));
         if (bt == NULL)
                 return -EINVAL;
  
@@ -740,8 +743,8 @@ int blk_trace_ioctl(struct block_device *bdev, unsigned cmd, char __user *arg)
  void blk_trace_shutdown(struct request_queue *q)
  {
         mutex_lock(&q->blk_trace_mutex);
-
-       if (q->blk_trace) {
+       if (rcu_dereference_protected(q->blk_trace,
+                                     lockdep_is_held(&q->blk_trace_mutex))) {
                 __blk_trace_startstop(q, 0);
                 __blk_trace_remove(q);
         }
@@ -752,8 +755,10 @@ void blk_trace_shutdown(struct request_queue *q)
  #ifdef CONFIG_BLK_CGROUP
  static u64 blk_trace_bio_get_cgid(struct request_queue *q, struct bio *bio)
  {
-       struct blk_trace *bt = q->blk_trace;
+       struct blk_trace *bt;
  
+       /* We don't use the 'bt' value here except as an optimization... */
+       bt = rcu_dereference_protected(q->blk_trace, 1);
         if (!bt || !(blk_tracer_flags.val & TRACE_BLK_OPT_CGROUP))
                 return 0;
  
@@ -796,10 +801,14 @@ blk_trace_request_get_cgid(struct request_queue *q, struct request *rq)
  static void blk_add_trace_rq(struct request *rq, int error,
                              unsigned int nr_bytes, u32 what, u64 cgid)
  {
-       struct blk_trace *bt = rq->q->blk_trace;
+       struct blk_trace *bt;
  
-       if (likely(!bt))
+       rcu_read_lock();
+       bt = rcu_dereference(rq->q->blk_trace);
+       if (likely(!bt)) {
+               rcu_read_unlock();
                 return;
+       }
  
         if (blk_rq_is_passthrough(rq))
                 what |= BLK_TC_ACT(BLK_TC_PC);
@@ -808,6 +817,7 @@ static void blk_add_trace_rq(struct request *rq, int error,
  
         __blk_add_trace(bt, blk_rq_trace_sector(rq), nr_bytes, req_op(rq),
                         rq->cmd_flags, what, error, 0, NULL, cgid);
+       rcu_read_unlock();
  }
  
  static void blk_add_trace_rq_insert(void *ignore,
@@ -853,14 +863,19 @@ static void blk_add_trace_rq_complete(void *ignore, struct request *rq,
  static void blk_add_trace_bio(struct request_queue *q, struct bio *bio,
                               u32 what, int error)
  {
-       struct blk_trace *bt = q->blk_trace;
+       struct blk_trace *bt;
  
-       if (likely(!bt))
+       rcu_read_lock();
+       bt = rcu_dereference(q->blk_trace);
+       if (likely(!bt)) {
+               rcu_read_unlock();
                 return;
+       }
  
         __blk_add_trace(bt, bio->bi_iter.bi_sector, bio->bi_iter.bi_size,
                         bio_op(bio), bio->bi_opf, what, error, 0, NULL,
                         blk_trace_bio_get_cgid(q, bio));
+       rcu_read_unlock();
  }
  
  static void blk_add_trace_bio_bounce(void *ignore,
@@ -905,11 +920,14 @@ static void blk_add_trace_getrq(void *ignore,
         if (bio)
                 blk_add_trace_bio(q, bio, BLK_TA_GETRQ, 0);
         else {
-               struct blk_trace *bt = q->blk_trace;
+               struct blk_trace *bt;
  
+               rcu_read_lock();
+               bt = rcu_dereference(q->blk_trace);
                 if (bt)
                         __blk_add_trace(bt, 0, 0, rw, 0, BLK_TA_GETRQ, 0, 0,
                                         NULL, 0);
+               rcu_read_unlock();
         }
  }
  
@@ -921,27 +939,35 @@ static void blk_add_trace_sleeprq(void *ignore,
         if (bio)
                 blk_add_trace_bio(q, bio, BLK_TA_SLEEPRQ, 0);
         else {
-               struct blk_trace *bt = q->blk_trace;
+               struct blk_trace *bt;
  
+               rcu_read_lock();
+               bt = rcu_dereference(q->blk_trace);
                 if (bt)
                         __blk_add_trace(bt, 0, 0, rw, 0, BLK_TA_SLEEPRQ,
                                         0, 0, NULL, 0);
+               rcu_read_unlock();
         }
  }
  
  static void blk_add_trace_plug(void *ignore, struct request_queue *q)
  {
-       struct blk_trace *bt = q->blk_trace;
+       struct blk_trace *bt;
  
+       rcu_read_lock();
+       bt = rcu_dereference(q->blk_trace);
         if (bt)
                 __blk_add_trace(bt, 0, 0, 0, 0, BLK_TA_PLUG, 0, 0, NULL, 0);
+       rcu_read_unlock();
  }
  
  static void blk_add_trace_unplug(void *ignore, struct request_queue *q,
                                     unsigned int depth, bool explicit)
  {
-       struct blk_trace *bt = q->blk_trace;
+       struct blk_trace *bt;
  
+       rcu_read_lock();
+       bt = rcu_dereference(q->blk_trace);
         if (bt) {
                 __be64 rpdu = cpu_to_be64(depth);
                 u32 what;
@@ -953,14 +979,17 @@ static void blk_add_trace_unplug(void *ignore, struct request_queue *q,
  
                 __blk_add_trace(bt, 0, 0, 0, 0, what, 0, sizeof(rpdu), &rpdu, 0);
         }
+       rcu_read_unlock();
  }
  
  static void blk_add_trace_split(void *ignore,
                                 struct request_queue *q, struct bio *bio,
                                 unsigned int pdu)
  {
-       struct blk_trace *bt = q->blk_trace;
+       struct blk_trace *bt;
  
+       rcu_read_lock();
+       bt = rcu_dereference(q->blk_trace);
         if (bt) {
                 __be64 rpdu = cpu_to_be64(pdu);
  
@@ -969,6 +998,7 @@ static void blk_add_trace_split(void *ignore,
                                 BLK_TA_SPLIT, bio->bi_status, sizeof(rpdu),
                                 &rpdu, blk_trace_bio_get_cgid(q, bio));
         }
+       rcu_read_unlock();
  }
  
  /**
@@ -988,11 +1018,15 @@ static void blk_add_trace_bio_remap(void *ignore,
                                     struct request_queue *q, struct bio *bio,
                                     dev_t dev, sector_t from)
  {
-       struct blk_trace *bt = q->blk_trace;
+       struct blk_trace *bt;
         struct blk_io_trace_remap r;
  
-       if (likely(!bt))
+       rcu_read_lock();
+       bt = rcu_dereference(q->blk_trace);
+       if (likely(!bt)) {
+               rcu_read_unlock();
                 return;
+       }
  
         r.device_from = cpu_to_be32(dev);
         r.device_to   = cpu_to_be32(bio_dev(bio));
@@ -1001,6 +1035,7 @@ static void blk_add_trace_bio_remap(void *ignore,
         __blk_add_trace(bt, bio->bi_iter.bi_sector, bio->bi_iter.bi_size,
                         bio_op(bio), bio->bi_opf, BLK_TA_REMAP, bio->bi_status,
                         sizeof(r), &r, blk_trace_bio_get_cgid(q, bio));
+       rcu_read_unlock();
  }
  
  /**
@@ -1021,11 +1056,15 @@ static void blk_add_trace_rq_remap(void *ignore,
                                    struct request *rq, dev_t dev,
                                    sector_t from)
  {
-       struct blk_trace *bt = q->blk_trace;
+       struct blk_trace *bt;
         struct blk_io_trace_remap r;
  
-       if (likely(!bt))
+       rcu_read_lock();
+       bt = rcu_dereference(q->blk_trace);
+       if (likely(!bt)) {
+               rcu_read_unlock();
                 return;
+       }
  
         r.device_from = cpu_to_be32(dev);
         r.device_to   = cpu_to_be32(disk_devt(rq->rq_disk));
@@ -1034,6 +1073,7 @@ static void blk_add_trace_rq_remap(void *ignore,
         __blk_add_trace(bt, blk_rq_pos(rq), blk_rq_bytes(rq),
                         rq_data_dir(rq), 0, BLK_TA_REMAP, 0,
                         sizeof(r), &r, blk_trace_request_get_cgid(q, rq));
+       rcu_read_unlock();
  }
  
  /**
@@ -1051,14 +1091,19 @@ void blk_add_driver_data(struct request_queue *q,
                          struct request *rq,
                          void *data, size_t len)
  {
-       struct blk_trace *bt = q->blk_trace;
+       struct blk_trace *bt;
  
-       if (likely(!bt))
+       rcu_read_lock();
+       bt = rcu_dereference(q->blk_trace);
+       if (likely(!bt)) {
+               rcu_read_unlock();
                 return;
+       }
  
         __blk_add_trace(bt, blk_rq_trace_sector(rq), blk_rq_bytes(rq), 0, 0,
                                 BLK_TA_DRV_DATA, 0, len, data,
                                 blk_trace_request_get_cgid(q, rq));
+       rcu_read_unlock();
  }
  EXPORT_SYMBOL_GPL(blk_add_driver_data);
  
@@ -1597,6 +1642,7 @@ static int blk_trace_remove_queue(struct request_queue *q)
                 return -EINVAL;
  
         put_probe_ref();
+       synchronize_rcu();
         blk_trace_free(bt);
         return 0;
  }
@@ -1758,6 +1804,7 @@ static ssize_t sysfs_blk_trace_attr_show(struct device *dev,
         struct hd_struct *p = dev_to_part(dev);
         struct request_queue *q;
         struct block_device *bdev;
+       struct blk_trace *bt;
         ssize_t ret = -ENXIO;
  
         bdev = bdget(part_devt(p));
@@ -1770,21 +1817,23 @@ static ssize_t sysfs_blk_trace_attr_show(struct device *dev,
  
         mutex_lock(&q->blk_trace_mutex);
  
+       bt = rcu_dereference_protected(q->blk_trace,
+                                      lockdep_is_held(&q->blk_trace_mutex));
         if (attr == &dev_attr_enable) {
-               ret = sprintf(buf, "%u\n", !!q->blk_trace);
+               ret = sprintf(buf, "%u\n", !!bt);
                 goto out_unlock_bdev;
         }
  
-       if (q->blk_trace == NULL)
+       if (bt == NULL)
                 ret = sprintf(buf, "disabled\n");
         else if (attr == &dev_attr_act_mask)
-               ret = blk_trace_mask2str(buf, q->blk_trace->act_mask);
+               ret = blk_trace_mask2str(buf, bt->act_mask);
         else if (attr == &dev_attr_pid)
-               ret = sprintf(buf, "%u\n", q->blk_trace->pid);
+               ret = sprintf(buf, "%u\n", bt->pid);
         else if (attr == &dev_attr_start_lba)
-               ret = sprintf(buf, "%llu\n", q->blk_trace->start_lba);
+               ret = sprintf(buf, "%llu\n", bt->start_lba);
         else if (attr == &dev_attr_end_lba)
-               ret = sprintf(buf, "%llu\n", q->blk_trace->end_lba);
+               ret = sprintf(buf, "%llu\n", bt->end_lba);
  
  out_unlock_bdev:
         mutex_unlock(&q->blk_trace_mutex);
@@ -1801,6 +1850,7 @@ static ssize_t sysfs_blk_trace_attr_store(struct device *dev,
         struct block_device *bdev;
         struct request_queue *q;
         struct hd_struct *p;
+       struct blk_trace *bt;
         u64 value;
         ssize_t ret = -EINVAL;
  
@@ -1831,8 +1881,10 @@ static ssize_t sysfs_blk_trace_attr_store(struct device *dev,
  
         mutex_lock(&q->blk_trace_mutex);
  
+       bt = rcu_dereference_protected(q->blk_trace,
+                                      lockdep_is_held(&q->blk_trace_mutex));
         if (attr == &dev_attr_enable) {
-               if (!!value == !!q->blk_trace) {
+               if (!!value == !!bt) {
                         ret = 0;
                         goto out_unlock_bdev;
                 }
@@ -1844,18 +1896,18 @@ static ssize_t sysfs_blk_trace_attr_store(struct device *dev,
         }
  
         ret = 0;
-       if (q->blk_trace == NULL)
+       if (bt == NULL)
                 ret = blk_trace_setup_queue(q, bdev);
  
         if (ret == 0) {
                 if (attr == &dev_attr_act_mask)
-                       q->blk_trace->act_mask = value;
+                       bt->act_mask = value;
                 else if (attr == &dev_attr_pid)
-                       q->blk_trace->pid = value;
+                       bt->pid = value;
                 else if (attr == &dev_attr_start_lba)
-                       q->blk_trace->start_lba = value;
+                       bt->start_lba = value;
                 else if (attr == &dev_attr_end_lba)
-                       q->blk_trace->end_lba = value;
+                       bt->end_lba = value;
         }
  
  out_unlock_bdev:
diff --git a/kernel/trace/synth_event_gen_test.c b/kernel/trace/synth_event_gen_test.c
index 4aefe003cb7c3415685f6a9eb31c7f3a3f29fb87..7d56d621ffea879e91ac127475527173dc745380 100644
--- a/kernel/trace/synth_event_gen_test.c
+++ b/kernel/trace/synth_event_gen_test.c
@@ -111,11 +111,11 @@ static int __init test_gen_synth_cmd(void)
         /* Create some bogus values just for testing */
  
         vals[0] = 777;                  /* next_pid_field */
-       vals[1] = (u64)"hula hoops";    /* next_comm_field */
+       vals[1] = (u64)(long)"hula hoops";      /* next_comm_field */
         vals[2] = 1000000;              /* ts_ns */
         vals[3] = 1000;                 /* ts_ms */
-       vals[4] = smp_processor_id();   /* cpu */
-       vals[5] = (u64)"thneed";        /* my_string_field */
+       vals[4] = raw_smp_processor_id(); /* cpu */
+       vals[5] = (u64)(long)"thneed";  /* my_string_field */
         vals[6] = 598;                  /* my_int_field */
  
         /* Now generate a gen_synth_test event */
@@ -218,11 +218,11 @@ static int __init test_empty_synth_event(void)
         /* Create some bogus values just for testing */
  
         vals[0] = 777;                  /* next_pid_field */
-       vals[1] = (u64)"tiddlywinks";   /* next_comm_field */
+       vals[1] = (u64)(long)"tiddlywinks";     /* next_comm_field */
         vals[2] = 1000000;              /* ts_ns */
         vals[3] = 1000;                 /* ts_ms */
-       vals[4] = smp_processor_id();   /* cpu */
-       vals[5] = (u64)"thneed_2.0";    /* my_string_field */
+       vals[4] = raw_smp_processor_id(); /* cpu */
+       vals[5] = (u64)(long)"thneed_2.0";      /* my_string_field */
         vals[6] = 399;                  /* my_int_field */
  
         /* Now trace an empty_synth_test event */
@@ -290,11 +290,11 @@ static int __init test_create_synth_event(void)
         /* Create some bogus values just for testing */
  
         vals[0] = 777;                  /* next_pid_field */
-       vals[1] = (u64)"tiddlywinks";   /* next_comm_field */
+       vals[1] = (u64)(long)"tiddlywinks";     /* next_comm_field */
         vals[2] = 1000000;              /* ts_ns */
         vals[3] = 1000;                 /* ts_ms */
-       vals[4] = smp_processor_id();   /* cpu */
-       vals[5] = (u64)"thneed";        /* my_string_field */
+       vals[4] = raw_smp_processor_id(); /* cpu */
+       vals[5] = (u64)(long)"thneed";  /* my_string_field */
         vals[6] = 398;                  /* my_int_field */
  
         /* Now generate a create_synth_test event */
@@ -330,7 +330,7 @@ static int __init test_add_next_synth_val(void)
                 goto out;
  
         /* next_comm_field */
-       ret = synth_event_add_next_val((u64)"slinky", &trace_state);
+       ret = synth_event_add_next_val((u64)(long)"slinky", &trace_state);
         if (ret)
                 goto out;
  
@@ -345,12 +345,12 @@ static int __init test_add_next_synth_val(void)
                 goto out;
  
         /* cpu */
-       ret = synth_event_add_next_val(smp_processor_id(), &trace_state);
+       ret = synth_event_add_next_val(raw_smp_processor_id(), &trace_state);
         if (ret)
                 goto out;
  
         /* my_string_field */
-       ret = synth_event_add_next_val((u64)"thneed_2.01", &trace_state);
+       ret = synth_event_add_next_val((u64)(long)"thneed_2.01", &trace_state);
         if (ret)
                 goto out;
  
@@ -388,7 +388,7 @@ static int __init test_add_synth_val(void)
         if (ret)
                 goto out;
  
-       ret = synth_event_add_val("cpu", smp_processor_id(), &trace_state);
+       ret = synth_event_add_val("cpu", raw_smp_processor_id(), &trace_state);
         if (ret)
                 goto out;
  
@@ -396,12 +396,12 @@ static int __init test_add_synth_val(void)
         if (ret)
                 goto out;
  
-       ret = synth_event_add_val("next_comm_field", (u64)"silly putty",
+       ret = synth_event_add_val("next_comm_field", (u64)(long)"silly putty",
                                   &trace_state);
         if (ret)
                 goto out;
  
-       ret = synth_event_add_val("my_string_field", (u64)"thneed_9",
+       ret = synth_event_add_val("my_string_field", (u64)(long)"thneed_9",
                                   &trace_state);
         if (ret)
                 goto out;
@@ -423,13 +423,13 @@ static int __init test_trace_synth_event(void)
  
         /* Trace some bogus values just for testing */
         ret = synth_event_trace(create_synth_test, 7,   /* number of values */
-                               444,                    /* next_pid_field */
-                               (u64)"clackers",        /* next_comm_field */
-                               1000000,                /* ts_ns */
-                               1000,                   /* ts_ms */
-                               smp_processor_id(),     /* cpu */
-                               (u64)"Thneed",          /* my_string_field */
-                               999);                   /* my_int_field */
+                               (u64)444,               /* next_pid_field */
+                               (u64)(long)"clackers",  /* next_comm_field */
+                               (u64)1000000,           /* ts_ns */
+                               (u64)1000,              /* ts_ms */
+                               (u64)raw_smp_processor_id(), /* cpu */
+                               (u64)(long)"Thneed",    /* my_string_field */
+                               (u64)999);              /* my_int_field */
         return ret;
  }
  
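Note on the synth_event_gen_test.c changes above: the casts appear to matter because synth_event_trace() reads every variadic argument with va_arg(args, u64), so a bare int such as 444 would not reliably occupy a 64-bit argument slot, and casting a pointer straight to u64 changes its size on 32-bit builds. The following is a minimal user-space sketch of both points, not kernel code. sum_u64(), pack_str() and unpack_str() are hypothetical names invented for illustration, and the sketch assumes long is pointer-sized, as it is on Linux.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for a variadic tracer: every argument is
 * read back as a 64-bit value, so callers must pass 64-bit values.
 */
static uint64_t sum_u64(unsigned int n_vals, ...)
{
	va_list args;
	uint64_t sum = 0;
	unsigned int i;

	va_start(args, n_vals);
	for (i = 0; i < n_vals; i++)
		sum += va_arg(args, uint64_t);
	va_end(args);

	return sum;
}

/*
 * Pack a string pointer into a 64-bit value by way of a
 * pointer-sized integer (assumption: sizeof(long) == sizeof(void *)).
 */
static uint64_t pack_str(const char *s)
{
	assert(s != NULL);
	return (uint64_t)(long)s;
}

/* Reverse of pack_str(): narrow to long first, then back to a pointer. */
static const char *unpack_str(uint64_t val)
{
	return (const char *)(long)val;
}
```

A caller would write sum_u64(3, (uint64_t)444, (uint64_t)1000000, (uint64_t)1000) rather than passing bare integer literals, mirroring the explicit (u64) casts added in test_trace_synth_event().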
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index c797a15a1fc77ef672c0269d3e9a2048f168edd2..6b11e4e2150cea92de1f4c007efe4ea9c0c219e8 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -1837,6 +1837,7 @@ static __init int init_trace_selftests(void)
  
         pr_info("Running postponed tracer tests:\n");
  
+       tracing_selftest_running = true;
         list_for_each_entry_safe(p, n, &postponed_selftests, list) {
                 /* This loop can take minutes when sanitizers are enabled, so
                  * lets make sure we allow RCU processing.
@@ -1859,6 +1860,7 @@ static __init int init_trace_selftests(void)
                 list_del(&p->list);
                 kfree(p);
         }
+       tracing_selftest_running = false;
  
   out:
         mutex_unlock(&trace_types_lock);
diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index e7ce7cdac62f3016ae541e0cb4302fd31f5abf41..5f6834a2bf4119ef4a30879cac479e0cbc0e5bdf 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -821,6 +821,29 @@ static const char *synth_field_fmt(char *type)
         return fmt;
  }
  
+static void print_synth_event_num_val(struct trace_seq *s,
+                                     char *print_fmt, char *name,
+                                     int size, u64 val, char *space)
+{
+       switch (size) {
+       case 1:
+               trace_seq_printf(s, print_fmt, name, (u8)val, space);
+               break;
+
+       case 2:
+               trace_seq_printf(s, print_fmt, name, (u16)val, space);
+               break;
+
+       case 4:
+               trace_seq_printf(s, print_fmt, name, (u32)val, space);
+               break;
+
+       default:
+               trace_seq_printf(s, print_fmt, name, val, space);
+               break;
+       }
+}
+
  static enum print_line_t print_synth_event(struct trace_iterator *iter,
                                            int flags,
                                            struct trace_event *event)
@@ -859,10 +882,13 @@ static enum print_line_t print_synth_event(struct trace_iterator *iter,
                 } else {
                         struct trace_print_flags __flags[] = {
                             __def_gfpflag_names, {-1, NULL} };
+                       char *space = (i == se->n_fields - 1 ? "" : " ");
  
-                       trace_seq_printf(s, print_fmt, se->fields[i]->name,
-                                        entry->fields[n_u64],
-                                        i == se->n_fields - 1 ? "" : " ");
+                       print_synth_event_num_val(s, print_fmt,
+                                                 se->fields[i]->name,
+                                                 se->fields[i]->size,
+                                                 entry->fields[n_u64],
+                                                 space);
  
                         if (strcmp(se->fields[i]->type, "gfp_t") == 0) {
                                 trace_seq_puts(s, " (");
@@ -1798,6 +1824,62 @@ void synth_event_cmd_init(struct dynevent_cmd *cmd, char *buf, int maxlen)
  }
  EXPORT_SYMBOL_GPL(synth_event_cmd_init);
  
+static inline int
+__synth_event_trace_start(struct trace_event_file *file,
+                         struct synth_event_trace_state *trace_state)
+{
+       int entry_size, fields_size = 0;
+       int ret = 0;
+
+       memset(trace_state, '\0', sizeof(*trace_state));
+
+       /*
+        * Normal event tracing doesn't get called at all unless the
+        * ENABLED bit is set (which attaches the probe thus allowing
+        * this code to be called, etc).  Because this is called
+        * directly by the user, we don't have that but we still need
+        * to honor not logging when disabled.  For the the iterated
+        * trace case, we save the enabed state upon start and just
+        * ignore the following data calls.
+        */
+       if (!(file->flags & EVENT_FILE_FL_ENABLED) ||
+           trace_trigger_soft_disabled(file)) {
+               trace_state->disabled = true;
+               ret = -ENOENT;
+               goto out;
+       }
+
+       trace_state->event = file->event_call->data;
+
+       fields_size = trace_state->event->n_u64 * sizeof(u64);
+
+       /*
+        * Avoid ring buffer recursion detection, as this event
+        * is being performed within another event.
+        */
+       trace_state->buffer = file->tr->array_buffer.buffer;
+       ring_buffer_nest_start(trace_state->buffer);
+
+       entry_size = sizeof(*trace_state->entry) + fields_size;
+       trace_state->entry = trace_event_buffer_reserve(&trace_state->fbuffer,
+                                                       file,
+                                                       entry_size);
+       if (!trace_state->entry) {
+               ring_buffer_nest_end(trace_state->buffer);
+               ret = -EINVAL;
+       }
+out:
+       return ret;
+}
+
+static inline void
+__synth_event_trace_end(struct synth_event_trace_state *trace_state)
+{
+       trace_event_buffer_commit(&trace_state->fbuffer);
+
+       ring_buffer_nest_end(trace_state->buffer);
+}
+
  /**
   * synth_event_trace - Trace a synthetic event
   * @file: The trace_event_file representing the synthetic event
@@ -1819,71 +1901,61 @@ EXPORT_SYMBOL_GPL(synth_event_cmd_init);
   */
  int synth_event_trace(struct trace_event_file *file, unsigned int n_vals, ...)
  {
-       struct trace_event_buffer fbuffer;
-       struct synth_trace_event *entry;
-       struct trace_buffer *buffer;
-       struct synth_event *event;
+       struct synth_event_trace_state state;
         unsigned int i, n_u64;
-       int fields_size = 0;
         va_list args;
-       int ret = 0;
-
-       /*
-        * Normal event generation doesn't get called at all unless
-        * the ENABLED bit is set (which attaches the probe thus
-        * allowing this code to be called, etc).  Because this is
-        * called directly by the user, we don't have that but we
-        * still need to honor not logging when disabled.
-        */
-       if (!(file->flags & EVENT_FILE_FL_ENABLED))
-               return 0;
-
-       event = file->event_call->data;
-
-       if (n_vals != event->n_fields)
-               return -EINVAL;
-
-       if (trace_trigger_soft_disabled(file))
-               return -EINVAL;
-
-       fields_size = event->n_u64 * sizeof(u64);
+       int ret;
  
-       /*
-        * Avoid ring buffer recursion detection, as this event
-        * is being performed within another event.
-        */
-       buffer = file->tr->array_buffer.buffer;
-       ring_buffer_nest_start(buffer);
+       ret = __synth_event_trace_start(file, &state);
+       if (ret) {
+               if (ret == -ENOENT)
+                       ret = 0; /* just disabled, not really an error */
+               return ret;
+       }
  
-       entry = trace_event_buffer_reserve(&fbuffer, file,
-                                          sizeof(*entry) + fields_size);
-       if (!entry) {
+       if (n_vals != state.event->n_fields) {
                 ret = -EINVAL;
                 goto out;
         }
  
         va_start(args, n_vals);
-       for (i = 0, n_u64 = 0; i < event->n_fields; i++) {
+       for (i = 0, n_u64 = 0; i < state.event->n_fields; i++) {
                 u64 val;
  
                 val = va_arg(args, u64);
  
-               if (event->fields[i]->is_string) {
+               if (state.event->fields[i]->is_string) {
                         char *str_val = (char *)(long)val;
-                       char *str_field = (char *)&entry->fields[n_u64];
+                       char *str_field = (char *)&state.entry->fields[n_u64];
  
                         strscpy(str_field, str_val, STR_VAR_LEN_MAX);
                         n_u64 += STR_VAR_LEN_MAX / sizeof(u64);
                 } else {
-                       entry->fields[n_u64] = val;
+                       struct synth_field *field = state.event->fields[i];
+
+                       switch (field->size) {
+                       case 1:
+                               *(u8 *)&state.entry->fields[n_u64] = (u8)val;
+                               break;
+
+                       case 2:
+                               *(u16 *)&state.entry->fields[n_u64] = (u16)val;
+                               break;
+
+                       case 4:
+                               *(u32 *)&state.entry->fields[n_u64] = (u32)val;
+                               break;
+
+                       default:
+                               state.entry->fields[n_u64] = val;
+                               break;
+                       }
                         n_u64++;
                 }
         }
         va_end(args);
-
-       trace_event_buffer_commit(&fbuffer);
  out:
-       ring_buffer_nest_end(buffer);
+       __synth_event_trace_end(&state);
  
         return ret;
  }
@@ -1910,64 +1982,55 @@ EXPORT_SYMBOL_GPL(synth_event_trace);
  int synth_event_trace_array(struct trace_event_file *file, u64 *vals,
                             unsigned int n_vals)
  {
-       struct trace_event_buffer fbuffer;
-       struct synth_trace_event *entry;
-       struct trace_buffer *buffer;
-       struct synth_event *event;
+       struct synth_event_trace_state state;
         unsigned int i, n_u64;
-       int fields_size = 0;
-       int ret = 0;
-
-       /*
-        * Normal event generation doesn't get called at all unless
-        * the ENABLED bit is set (which attaches the probe thus
-        * allowing this code to be called, etc).  Because this is
-        * called directly by the user, we don't have that but we
-        * still need to honor not logging when disabled.
-        */
-       if (!(file->flags & EVENT_FILE_FL_ENABLED))
-               return 0;
-
-       event = file->event_call->data;
-
-       if (n_vals != event->n_fields)
-               return -EINVAL;
-
-       if (trace_trigger_soft_disabled(file))
-               return -EINVAL;
-
-       fields_size = event->n_u64 * sizeof(u64);
+       int ret;
  
-       /*
-        * Avoid ring buffer recursion detection, as this event
-        * is being performed within another event.
-        */
-       buffer = file->tr->array_buffer.buffer;
-       ring_buffer_nest_start(buffer);
+       ret = __synth_event_trace_start(file, &state);
+       if (ret) {
+               if (ret == -ENOENT)
+                       ret = 0; /* just disabled, not really an error */
+               return ret;
+       }
  
-       entry = trace_event_buffer_reserve(&fbuffer, file,
-                                          sizeof(*entry) + fields_size);
-       if (!entry) {
+       if (n_vals != state.event->n_fields) {
                 ret = -EINVAL;
                 goto out;
         }
  
-       for (i = 0, n_u64 = 0; i < event->n_fields; i++) {
-               if (event->fields[i]->is_string) {
+       for (i = 0, n_u64 = 0; i < state.event->n_fields; i++) {
+               if (state.event->fields[i]->is_string) {
                         char *str_val = (char *)(long)vals[i];
-                       char *str_field = (char *)&entry->fields[n_u64];
+                       char *str_field = (char *)&state.entry->fields[n_u64];
  
                         strscpy(str_field, str_val, STR_VAR_LEN_MAX);
                         n_u64 += STR_VAR_LEN_MAX / sizeof(u64);
                 } else {
-                       entry->fields[n_u64] = vals[i];
+                       struct synth_field *field = state.event->fields[i];
+                       u64 val = vals[i];
+
+                       switch (field->size) {
+                       case 1:
+                               *(u8 *)&state.entry->fields[n_u64] = (u8)val;
+                               break;
+
+                       case 2:
+                               *(u16 *)&state.entry->fields[n_u64] = (u16)val;
+                               break;
+
+                       case 4:
+                               *(u32 *)&state.entry->fields[n_u64] = (u32)val;
+                               break;
+
+                       default:
+                               state.entry->fields[n_u64] = val;
+                               break;
+                       }
                         n_u64++;
                 }
         }
-
-       trace_event_buffer_commit(&fbuffer);
  out:
-       ring_buffer_nest_end(buffer);
+       __synth_event_trace_end(&state);
  
         return ret;
  }
@@ -2004,58 +2067,15 @@ EXPORT_SYMBOL_GPL(synth_event_trace_array);
  int synth_event_trace_start(struct trace_event_file *file,
                             struct synth_event_trace_state *trace_state)
  {
-       struct synth_trace_event *entry;
-       int fields_size = 0;
-       int ret = 0;
-
-       if (!trace_state) {
-               ret = -EINVAL;
-               goto out;
-       }
-
-       memset(trace_state, '\0', sizeof(*trace_state));
-
-       /*
-        * Normal event tracing doesn't get called at all unless the
-        * ENABLED bit is set (which attaches the probe thus allowing
-        * this code to be called, etc).  Because this is called
-        * directly by the user, we don't have that but we still need
-        * to honor not logging when disabled.  For the the iterated
-        * trace case, we save the enabed state upon start and just
-        * ignore the following data calls.
-        */
-       if (!(file->flags & EVENT_FILE_FL_ENABLED)) {
-               trace_state->enabled = false;
-               goto out;
-       }
-
-       trace_state->enabled = true;
-
-       trace_state->event = file->event_call->data;
-
-       if (trace_trigger_soft_disabled(file)) {
-               ret = -EINVAL;
-               goto out;
-       }
-
-       fields_size = trace_state->event->n_u64 * sizeof(u64);
+       int ret;
  
-       /*
-        * Avoid ring buffer recursion detection, as this event
-        * is being performed within another event.
-        */
-       trace_state->buffer = file->tr->array_buffer.buffer;
-       ring_buffer_nest_start(trace_state->buffer);
+       if (!trace_state)
+               return -EINVAL;
  
-       entry = trace_event_buffer_reserve(&trace_state->fbuffer, file,
-                                          sizeof(*entry) + fields_size);
-       if (!entry) {
-               ret = -EINVAL;
-               goto out;
-       }
+       ret = __synth_event_trace_start(file, trace_state);
+       if (ret == -ENOENT)
+               ret = 0; /* just disabled, not really an error */
  
-       trace_state->entry = entry;
-out:
         return ret;
  }
  EXPORT_SYMBOL_GPL(synth_event_trace_start);
@@ -2088,7 +2108,7 @@ static int __synth_event_add_val(const char *field_name, u64 val,
                 trace_state->add_next = true;
         }
  
-       if (!trace_state->enabled)
+       if (trace_state->disabled)
                 goto out;
  
         event = trace_state->event;
@@ -2122,8 +2142,25 @@ static int __synth_event_add_val(const char *field_name, u64 val,
  
                 str_field = (char *)&entry->fields[field->offset];
                 strscpy(str_field, str_val, STR_VAR_LEN_MAX);
-       } else
-               entry->fields[field->offset] = val;
+       } else {
+               switch (field->size) {
+               case 1:
+                       *(u8 *)&trace_state->entry->fields[field->offset] = (u8)val;
+                       break;
+
+               case 2:
+                       *(u16 *)&trace_state->entry->fields[field->offset] = (u16)val;
+                       break;
+
+               case 4:
+                       *(u32 *)&trace_state->entry->fields[field->offset] = (u32)val;
+                       break;
+
+               default:
+                       trace_state->entry->fields[field->offset] = val;
+                       break;
+               }
+       }
   out:
         return ret;
  }
@@ -2223,9 +2260,7 @@ int synth_event_trace_end(struct synth_event_trace_state *trace_state)
         if (!trace_state)
                 return -EINVAL;
  
-       trace_event_buffer_commit(&trace_state->fbuffer);
-
-       ring_buffer_nest_end(trace_state->buffer);
+       __synth_event_trace_end(trace_state);
  
         return 0;
  }
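Note on the trace_events_hist.c changes above: the same switch on field->size is repeated in synth_event_trace(), synth_event_trace_array() and __synth_event_add_val(), and each one writes only the field's declared width into its u64 slot, presumably so that the bytes in the buffer match what the print path reads back with the (u8)/(u16)/(u32) casts. The following is a minimal user-space sketch of that idea, not the kernel implementation. store_sized(), load_sized() and roundtrip_sized() are hypothetical helpers, and memcpy() is used here in place of the kernel's pointer casts.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store val into a 64-bit slot using only the field's declared width. */
static void store_sized(uint64_t *slot, int size, uint64_t val)
{
	uint8_t v8 = (uint8_t)val;
	uint16_t v16 = (uint16_t)val;
	uint32_t v32 = (uint32_t)val;

	assert(slot != NULL);

	switch (size) {
	case 1:
		memcpy(slot, &v8, sizeof(v8));
		break;
	case 2:
		memcpy(slot, &v16, sizeof(v16));
		break;
	case 4:
		memcpy(slot, &v32, sizeof(v32));
		break;
	default:
		*slot = val;	/* any other size is treated as a full u64 */
		break;
	}
}

/* Read the value back using the same width it was stored with. */
static uint64_t load_sized(const uint64_t *slot, int size)
{
	uint8_t v8;
	uint16_t v16;
	uint32_t v32;

	assert(slot != NULL);

	switch (size) {
	case 1:
		memcpy(&v8, slot, sizeof(v8));
		return v8;
	case 2:
		memcpy(&v16, slot, sizeof(v16));
		return v16;
	case 4:
		memcpy(&v32, slot, sizeof(v32));
		return v32;
	default:
		return *slot;
	}
}

/* Convenience wrapper: store into a zeroed slot, then load it back. */
static uint64_t roundtrip_sized(int size, uint64_t val)
{
	uint64_t slot = 0;

	store_sized(&slot, size, val);
	return load_sized(&slot, size);
}
```

Because the store and the load both use the leading bytes of the slot, the round trip gives the truncated value regardless of byte order. Factoring the switch into one helper like this is only an illustration; the patch itself keeps the switch inline at each call site.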
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index d8264ebb95814a53cdfd1031b03b8e869a2c5f00..362cca52f5de2dba861376c0e5c286d66c5ceb4e 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1012,7 +1012,7 @@ int __kprobe_event_add_fields(struct dynevent_cmd *cmd, ...)
  {
         struct dynevent_arg arg;
         va_list args;
-       int ret;
+       int ret = 0;
  
         if (cmd->type != DYNEVENT_TYPE_KPROBE)
                 return -EINVAL;
diff --git a/lib/Kconfig b/lib/Kconfig
index 0cf875fd627c062149c1ec4e34e181d0b34357bf..bc7e56370129257a7c459cccc68741522ed94528 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -573,9 +573,6 @@ config DIMLIB
  config LIBFDT
         bool
  
-config LIBXBC
-       bool
-
  config OID_REGISTRY
         tristate
         help
diff --git a/lib/Makefile b/lib/Makefile
index 5d64890d6b6a25e6d9d7c303608c8a8828ebb225..611872c0692693f8e6c5048f1e5902ecdf429511 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -230,7 +230,7 @@ $(foreach file, $(libfdt_files), \
         $(eval CFLAGS_$(file) = -I $(srctree)/scripts/dtc/libfdt))
  lib-$(CONFIG_LIBFDT) += $(libfdt_files)
  
-lib-$(CONFIG_LIBXBC) += bootconfig.o
+lib-$(CONFIG_BOOT_CONFIG) += bootconfig.o
  
  obj-$(CONFIG_RBTREE_TEST) += rbtree_test.o
  obj-$(CONFIG_INTERVAL_TREE_TEST) += interval_tree_test.o
diff --git a/lib/bootconfig.c b/lib/bootconfig.c
index afb2e767e6fe83b7ce42f29aba58c935c6b709cb..ec3ce7fd299f6f36ca3a07897b45bf876e0c2db2 100644
--- a/lib/bootconfig.c
+++ b/lib/bootconfig.c
@@ -6,12 +6,13 @@
  
  #define pr_fmt(fmt)    "bootconfig: " fmt
  
+#include <linux/bootconfig.h>
  #include <linux/bug.h>
  #include <linux/ctype.h>
  #include <linux/errno.h>
  #include <linux/kernel.h>
+#include <linux/memblock.h>
  #include <linux/printk.h>
-#include <linux/bootconfig.h>
  #include <linux/string.h>
  
  /*
@@ -23,7 +24,7 @@
   * node (for array).
   */
  
-static struct xbc_node xbc_nodes[XBC_NODE_MAX] __initdata;
+static struct xbc_node *xbc_nodes __initdata;
  static int xbc_node_num __initdata;
  static char *xbc_data __initdata;
  static size_t xbc_data_size __initdata;
@@ -532,7 +533,7 @@ struct xbc_node *find_match_node(struct xbc_node *node, char *k)
  
  static int __init __xbc_add_key(char *k)
  {
-       struct xbc_node *node;
+       struct xbc_node *node, *child;
  
         if (!xbc_valid_keyword(k))
                 return xbc_parse_error("Invalid keyword", k);
@@ -542,8 +543,12 @@ static int __init __xbc_add_key(char *k)
  
         if (!last_parent)       /* the first level */
                 node = find_match_node(xbc_nodes, k);
-       else
-               node = find_match_node(xbc_node_get_child(last_parent), k);
+       else {
+               child = xbc_node_get_child(last_parent);
+               if (child && xbc_node_is_value(child))
+                       return xbc_parse_error("Subkey is mixed with value", k);
+               node = find_match_node(child, k);
+       }
  
         if (node)
                 last_parent = node;
@@ -573,10 +578,10 @@ static int __init __xbc_parse_keys(char *k)
         return __xbc_add_key(k);
  }
  
-static int __init xbc_parse_kv(char **k, char *v)
+static int __init xbc_parse_kv(char **k, char *v, int op)
  {
         struct xbc_node *prev_parent = last_parent;
-       struct xbc_node *node;
+       struct xbc_node *child;
         char *next;
         int c, ret;
  
@@ -584,12 +589,19 @@ static int __init xbc_parse_kv(char **k, char *v)
         if (ret)
                 return ret;
  
+       child = xbc_node_get_child(last_parent);
+       if (child) {
+               if (xbc_node_is_key(child))
+                       return xbc_parse_error("Value is mixed with subkey", v);
+               else if (op == '=')
+                       return xbc_parse_error("Value is redefined", v);
+       }
+
         c = __xbc_parse_value(&v, &next);
         if (c < 0)
                 return c;
  
-       node = xbc_add_sibling(v, XBC_VALUE);
-       if (!node)
+       if (!xbc_add_sibling(v, XBC_VALUE))
                 return -ENOMEM;
  
         if (c == ',') { /* Array */
@@ -719,7 +731,8 @@ void __init xbc_destroy_all(void)
         xbc_data = NULL;
         xbc_data_size = 0;
         xbc_node_num = 0;
-       memset(xbc_nodes, 0, sizeof(xbc_nodes));
+       memblock_free(__pa(xbc_nodes), sizeof(struct xbc_node) * XBC_NODE_MAX);
+       xbc_nodes = NULL;
  }
  
  /**
@@ -748,13 +761,20 @@ int __init xbc_init(char *buf)
                 return -ERANGE;
         }
  
+       xbc_nodes = memblock_alloc(sizeof(struct xbc_node) * XBC_NODE_MAX,
+                                  SMP_CACHE_BYTES);
+       if (!xbc_nodes) {
+               pr_err("Failed to allocate memory for bootconfig nodes.\n");
+               return -ENOMEM;
+       }
+       memset(xbc_nodes, 0, sizeof(struct xbc_node) * XBC_NODE_MAX);
         xbc_data = buf;
         xbc_data_size = ret + 1;
         last_parent = NULL;
  
         p = buf;
         do {
-               q = strpbrk(p, "{}=;\n#");
+               q = strpbrk(p, "{}=+;\n#");
                 if (!q) {
                         p = skip_spaces(p);
                         if (*p != '\0')
@@ -765,8 +785,15 @@ int __init xbc_init(char *buf)
                 c = *q;
                 *q++ = '\0';
                 switch (c) {
+               case '+':
+                       if (*q++ != '=') {
+                               ret = xbc_parse_error("Wrong '+' operator",
+                                                       q - 2);
+                               break;
+                       }
+                       /* Fall through */
                 case '=':
-                       ret = xbc_parse_kv(&p, q);
+                       ret = xbc_parse_kv(&p, q, c);
                         break;
                 case '{':
                         ret = xbc_open_brace(&p, q);
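
The '+' case added to xbc_init() above accepts "+=" as an append operator and rejects a bare '+'. A minimal userspace sketch of that operator check follows; split_kv() is a hypothetical helper written for illustration and is not part of lib/bootconfig.c, which also handles braces, comments and arrays.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (illustration only): split "key=value" or
 * "key+=value" in place.
 *
 * Returns '=' for assignment, '+' for append, or -1 when no operator is
 * found or when '+' is not immediately followed by '='.
 * On success, *value points just past the operator and the key is
 * NUL-terminated inside the caller's buffer.
 */
static int split_kv(char *line, char **value)
{
	char *q = strpbrk(line, "=+");
	int op;

	if (!q)
		return -1;

	op = *q;
	*q++ = '\0';			/* terminate the key */

	if (op == '+') {
		if (*q++ != '=')
			return -1;	/* a bare '+' is a parse error */
	}

	*value = q;
	return op;
}
```

As in the patch, the operator character is passed on unchanged, so a caller can refuse to redefine an existing value for '=' while still allowing '+' to append to it.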
diff --git a/lib/crypto/chacha20poly1305.c b/lib/crypto/chacha20poly1305.c
index 6d83cafebc69ce06b3b37b96f665f3537a0748b5..ad0699ce702f954bbae3aa899b7acda21a289fd2 100644
--- a/lib/crypto/chacha20poly1305.c
+++ b/lib/crypto/chacha20poly1305.c
@@ -235,6 +235,9 @@ bool chacha20poly1305_crypt_sg_inplace(struct scatterlist *src,
                 __le64 lens[2];
         } b __aligned(16);
  
+       if (WARN_ON(src_len > INT_MAX))
+               return false;
+
         chacha_load_key(b.k, key);
  
         b.iv[0] = 0;
diff --git a/lib/stackdepot.c b/lib/stackdepot.c
index ed717dd08ff37216a7c90a531f23b2eac0e2be1a..81c69c08d1d157189d253a0cbe2a2bbd82cff421 100644
--- a/lib/stackdepot.c
+++ b/lib/stackdepot.c
@@ -83,15 +83,19 @@ static bool init_stack_slab(void **prealloc)
                 return true;
         if (stack_slabs[depot_index] == NULL) {
                 stack_slabs[depot_index] = *prealloc;
+               *prealloc = NULL;
         } else {
-               stack_slabs[depot_index + 1] = *prealloc;
+               /* If this is the last depot slab, do not touch the next one. */
+               if (depot_index + 1 < STACK_ALLOC_MAX_SLABS) {
+                       stack_slabs[depot_index + 1] = *prealloc;
+                       *prealloc = NULL;
+               }
                 /*
                  * This smp_store_release pairs with smp_load_acquire() from
                  * |next_slab_inited| above and in stack_depot_save().
                  */
                 smp_store_release(&next_slab_inited, 1);
         }
-       *prealloc = NULL;
         return true;
  }
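
The init_stack_slab() change above does two things: it stops writing one slot past the end of the slab array, and it clears *prealloc only when the preallocated buffer was actually stored. A single-threaded sketch of that logic is shown here; MAX_SLABS, slabs[] and init_slab() are stand-in names chosen for illustration, and the acquire/release ordering of the real code is deliberately left out.

```c
#include <stddef.h>

#define MAX_SLABS 4			/* stand-in for STACK_ALLOC_MAX_SLABS */

static void *slabs[MAX_SLABS];		/* stand-in for stack_slabs[] */
static int depot_index;
static int next_slab_inited;

/* Consume *prealloc only if it is really stored in the array. */
static void init_slab(void **prealloc)
{
	if (next_slab_inited)
		return;

	if (slabs[depot_index] == NULL) {
		slabs[depot_index] = *prealloc;
		*prealloc = NULL;
	} else {
		/* On the last slab there is no next slot to fill. */
		if (depot_index + 1 < MAX_SLABS) {
			slabs[depot_index + 1] = *prealloc;
			*prealloc = NULL;
		}
		next_slab_inited = 1;
	}
}
```

When the buffer is not consumed, the caller still owns it and remains responsible for freeing it.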
  
diff --git a/lib/string.c b/lib/string.c
index f607b967d9785b206c135f466ca114367cb4727d..6012c385fb314d810bd8a2ff4f18b9e8a3a517db 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -699,6 +699,14 @@ EXPORT_SYMBOL(sysfs_streq);
   * @n:         number of strings in the array or -1 for NULL terminated arrays
   * @string:    string to match with
   *
+ * This routine will look for a string in an array of strings up to the
+ * n-th element in the array or until the first NULL element.
+ *
+ * Historically the value of -1 for @n, was used to search in arrays that
+ * are NULL terminated. However, the function does not make a distinction
+ * when finishing the search: either @n elements have been compared OR
+ * the first NULL element was found.
+ *
   * Return:
   * index of a @string in the @array if matches, or %-EINVAL otherwise.
   */
@@ -727,6 +735,14 @@ EXPORT_SYMBOL(match_string);
   *
   * Returns index of @str in the @array or -EINVAL, just like match_string().
   * Uses sysfs_streq instead of strcmp for matching.
+ *
+ * This routine will look for a string in an array of strings up to the
+ * n-th element in the array or until the first NULL element.
+ *
+ * Historically the value of -1 for @n, was used to search in arrays that
+ * are NULL terminated. However, the function does not make a distinction
+ * when finishing the search: either @n elements have been compared OR
+ * the first NULL element was found.
   */
  int __sysfs_match_string(const char * const *array, size_t n, const char *str)
  {
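
The kernel-doc text added above says the search stops after @n elements or at the first NULL entry, whichever comes first. A userspace sketch of that behaviour is given below; match_string_sketch() is an illustrative name, not the kernel function, and it uses plain strcmp() rather than sysfs_streq().

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Illustrative userspace sketch: return the index of @string in @array,
 * or -EINVAL. The loop ends after @n entries or at the first NULL entry.
 * Passing (size_t)-1 for @n therefore searches a NULL-terminated array.
 */
static int match_string_sketch(const char * const *array, size_t n,
			       const char *string)
{
	size_t index;

	for (index = 0; index < n; index++) {
		const char *item = array[index];

		if (!item)
			break;
		if (!strcmp(item, string))
			return (int)index;
	}

	return -EINVAL;
}
```

One consequence is that entries placed after a NULL hole are never reached, even when @n covers them.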
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 6f6dc8712e392e5e958868c6bcc191cee3d1e888..d09776cd6e1041d896aa25e726d55f93e46581ed 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -409,8 +409,10 @@ int memcg_expand_shrinker_maps(int new_id)
                 if (mem_cgroup_is_root(memcg))
                         continue;
                 ret = memcg_expand_one_shrinker_map(memcg, size, old_size);
-               if (ret)
+               if (ret) {
+                       mem_cgroup_iter_break(NULL, memcg);
                         goto unlock;
+               }
         }
  unlock:
         if (!ret)
diff --git a/mm/mmap.c b/mm/mmap.c
index 6756b8bb00334c023c73d1a8112785d205e5c927..d681a20eb4ea9fc604caeedbc2b396083360462d 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -195,8 +195,6 @@ SYSCALL_DEFINE1(brk, unsigned long, brk)
         bool downgraded = false;
         LIST_HEAD(uf);
  
-       brk = untagged_addr(brk);
-
         if (down_write_killable(&mm->mmap_sem))
                 return -EINTR;
  
@@ -1557,8 +1555,6 @@ unsigned long ksys_mmap_pgoff(unsigned long addr, unsigned long len,
         struct file *file = NULL;
         unsigned long retval;
  
-       addr = untagged_addr(addr);
-
         if (!(flags & MAP_ANONYMOUS)) {
                 audit_mmap_fd(fd, flags);
                 file = fget(fd);
diff --git a/mm/mremap.c b/mm/mremap.c
index 122938dcec15c9a8359d62ceec5801bd74177b3f..af363063ea23bc45430bf5658ca526dd7e053cb7 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -607,7 +607,6 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len,
         LIST_HEAD(uf_unmap);
  
         addr = untagged_addr(addr);
-       new_addr = untagged_addr(new_addr);
  
         if (flags & ~(MREMAP_FIXED | MREMAP_MAYMOVE))
                 return ret;
diff --git a/mm/shmem.c b/mm/shmem.c
index c8f7540ef048ebc5b425ed640999acf361819cd7..aad3ba74b0e9d121bd774e795460d5f2f73ac119 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3386,8 +3386,6 @@ static const struct constant_table shmem_param_enums_huge[] = {
         {"always",      SHMEM_HUGE_ALWAYS },
         {"within_size", SHMEM_HUGE_WITHIN_SIZE },
         {"advise",      SHMEM_HUGE_ADVISE },
-       {"deny",        SHMEM_HUGE_DENY },
-       {"force",       SHMEM_HUGE_FORCE },
         {}
  };
  
diff --git a/mm/sparse.c b/mm/sparse.c
index c184b69460b7bd53198a5e31cfd56bf39b0016e5..596b2a45b100507e95d7c7694698c36283113400 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -876,7 +876,7 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn,
          * Poison uninitialized struct pages in order to catch invalid flags
          * combinations.
          */
-       page_init_poison(pfn_to_page(start_pfn), sizeof(struct page) * nr_pages);
+       page_init_poison(memmap, sizeof(struct page) * nr_pages);
  
         ms = __nr_to_section(section_nr);
         set_section_nid(section_nr, nid);
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 2c33ff456ed5e228510e02becb6e722f6e6603f0..b2a2e45c9a36f15dee49301aabb9343fe76c36cd 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3157,7 +3157,7 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
         mapping = swap_file->f_mapping;
         inode = mapping->host;
  
-       /* If S_ISREG(inode->i_mode) will do inode_lock(inode); */
+       /* will take i_rwsem; */
         error = claim_swapfile(p, inode);
         if (unlikely(error))
                 goto bad_swap;
diff --git a/mm/vmscan.c b/mm/vmscan.c
index c05eb9efec07f369296756dbd035ed4e59961b98..876370565455e275828d8cde8e8f9cd1819fd759 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2415,10 +2415,13 @@ out:
                         /*
                          * Scan types proportional to swappiness and
                          * their relative recent reclaim efficiency.
-                        * Make sure we don't miss the last page
-                        * because of a round-off error.
+                        * Make sure we don't miss the last page on
+                        * the offlined memory cgroups because of a
+                        * round-off error.
                          */
-                       scan = DIV64_U64_ROUND_UP(scan * fraction[file],
+                       scan = mem_cgroup_online(memcg) ?
+                              div64_u64(scan * fraction[file], denominator) :
+                              DIV64_U64_ROUND_UP(scan * fraction[file],
                                                   denominator);
                         break;
                 case SCAN_FILE:
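
The mm/vmscan.c hunk above keeps the rounded-up division only for offlined memory cgroups and uses a truncating division for online ones. The difference only shows when the product is not a multiple of the denominator, as this small sketch illustrates; scan_target() and its parameters are illustrative names, and overflow of the intermediate product is ignored here.

```c
#include <stdint.h>

/* Truncating division, the behaviour of a plain 64-bit divide. */
static uint64_t div_floor(uint64_t n, uint64_t d)
{
	return n / d;
}

/* Rounded-up division: any non-zero remainder adds one to the result. */
static uint64_t div_round_up(uint64_t n, uint64_t d)
{
	return (n + d - 1) / d;
}

/*
 * Illustrative sketch: an online group gets the truncated share, an
 * offlined group gets the rounded-up share so its last page is not missed.
 */
static uint64_t scan_target(uint64_t scan, uint64_t fraction,
			    uint64_t denominator, int online)
{
	uint64_t product = scan * fraction;

	return online ? div_floor(product, denominator)
		      : div_round_up(product, denominator);
}
```

With one page left and a fraction of 1/3, the online case yields 0 pages to scan while the offlined case yields 1.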
diff --git a/net/Kconfig b/net/Kconfig
index b0937a700f018323e5b1e8f0421ee42c825a02d7..2eeb0e55f7c9342dc45f9eb99d438988e1a625dc 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -189,7 +189,6 @@ config BRIDGE_NETFILTER
         depends on NETFILTER_ADVANCED
         select NETFILTER_FAMILY_BRIDGE
         select SKB_EXTENSIONS
-       default m
         ---help---
           Enabling this option will let arptables resp. iptables see bridged
           ARP resp. IP traffic. If you want a bridging firewall, you probably
diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index dc3d2c1dd9d54ca0a2a9a888b53996bb844b03f9..0e3dbc5f3c34f83203ffafcb944a89fed8043b25 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -34,7 +34,6 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev)
         const struct nf_br_ops *nf_ops;
         u8 state = BR_STATE_FORWARDING;
         const unsigned char *dest;
-       struct ethhdr *eth;
         u16 vid = 0;
  
         rcu_read_lock();
@@ -54,15 +53,14 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev)
         BR_INPUT_SKB_CB(skb)->frag_max_size = 0;
  
         skb_reset_mac_header(skb);
-       eth = eth_hdr(skb);
         skb_pull(skb, ETH_HLEN);
  
         if (!br_allowed_ingress(br, br_vlan_group_rcu(br), skb, &vid, &state))
                 goto out;
  
         if (IS_ENABLED(CONFIG_INET) &&
-           (eth->h_proto == htons(ETH_P_ARP) ||
-            eth->h_proto == htons(ETH_P_RARP)) &&
+           (eth_hdr(skb)->h_proto == htons(ETH_P_ARP) ||
+            eth_hdr(skb)->h_proto == htons(ETH_P_RARP)) &&
             br_opt_get(br, BROPT_NEIGH_SUPPRESS_ENABLED)) {
                 br_do_proxy_suppress_arp(skb, br, vid, NULL);
         } else if (IS_ENABLED(CONFIG_IPV6) &&
diff --git a/net/bridge/br_stp.c b/net/bridge/br_stp.c
index 6856a6d9282b830fce00cce256d671c17c08638f..1f14b84553457053479d2019f2ec9d8e6e2c41b1 100644
--- a/net/bridge/br_stp.c
+++ b/net/bridge/br_stp.c
@@ -63,7 +63,8 @@ struct net_bridge_port *br_get_port(struct net_bridge *br, u16 port_no)
  {
         struct net_bridge_port *p;
  
-       list_for_each_entry_rcu(p, &br->port_list, list) {
+       list_for_each_entry_rcu(p, &br->port_list, list,
+                               lockdep_is_held(&br->lock)) {
                 if (p->port_no == port_no)
                         return p;
         }
diff --git a/net/core/dev.c b/net/core/dev.c
index a69e8bd7ed74f1c8c34eab5ffa792e4096761072..c6c985fe7b1bcf784cedde2b2a86e26356471bee 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -146,7 +146,6 @@
  #include "net-sysfs.h"
  
  #define MAX_GRO_SKBS 8
-#define MAX_NEST_DEV 8
  
  /* This should be increased if a protocol with a bigger head is added. */
  #define GRO_MAX_HEAD (MAX_HEADER + 128)
@@ -331,6 +330,12 @@ int netdev_name_node_alt_destroy(struct net_device *dev, const char *name)
         name_node = netdev_name_node_lookup(net, name);
         if (!name_node)
                 return -ENOENT;
+       /* lookup might have found our primary name or a name belonging
+        * to another device.
+        */
+       if (name_node == dev->name_node || name_node->dev != dev)
+               return -EINVAL;
+
         __netdev_name_node_alt_destroy(name_node);
  
         return 0;
@@ -3071,6 +3076,8 @@ static u16 skb_tx_hash(const struct net_device *dev,
  
         if (skb_rx_queue_recorded(skb)) {
                 hash = skb_get_rx_queue(skb);
+               if (hash >= qoffset)
+                       hash -= qoffset;
                 while (unlikely(hash >= qcount))
                         hash -= qcount;
                 return hash + qoffset;
@@ -3657,26 +3664,8 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
         qdisc_calculate_pkt_len(skb, q);
  
         if (q->flags & TCQ_F_NOLOCK) {
-               if ((q->flags & TCQ_F_CAN_BYPASS) && READ_ONCE(q->empty) &&
-                   qdisc_run_begin(q)) {
-                       if (unlikely(test_bit(__QDISC_STATE_DEACTIVATED,
-                                             &q->state))) {
-                               __qdisc_drop(skb, &to_free);
-                               rc = NET_XMIT_DROP;
-                               goto end_run;
-                       }
-                       qdisc_bstats_cpu_update(q, skb);
-
-                       rc = NET_XMIT_SUCCESS;
-                       if (sch_direct_xmit(skb, q, dev, txq, NULL, true))
-                               __qdisc_run(q);
-
-end_run:
-                       qdisc_run_end(q);
-               } else {
-                       rc = q->enqueue(skb, q, &to_free) & NET_XMIT_MASK;
-                       qdisc_run(q);
-               }
+               rc = q->enqueue(skb, q, &to_free) & NET_XMIT_MASK;
+               qdisc_run(q);
  
                 if (unlikely(to_free))
                         kfree_skb_list(to_free);
@@ -4527,14 +4516,14 @@ static u32 netif_receive_generic_xdp(struct sk_buff *skb,
         /* Reinjected packets coming from act_mirred or similar should
          * not get XDP generic processing.
          */
-       if (skb_cloned(skb) || skb_is_tc_redirected(skb))
+       if (skb_is_tc_redirected(skb))
                 return XDP_PASS;
  
         /* XDP packets must be linear and must have sufficient headroom
          * of XDP_PACKET_HEADROOM bytes. This is the guarantee that also
          * native XDP provides, thus we need to do it here as well.
          */
-       if (skb_is_nonlinear(skb) ||
+       if (skb_cloned(skb) || skb_is_nonlinear(skb) ||
             skb_headroom(skb) < XDP_PACKET_HEADROOM) {
                 int hroom = XDP_PACKET_HEADROOM - skb_headroom(skb);
                 int troom = skb->tail + skb->data_len - skb->end;
@@ -7201,8 +7190,8 @@ static int __netdev_walk_all_lower_dev(struct net_device *dev,
         return 0;
  }
  
-static struct net_device *netdev_next_lower_dev_rcu(struct net_device *dev,
-                                                   struct list_head **iter)
+struct net_device *netdev_next_lower_dev_rcu(struct net_device *dev,
+                                            struct list_head **iter)
  {
         struct netdev_adjacent *lower;
  
@@ -7214,6 +7203,7 @@ static struct net_device *netdev_next_lower_dev_rcu(struct net_device *dev,
  
         return lower->dev;
  }
+EXPORT_SYMBOL(netdev_next_lower_dev_rcu);
  
  static u8 __netdev_upper_depth(struct net_device *dev)
  {
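
The skb_tx_hash() hunk earlier in this file's diff subtracts qoffset from the recorded RX queue before reducing it modulo qcount. A standalone sketch of that arithmetic is shown below; map_rx_queue() is an illustrative name, and it assumes qcount is non-zero, as the kernel loop does.

```c
#include <stdint.h>

/*
 * Illustrative sketch: map a recorded RX queue number into the TX queue
 * range [qoffset, qoffset + qcount). Assumes qcount > 0.
 */
static uint16_t map_rx_queue(uint16_t rxq, uint16_t qoffset, uint16_t qcount)
{
	uint16_t hash = rxq;

	/* Make the value relative to the start of the range first. */
	if (hash >= qoffset)
		hash -= qoffset;

	/* Then reduce it into the range by repeated subtraction. */
	while (hash >= qcount)
		hash -= qcount;

	return hash + qoffset;
}
```

With qoffset = 4 and qcount = 3, RX queue 5 maps to TX queue 5, so a queue that already lies inside the range maps to itself.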
diff --git a/net/core/devlink.c b/net/core/devlink.c
index 549ee56b7a21b72a9d721c66f04c667db12b822a..5e220809844c81c539d5e7506bc4095bd9a48ff4 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -2103,11 +2103,11 @@ err_action_values_put:
  
  static struct devlink_dpipe_table *
  devlink_dpipe_table_find(struct list_head *dpipe_tables,
-                        const char *table_name)
+                        const char *table_name, struct devlink *devlink)
  {
         struct devlink_dpipe_table *table;
-
-       list_for_each_entry_rcu(table, dpipe_tables, list) {
+       list_for_each_entry_rcu(table, dpipe_tables, list,
+                               lockdep_is_held(&devlink->lock)) {
                 if (!strcmp(table->name, table_name))
                         return table;
         }
@@ -2226,7 +2226,7 @@ static int devlink_nl_cmd_dpipe_entries_get(struct sk_buff *skb,
  
         table_name = nla_data(info->attrs[DEVLINK_ATTR_DPIPE_TABLE_NAME]);
         table = devlink_dpipe_table_find(&devlink->dpipe_table_list,
-                                        table_name);
+                                        table_name, devlink);
         if (!table)
                 return -EINVAL;
  
@@ -2382,7 +2382,7 @@ static int devlink_dpipe_table_counters_set(struct devlink *devlink,
         struct devlink_dpipe_table *table;
  
         table = devlink_dpipe_table_find(&devlink->dpipe_table_list,
-                                        table_name);
+                                        table_name, devlink);
         if (!table)
                 return -EINVAL;
  
@@ -6854,7 +6854,7 @@ bool devlink_dpipe_table_counter_enabled(struct devlink *devlink,
  
         rcu_read_lock();
         table = devlink_dpipe_table_find(&devlink->dpipe_table_list,
-                                        table_name);
+                                        table_name, devlink);
         enabled = false;
         if (table)
                 enabled = table->counters_enabled;
@@ -6878,26 +6878,34 @@ int devlink_dpipe_table_register(struct devlink *devlink,
                                  void *priv, bool counter_control_extern)
  {
         struct devlink_dpipe_table *table;
-
-       if (devlink_dpipe_table_find(&devlink->dpipe_table_list, table_name))
-               return -EEXIST;
+       int err = 0;
  
         if (WARN_ON(!table_ops->size_get))
                 return -EINVAL;
  
+       mutex_lock(&devlink->lock);
+
+       if (devlink_dpipe_table_find(&devlink->dpipe_table_list, table_name,
+                                    devlink)) {
+               err = -EEXIST;
+               goto unlock;
+       }
+
         table = kzalloc(sizeof(*table), GFP_KERNEL);
-       if (!table)
-               return -ENOMEM;
+       if (!table) {
+               err = -ENOMEM;
+               goto unlock;
+       }
  
         table->name = table_name;
         table->table_ops = table_ops;
         table->priv = priv;
         table->counter_control_extern = counter_control_extern;
  
-       mutex_lock(&devlink->lock);
         list_add_tail_rcu(&table->list, &devlink->dpipe_table_list);
+unlock:
         mutex_unlock(&devlink->lock);
-       return 0;
+       return err;
  }
  EXPORT_SYMBOL_GPL(devlink_dpipe_table_register);
  
@@ -6914,7 +6922,7 @@ void devlink_dpipe_table_unregister(struct devlink *devlink,
  
         mutex_lock(&devlink->lock);
         table = devlink_dpipe_table_find(&devlink->dpipe_table_list,
-                                        table_name);
+                                        table_name, devlink);
         if (!table)
                 goto unlock;
         list_del_rcu(&table->list);
@@ -7071,7 +7079,7 @@ int devlink_dpipe_table_resource_set(struct devlink *devlink,
  
         mutex_lock(&devlink->lock);
         table = devlink_dpipe_table_find(&devlink->dpipe_table_list,
-                                        table_name);
+                                        table_name, devlink);
         if (!table) {
                 err = -EINVAL;
                 goto out;
diff --git a/net/core/fib_rules.c b/net/core/fib_rules.c
index 3e7e15278c46847e9d6d25ad960d015fbaa8b6fa..bd7eba9066f8d6b39d04be38611535f238bf5cd0 100644
--- a/net/core/fib_rules.c
+++ b/net/core/fib_rules.c
@@ -974,7 +974,7 @@ static int fib_nl_fill_rule(struct sk_buff *skb, struct fib_rule *rule,
  
         frh = nlmsg_data(nlh);
         frh->family = ops->family;
-       frh->table = rule->table;
+       frh->table = rule->table < 256 ? rule->table : RT_TABLE_COMPAT;
         if (nla_put_u32(skb, FRA_TABLE, rule->table))
                 goto nla_put_failure;
         if (nla_put_u32(skb, FRA_SUPPRESS_PREFIXLEN, rule->suppress_prefixlen))
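
The fib_nl_fill_rule() change above avoids truncating a 32-bit table ID into the 8-bit header field: IDs that do not fit are reported as RT_TABLE_COMPAT there, while the full value still travels in the FRA_TABLE attribute. A small sketch of the header-field selection follows; the constant 252 mirrors RT_TABLE_COMPAT from the uapi rtnetlink header as an assumption of this sketch, and legacy_table_field() is an illustrative name.

```c
#include <stdint.h>

/* Assumed to mirror RT_TABLE_COMPAT in include/uapi/linux/rtnetlink.h. */
#define RT_TABLE_COMPAT_SKETCH 252

/*
 * Illustrative sketch: choose the value for the 8-bit table field of the
 * rule header. IDs that fit are stored directly; larger IDs are replaced
 * by the compat marker instead of being silently truncated.
 */
static uint8_t legacy_table_field(uint32_t table)
{
	return table < 256 ? (uint8_t)table : RT_TABLE_COMPAT_SKETCH;
}
```

Without the check, table 256 would be stored as 0 and table 510 as 254, the main table's ID, which is misleading for tools that only read the header.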
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 9b7cbe35df37d30bdfdd50819a8c30dee66e6e7d..10d2b255df5eccade8feb8f73d8fc657b604e9a4 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -99,8 +99,7 @@ EXPORT_SYMBOL(page_pool_create);
  static void __page_pool_return_page(struct page_pool *pool, struct page *page);
  
  noinline
-static struct page *page_pool_refill_alloc_cache(struct page_pool *pool,
-                                                bool refill)
+static struct page *page_pool_refill_alloc_cache(struct page_pool *pool)
  {
         struct ptr_ring *r = &pool->ring;
         struct page *page;
@@ -141,8 +140,7 @@ static struct page *page_pool_refill_alloc_cache(struct page_pool *pool,
                         page = NULL;
                         break;
                 }
-       } while (pool->alloc.count < PP_ALLOC_CACHE_REFILL &&
-                refill);
+       } while (pool->alloc.count < PP_ALLOC_CACHE_REFILL);
  
         /* Return last page */
         if (likely(pool->alloc.count > 0))
@@ -155,20 +153,16 @@ static struct page *page_pool_refill_alloc_cache(struct page_pool *pool,
  /* fast path */
  static struct page *__page_pool_get_cached(struct page_pool *pool)
  {
-       bool refill = false;
         struct page *page;
  
-       /* Test for safe-context, caller should provide this guarantee */
-       if (likely(in_serving_softirq())) {
-               if (likely(pool->alloc.count)) {
-                       /* Fast-path */
-                       page = pool->alloc.cache[--pool->alloc.count];
-                       return page;
-               }
-               refill = true;
+       /* Caller MUST guarantee safe non-concurrent access, e.g. softirq */
+       if (likely(pool->alloc.count)) {
+               /* Fast-path */
+               page = pool->alloc.cache[--pool->alloc.count];
+       } else {
+               page = page_pool_refill_alloc_cache(pool);
         }
  
-       page = page_pool_refill_alloc_cache(pool, refill);
         return page;
  }
  
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 09c44bf2e1d28842d77b4ed442ef2c051a25ad21..e1152f4ffe33efb0a69f17a1f5940baa04942e5b 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -3504,27 +3504,25 @@ static int rtnl_alt_ifname(int cmd, struct net_device *dev, struct nlattr *attr,
         if (err)
                 return err;
  
-       alt_ifname = nla_data(attr);
+       alt_ifname = nla_strdup(attr, GFP_KERNEL);
+       if (!alt_ifname)
+               return -ENOMEM;
+
         if (cmd == RTM_NEWLINKPROP) {
-               alt_ifname = kstrdup(alt_ifname, GFP_KERNEL);
-               if (!alt_ifname)
-                       return -ENOMEM;
                 err = netdev_name_node_alt_create(dev, alt_ifname);
-               if (err) {
-                       kfree(alt_ifname);
-                       return err;
-               }
+               if (!err)
+                       alt_ifname = NULL;
         } else if (cmd == RTM_DELLINKPROP) {
                 err = netdev_name_node_alt_destroy(dev, alt_ifname);
-               if (err)
-                       return err;
         } else {
-               WARN_ON(1);
-               return 0;
+               WARN_ON_ONCE(1);
+               err = -EINVAL;
         }
  
-       *changed = true;
-       return 0;
+       kfree(alt_ifname);
+       if (!err)
+               *changed = true;
+       return err;
  }
  
  static int rtnl_linkprop(int cmd, struct sk_buff *skb, struct nlmsghdr *nlh,
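
The rtnl_alt_ifname() rewrite above copies the attribute once, hands the copy over on a successful create by setting the local pointer to NULL, and frees whatever is still owned on every exit path. A userspace sketch of that ownership pattern is given here; the one-slot registry, the CMD_* values and all function names are hypothetical stand-ins, not kernel APIs.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

enum { CMD_ADD, CMD_DEL };		/* hypothetical commands */

static char *stored_name;		/* hypothetical one-slot registry */

/* Takes ownership of @name on success only. */
static int registry_add(char *name)
{
	if (stored_name)
		return -EEXIST;
	stored_name = name;
	return 0;
}

/* Never takes ownership of @name. */
static int registry_del(const char *name)
{
	if (!stored_name || strcmp(stored_name, name))
		return -ENOENT;
	free(stored_name);
	stored_name = NULL;
	return 0;
}

static int handle_alt_name(int cmd, const char *attr, int *changed)
{
	size_t len = strlen(attr) + 1;
	char *name = malloc(len);	/* private, NUL-terminated copy */
	int err;

	if (!name)
		return -ENOMEM;
	memcpy(name, attr, len);

	if (cmd == CMD_ADD) {
		err = registry_add(name);
		if (!err)
			name = NULL;	/* ownership moved to the registry */
	} else if (cmd == CMD_DEL) {
		err = registry_del(name);
	} else {
		err = -EINVAL;
	}

	free(name);			/* no-op when ownership was moved */
	if (!err)
		*changed = 1;
	return err;
}
```

Funnelling every path through a single free keeps the error handling flat, which is the same shape the patch gives the kernel function.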
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 864cb9e9622f539c14fdeeffd597a6b697e84d26..e1101a4f90a6353038fac59de8a193c91c7d8874 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -467,7 +467,6 @@ struct sk_buff *__netdev_alloc_skb(struct net_device *dev, unsigned int len,
                 return NULL;
         }
  
-       /* use OR instead of assignment to avoid clearing of bits in mask */
         if (pfmemalloc)
                 skb->pfmemalloc = 1;
         skb->head_frag = 1;
@@ -527,7 +526,6 @@ struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len,
                 return NULL;
         }
  
-       /* use OR instead of assignment to avoid clearing of bits in mask */
         if (nc->page.pfmemalloc)
                 skb->pfmemalloc = 1;
         skb->head_frag = 1;
@@ -4805,9 +4803,9 @@ static __sum16 *skb_checksum_setup_ip(struct sk_buff *skb,
                                       typeof(IPPROTO_IP) proto,
                                       unsigned int off)
  {
-       switch (proto) {
-               int err;
+       int err;
  
+       switch (proto) {
         case IPPROTO_TCP:
                 err = skb_maybe_pull_tail(skb, off + sizeof(struct tcphdr),
                                           off + MAX_TCP_HDR_LEN);
diff --git a/net/dsa/tag_ar9331.c b/net/dsa/tag_ar9331.c
index 466ffa92a4746f0ac0b2860d5af68f3399c35054..55b00694cdba1e75154f4b0f6af0e2f0a14fe9af 100644
--- a/net/dsa/tag_ar9331.c
+++ b/net/dsa/tag_ar9331.c
@@ -31,7 +31,7 @@ static struct sk_buff *ar9331_tag_xmit(struct sk_buff *skb,
         __le16 *phdr;
         u16 hdr;
  
-       if (skb_cow_head(skb, 0) < 0)
+       if (skb_cow_head(skb, AR9331_HDR_LEN) < 0)
                 return NULL;
  
         phdr = skb_push(skb, AR9331_HDR_LEN);
diff --git a/net/dsa/tag_qca.c b/net/dsa/tag_qca.c
index c8a128c9e5e0f6c8ab048ae6f90aa12e1087cdd6..70db7c909f74efeec7d3b4aa2812fab118bcff3c 100644
--- a/net/dsa/tag_qca.c
+++ b/net/dsa/tag_qca.c
@@ -33,7 +33,7 @@ static struct sk_buff *qca_tag_xmit(struct sk_buff *skb, struct net_device *dev)
         struct dsa_port *dp = dsa_slave_to_port(dev);
         u16 *phdr, hdr;
  
-       if (skb_cow_head(skb, 0) < 0)
+       if (skb_cow_head(skb, QCA_HDR_LEN) < 0)
                 return NULL;
  
         skb_push(skb, QCA_HDR_LEN);
diff --git a/net/ethtool/bitset.c b/net/ethtool/bitset.c
index fce45dac42056b26a61ca71d527dfb70e3d16b33..ef9197541cb3ba4eadb65aa453df0d29a07747b2 100644
--- a/net/ethtool/bitset.c
+++ b/net/ethtool/bitset.c
@@ -305,7 +305,8 @@ nla_put_failure:
  static const struct nla_policy bitset_policy[ETHTOOL_A_BITSET_MAX + 1] = {
         [ETHTOOL_A_BITSET_UNSPEC]       = { .type = NLA_REJECT },
         [ETHTOOL_A_BITSET_NOMASK]       = { .type = NLA_FLAG },
-       [ETHTOOL_A_BITSET_SIZE]         = { .type = NLA_U32 },
+       [ETHTOOL_A_BITSET_SIZE]         = NLA_POLICY_MAX(NLA_U32,
+                                                        ETHNL_MAX_BITSET_SIZE),
         [ETHTOOL_A_BITSET_BITS]         = { .type = NLA_NESTED },
         [ETHTOOL_A_BITSET_VALUE]        = { .type = NLA_BINARY },
         [ETHTOOL_A_BITSET_MASK]         = { .type = NLA_BINARY },
@@ -447,7 +448,10 @@ ethnl_update_bitset32_verbose(u32 *bitmap, unsigned int nbits,
                                     "mask only allowed in compact bitset");
                 return -EINVAL;
         }
+
         no_mask = tb[ETHTOOL_A_BITSET_NOMASK];
+       if (no_mask)
+               ethnl_bitmap32_clear(bitmap, 0, nbits, mod);
  
         nla_for_each_nested(bit_attr, tb[ETHTOOL_A_BITSET_BITS], rem) {
                 bool old_val, new_val;
diff --git a/net/ethtool/bitset.h b/net/ethtool/bitset.h
index b8247e34109d0272a96dd3e261b9da473f0a306b..b849f9d1967699c017a1b3e201f4a2b1e33c166c 100644
--- a/net/ethtool/bitset.h
+++ b/net/ethtool/bitset.h
@@ -3,6 +3,8 @@
  #ifndef _NET_ETHTOOL_BITSET_H
  #define _NET_ETHTOOL_BITSET_H
  
+#define ETHNL_MAX_BITSET_SIZE S16_MAX
+
  typedef const char (*const ethnl_string_array_t)[ETH_GSTRING_LEN];
  
  int ethnl_bitset_is_compact(const struct nlattr *bitset, bool *compact);
diff --git a/net/hsr/hsr_framereg.c b/net/hsr/hsr_framereg.c
index 364ea2cc028e9a0ccb1a9d409961943cae8e1b3f..3ba7f61be10784a6bb5c814dfe669c320edbc8dd 100644
--- a/net/hsr/hsr_framereg.c
+++ b/net/hsr/hsr_framereg.c
@@ -155,7 +155,8 @@ static struct hsr_node *hsr_add_node(struct hsr_priv *hsr,
                 new_node->seq_out[i] = seq_out;
  
         spin_lock_bh(&hsr->list_lock);
-       list_for_each_entry_rcu(node, node_db, mac_list) {
+       list_for_each_entry_rcu(node, node_db, mac_list,
+                               lockdep_is_held(&hsr->list_lock)) {
                 if (ether_addr_equal(node->macaddress_A, addr))
                         goto out;
                 if (ether_addr_equal(node->macaddress_B, addr))
diff --git a/net/ipv4/cipso_ipv4.c b/net/ipv4/cipso_ipv4.c
index 3768822159192b919f33e8174003369ca235cde1..0bd10a1f477fdfd6bdc8b6c4a14f132280faedcf 100644
--- a/net/ipv4/cipso_ipv4.c
+++ b/net/ipv4/cipso_ipv4.c
@@ -1724,6 +1724,7 @@ void cipso_v4_error(struct sk_buff *skb, int error, u32 gateway)
  {
         unsigned char optbuf[sizeof(struct ip_options) + 40];
         struct ip_options *opt = (struct ip_options *)optbuf;
+       int res;
  
         if (ip_hdr(skb)->protocol == IPPROTO_ICMP || error != -EACCES)
                 return;
@@ -1735,7 +1736,11 @@ void cipso_v4_error(struct sk_buff *skb, int error, u32 gateway)
  
         memset(opt, 0, sizeof(struct ip_options));
         opt->optlen = ip_hdr(skb)->ihl*4 - sizeof(struct iphdr);
-       if (__ip_options_compile(dev_net(skb->dev), opt, skb, NULL))
+       rcu_read_lock();
+       res = __ip_options_compile(dev_net(skb->dev), opt, skb, NULL);
+       rcu_read_unlock();
+
+       if (res)
                 return;
  
         if (gateway)
diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index 18068ed42f258349a07248a85acdf93b8c2f1749..f369e7ce685b8374eae04d0ca0c5508463c5e72f 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -748,6 +748,39 @@ out:;
  }
  EXPORT_SYMBOL(__icmp_send);
  
+#if IS_ENABLED(CONFIG_NF_NAT)
+#include <net/netfilter/nf_conntrack.h>
+void icmp_ndo_send(struct sk_buff *skb_in, int type, int code, __be32 info)
+{
+       struct sk_buff *cloned_skb = NULL;
+       enum ip_conntrack_info ctinfo;
+       struct nf_conn *ct;
+       __be32 orig_ip;
+
+       ct = nf_ct_get(skb_in, &ctinfo);
+       if (!ct || !(ct->status & IPS_SRC_NAT)) {
+               icmp_send(skb_in, type, code, info);
+               return;
+       }
+
+       if (skb_shared(skb_in))
+               skb_in = cloned_skb = skb_clone(skb_in, GFP_ATOMIC);
+
+       if (unlikely(!skb_in || skb_network_header(skb_in) < skb_in->head ||
+           (skb_network_header(skb_in) + sizeof(struct iphdr)) >
+           skb_tail_pointer(skb_in) || skb_ensure_writable(skb_in,
+           skb_network_offset(skb_in) + sizeof(struct iphdr))))
+               goto out;
+
+       orig_ip = ip_hdr(skb_in)->saddr;
+       ip_hdr(skb_in)->saddr = ct->tuplehash[0].tuple.src.u3.ip;
+       icmp_send(skb_in, type, code, info);
+       ip_hdr(skb_in)->saddr = orig_ip;
+out:
+       consume_skb(cloned_skb);
+}
+EXPORT_SYMBOL(icmp_ndo_send);
+#endif
  
  static void icmp_socket_deliver(struct sk_buff *skb, u32 info)
  {
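The new icmp_ndo_send() temporarily puts the pre-NAT source address back into the IP header, calls icmp_send(), and then restores the translated address. A minimal userspace sketch of that swap-send-restore pattern follows. It assumes a one-field header and a caller-supplied send callback; hdr_sketch, send_fn, send_with_orig_saddr() and record_saddr() are hypothetical names for illustration, not kernel API.

```c
#include <stdint.h>

/* Hypothetical stand-in for the one iphdr field the patch touches. */
struct hdr_sketch {
	uint32_t saddr;
};

/* Hypothetical send hook; in the patch this role is played by icmp_send(). */
typedef void (*send_fn)(const struct hdr_sketch *hdr, void *ctx);

/*
 * Swap in the pre-NAT source address for the duration of the send,
 * then put the translated address back so later users of the header
 * see it unchanged.
 */
static void send_with_orig_saddr(struct hdr_sketch *hdr,
				 uint32_t pre_nat_saddr,
				 send_fn send, void *ctx)
{
	uint32_t translated = hdr->saddr;

	hdr->saddr = pre_nat_saddr;
	send(hdr, ctx);
	hdr->saddr = translated;
}

/* Example hook: records the source address visible during the send. */
static void record_saddr(const struct hdr_sketch *hdr, void *ctx)
{
	*(uint32_t *)ctx = hdr->saddr;
}
```

The sketch leaves out what the patch does before the swap: cloning a shared skb and making the header writable.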
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 316ebdf8151d6e836fcd68e6063ab3302fae779a..6b6b57000dad879996310717eeb016d2aad46541 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -6124,7 +6124,11 @@ static void tcp_rcv_synrecv_state_fastopen(struct sock *sk)
  {
         struct request_sock *req;
  
-       tcp_try_undo_loss(sk, false);
+       /* If we are still handling the SYNACK RTO, see if timestamp ECR allows
+        * undo. If peer SACKs triggered fast recovery, we can't undo here.
+        */
+       if (inet_csk(sk)->icsk_ca_state == TCP_CA_Loss)
+               tcp_try_undo_loss(sk, false);
  
         /* Reset rtx states to prevent spurious retransmits_timed_out() */
         tcp_sk(sk)->retrans_stamp = 0;
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index db76b96092991cb3f0057035e0e051141511acd5..08a41f1e1cd22478b9ea7740fa34e3979373ac6f 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1857,8 +1857,12 @@ int __udp_disconnect(struct sock *sk, int flags)
         inet->inet_dport = 0;
         sock_rps_reset_rxhash(sk);
         sk->sk_bound_dev_if = 0;
-       if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK))
+       if (!(sk->sk_userlocks & SOCK_BINDADDR_LOCK)) {
                 inet_reset_saddr(sk);
+               if (sk->sk_prot->rehash &&
+                   (sk->sk_userlocks & SOCK_BINDPORT_LOCK))
+                       sk->sk_prot->rehash(sk);
+       }
  
         if (!(sk->sk_userlocks & SOCK_BINDPORT_LOCK)) {
                 sk->sk_prot->unhash(sk);
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 58fbde24438116950c11e7cf8e0ca6f59fec23d9..72abf892302f27fd50179b1b5b6d6e0f974a06b9 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -1102,8 +1102,7 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct fib6_info *rt,
                                         found++;
                                         break;
                                 }
-                               if (rt_can_ecmp)
-                                       fallback_ins = fallback_ins ?: ins;
+                               fallback_ins = fallback_ins ?: ins;
                                 goto next_iter;
                         }
  
@@ -1146,7 +1145,9 @@ next_iter:
         }
  
         if (fallback_ins && !found) {
-               /* No ECMP-able route found, replace first non-ECMP one */
+               /* No matching route with same ecmp-able-ness found, replace
+                * first matching route
+                */
                 ins = fallback_ins;
                 iter = rcu_dereference_protected(*ins,
                                     lockdep_is_held(&rt->fib6_table->tb6_lock));
diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index 55bfc5149d0c5b5a6063dbdf19c2055a3b9aea93..781ca8c07a0da3619d31f8ddd06c8ae69a7a17eb 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -437,8 +437,6 @@ static int ip6gre_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
                 return -ENOENT;
  
         switch (type) {
-               struct ipv6_tlv_tnl_enc_lim *tel;
-               __u32 teli;
         case ICMPV6_DEST_UNREACH:
                 net_dbg_ratelimited("%s: Path to destination invalid or inactive!\n",
                                     t->parms.name);
@@ -452,7 +450,10 @@ static int ip6gre_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
                         break;
                 }
                 return 0;
-       case ICMPV6_PARAMPROB:
+       case ICMPV6_PARAMPROB: {
+               struct ipv6_tlv_tnl_enc_lim *tel;
+               __u32 teli;
+
                 teli = 0;
                 if (code == ICMPV6_HDR_FIELD)
                         teli = ip6_tnl_parse_tlv_enc_lim(skb, skb->data);
@@ -468,6 +469,7 @@ static int ip6gre_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
                                             t->parms.name);
                 }
                 return 0;
+       }
         case ICMPV6_PKT_TOOBIG:
                 ip6_update_pmtu(skb, net, info, 0, 0, sock_net_uid(net, NULL));
                 return 0;
diff --git a/net/ipv6/ip6_icmp.c b/net/ipv6/ip6_icmp.c
index 02045494c24cccaa8b50af839aae975695319922..e0086758b6ee3c3e91568e59028838c770fac795 100644
--- a/net/ipv6/ip6_icmp.c
+++ b/net/ipv6/ip6_icmp.c
@@ -45,4 +45,38 @@ out:
         rcu_read_unlock();
  }
  EXPORT_SYMBOL(icmpv6_send);
+
+#if IS_ENABLED(CONFIG_NF_NAT)
+#include <net/netfilter/nf_conntrack.h>
+void icmpv6_ndo_send(struct sk_buff *skb_in, u8 type, u8 code, __u32 info)
+{
+       struct sk_buff *cloned_skb = NULL;
+       enum ip_conntrack_info ctinfo;
+       struct in6_addr orig_ip;
+       struct nf_conn *ct;
+
+       ct = nf_ct_get(skb_in, &ctinfo);
+       if (!ct || !(ct->status & IPS_SRC_NAT)) {
+               icmpv6_send(skb_in, type, code, info);
+               return;
+       }
+
+       if (skb_shared(skb_in))
+               skb_in = cloned_skb = skb_clone(skb_in, GFP_ATOMIC);
+
+       if (unlikely(!skb_in || skb_network_header(skb_in) < skb_in->head ||
+           (skb_network_header(skb_in) + sizeof(struct ipv6hdr)) >
+           skb_tail_pointer(skb_in) || skb_ensure_writable(skb_in,
+           skb_network_offset(skb_in) + sizeof(struct ipv6hdr))))
+               goto out;
+
+       orig_ip = ipv6_hdr(skb_in)->saddr;
+       ipv6_hdr(skb_in)->saddr = ct->tuplehash[0].tuple.src.u3.in6;
+       icmpv6_send(skb_in, type, code, info);
+       ipv6_hdr(skb_in)->saddr = orig_ip;
+out:
+       consume_skb(cloned_skb);
+}
+EXPORT_SYMBOL(icmpv6_ndo_send);
+#endif
  #endif
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index b5dd20c4599bb1fc2515f6a78b8a6ee3287d03e5..4703b09808d0af016ae566a5be5cb490c7179d38 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -121,6 +121,7 @@ static struct net_device_stats *ip6_get_stats(struct net_device *dev)
  
  /**
   * ip6_tnl_lookup - fetch tunnel matching the end-point addresses
+ *   @link: ifindex of underlying interface
   *   @remote: the address of the tunnel exit-point
   *   @local: the address of the tunnel entry-point
   *
@@ -134,37 +135,56 @@ static struct net_device_stats *ip6_get_stats(struct net_device *dev)
         for (t = rcu_dereference(start); t; t = rcu_dereference(t->next))
  
  static struct ip6_tnl *
-ip6_tnl_lookup(struct net *net, const struct in6_addr *remote, const struct in6_addr *local)
+ip6_tnl_lookup(struct net *net, int link,
+              const struct in6_addr *remote, const struct in6_addr *local)
  {
         unsigned int hash = HASH(remote, local);
-       struct ip6_tnl *t;
+       struct ip6_tnl *t, *cand = NULL;
         struct ip6_tnl_net *ip6n = net_generic(net, ip6_tnl_net_id);
         struct in6_addr any;
  
         for_each_ip6_tunnel_rcu(ip6n->tnls_r_l[hash]) {
-               if (ipv6_addr_equal(local, &t->parms.laddr) &&
-                   ipv6_addr_equal(remote, &t->parms.raddr) &&
-                   (t->dev->flags & IFF_UP))
+               if (!ipv6_addr_equal(local, &t->parms.laddr) ||
+                   !ipv6_addr_equal(remote, &t->parms.raddr) ||
+                   !(t->dev->flags & IFF_UP))
+                       continue;
+
+               if (link == t->parms.link)
                         return t;
+               else
+                       cand = t;
         }
  
         memset(&any, 0, sizeof(any));
         hash = HASH(&any, local);
         for_each_ip6_tunnel_rcu(ip6n->tnls_r_l[hash]) {
-               if (ipv6_addr_equal(local, &t->parms.laddr) &&
-                   ipv6_addr_any(&t->parms.raddr) &&
-                   (t->dev->flags & IFF_UP))
+               if (!ipv6_addr_equal(local, &t->parms.laddr) ||
+                   !ipv6_addr_any(&t->parms.raddr) ||
+                   !(t->dev->flags & IFF_UP))
+                       continue;
+
+               if (link == t->parms.link)
                         return t;
+               else if (!cand)
+                       cand = t;
         }
  
         hash = HASH(remote, &any);
         for_each_ip6_tunnel_rcu(ip6n->tnls_r_l[hash]) {
-               if (ipv6_addr_equal(remote, &t->parms.raddr) &&
-                   ipv6_addr_any(&t->parms.laddr) &&
-                   (t->dev->flags & IFF_UP))
+               if (!ipv6_addr_equal(remote, &t->parms.raddr) ||
+                   !ipv6_addr_any(&t->parms.laddr) ||
+                   !(t->dev->flags & IFF_UP))
+                       continue;
+
+               if (link == t->parms.link)
                         return t;
+               else if (!cand)
+                       cand = t;
         }
  
+       if (cand)
+               return cand;
+
         t = rcu_dereference(ip6n->collect_md_tun);
         if (t && t->dev->flags & IFF_UP)
                 return t;
@@ -351,7 +371,8 @@ static struct ip6_tnl *ip6_tnl_locate(struct net *net,
              (t = rtnl_dereference(*tp)) != NULL;
              tp = &t->next) {
                 if (ipv6_addr_equal(local, &t->parms.laddr) &&
-                   ipv6_addr_equal(remote, &t->parms.raddr)) {
+                   ipv6_addr_equal(remote, &t->parms.raddr) &&
+                   p->link == t->parms.link) {
                         if (create)
                                 return ERR_PTR(-EEXIST);
  
@@ -485,7 +506,7 @@ ip6_tnl_err(struct sk_buff *skb, __u8 ipproto, struct inet6_skb_parm *opt,
            processing of the error. */
  
         rcu_read_lock();
-       t = ip6_tnl_lookup(dev_net(skb->dev), &ipv6h->daddr, &ipv6h->saddr);
+       t = ip6_tnl_lookup(dev_net(skb->dev), skb->dev->ifindex, &ipv6h->daddr, &ipv6h->saddr);
         if (!t)
                 goto out;
  
@@ -496,8 +517,6 @@ ip6_tnl_err(struct sk_buff *skb, __u8 ipproto, struct inet6_skb_parm *opt,
         err = 0;
  
         switch (*type) {
-               struct ipv6_tlv_tnl_enc_lim *tel;
-               __u32 mtu, teli;
         case ICMPV6_DEST_UNREACH:
                 net_dbg_ratelimited("%s: Path to destination invalid or inactive!\n",
                                     t->parms.name);
@@ -510,7 +529,10 @@ ip6_tnl_err(struct sk_buff *skb, __u8 ipproto, struct inet6_skb_parm *opt,
                         rel_msg = 1;
                 }
                 break;
-       case ICMPV6_PARAMPROB:
+       case ICMPV6_PARAMPROB: {
+               struct ipv6_tlv_tnl_enc_lim *tel;
+               __u32 teli;
+
                 teli = 0;
                 if ((*code) == ICMPV6_HDR_FIELD)
                         teli = ip6_tnl_parse_tlv_enc_lim(skb, skb->data);
@@ -527,7 +549,10 @@ ip6_tnl_err(struct sk_buff *skb, __u8 ipproto, struct inet6_skb_parm *opt,
                                             t->parms.name);
                 }
                 break;
-       case ICMPV6_PKT_TOOBIG:
+       }
+       case ICMPV6_PKT_TOOBIG: {
+               __u32 mtu;
+
                 ip6_update_pmtu(skb, net, htonl(*info), 0, 0,
                                 sock_net_uid(net, NULL));
                 mtu = *info - offset;
@@ -541,6 +566,7 @@ ip6_tnl_err(struct sk_buff *skb, __u8 ipproto, struct inet6_skb_parm *opt,
                         rel_msg = 1;
                 }
                 break;
+       }
         case NDISC_REDIRECT:
                 ip6_redirect(skb, net, skb->dev->ifindex, 0,
                              sock_net_uid(net, NULL));
@@ -887,7 +913,7 @@ static int ipxip6_rcv(struct sk_buff *skb, u8 ipproto,
         int ret = -1;
  
         rcu_read_lock();
-       t = ip6_tnl_lookup(dev_net(skb->dev), &ipv6h->saddr, &ipv6h->daddr);
+       t = ip6_tnl_lookup(dev_net(skb->dev), skb->dev->ifindex, &ipv6h->saddr, &ipv6h->daddr);
  
         if (t) {
                 u8 tproto = READ_ONCE(t->parms.proto);
@@ -1420,8 +1446,10 @@ tx_err:
  static void ip6_tnl_link_config(struct ip6_tnl *t)
  {
         struct net_device *dev = t->dev;
+       struct net_device *tdev = NULL;
         struct __ip6_tnl_parm *p = &t->parms;
         struct flowi6 *fl6 = &t->fl.u.ip6;
+       unsigned int mtu;
         int t_hlen;
  
         memcpy(dev->dev_addr, &p->laddr, sizeof(struct in6_addr));
@@ -1457,22 +1485,25 @@ static void ip6_tnl_link_config(struct ip6_tnl *t)
                 struct rt6_info *rt = rt6_lookup(t->net,
                                                  &p->raddr, &p->laddr,
                                                  p->link, NULL, strict);
+               if (rt) {
+                       tdev = rt->dst.dev;
+                       ip6_rt_put(rt);
+               }
  
-               if (!rt)
-                       return;
+               if (!tdev && p->link)
+                       tdev = __dev_get_by_index(t->net, p->link);
  
-               if (rt->dst.dev) {
-                       dev->hard_header_len = rt->dst.dev->hard_header_len +
-                               t_hlen;
+               if (tdev) {
+                       dev->hard_header_len = tdev->hard_header_len + t_hlen;
+                       mtu = min_t(unsigned int, tdev->mtu, IP6_MAX_MTU);
  
-                       dev->mtu = rt->dst.dev->mtu - t_hlen;
+                       dev->mtu = mtu - t_hlen;
                         if (!(t->parms.flags & IP6_TNL_F_IGN_ENCAP_LIMIT))
                                 dev->mtu -= 8;
  
                         if (dev->mtu < IPV6_MIN_MTU)
                                 dev->mtu = IPV6_MIN_MTU;
                 }
-               ip6_rt_put(rt);
         }
  }
  
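The reworked ip6_tnl_lookup() returns a tunnel whose parms.link equals the ingress ifindex as soon as it finds one, and otherwise falls back to an address-matching candidate. The selection rule can be sketched in userspace as below. The sketch assumes a flat array instead of the kernel's RCU hash chains and models only the "first candidate wins" fallback used by the second and third loops; struct tnl and tnl_pick() are hypothetical names for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for the ip6_tnl fields that matter here. */
struct tnl {
	int link;	/* ifindex of the underlying interface */
	int up;		/* non-zero when the tunnel device is IFF_UP */
};

/*
 * Return the usable entry whose link equals the ingress ifindex.
 * If none matches, return the first usable entry seen, which is the
 * role the "cand" pointer plays in the patch. Returns NULL when no
 * entry is usable.
 */
static const struct tnl *tnl_pick(const struct tnl *tbl, size_t n, int link)
{
	const struct tnl *cand = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!tbl[i].up)
			continue;
		if (tbl[i].link == link)
			return &tbl[i];
		if (!cand)
			cand = &tbl[i];
	}
	return cand;
}
```

In the patch the fallback is returned only after all three address-match passes have failed to produce an exact link match, and before the collect_md tunnel is considered.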
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 79fc012dd2cae44b69057c168037b018775d1f49..debdaeba5d8c130dbf7dd099bc29fbed80a7ac75 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -183,9 +183,15 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
                                         retv = -EBUSY;
                                         break;
                                 }
-                       } else if (sk->sk_protocol != IPPROTO_TCP)
+                       } else if (sk->sk_protocol == IPPROTO_TCP) {
+                               if (sk->sk_prot != &tcpv6_prot) {
+                                       retv = -EBUSY;
+                                       break;
+                               }
                                 break;
-
+                       } else {
+                               break;
+                       }
                         if (sk->sk_state != TCP_ESTABLISHED) {
                                 retv = -ENOTCONN;
                                 break;
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 4fbdc60b4e070802fcd6905dce800641816a79fa..2931224b674e81e1608cbf5b6a08a31cb6f99415 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -5198,6 +5198,7 @@ static int ip6_route_multipath_add(struct fib6_config *cfg,
                  */
                 cfg->fc_nlinfo.nlh->nlmsg_flags &= ~(NLM_F_EXCL |
                                                      NLM_F_REPLACE);
+               cfg->fc_nlinfo.nlh->nlmsg_flags |= NLM_F_CREATE;
                 nhn++;
         }
  
diff --git a/net/mac80211/cfg.c b/net/mac80211/cfg.c
index 000c742d05279a8bebb23b54b061e47669b5a4ae..6aee699deb289bbbaa50eb14b5cfb8a728857666 100644
--- a/net/mac80211/cfg.c
+++ b/net/mac80211/cfg.c
@@ -3450,7 +3450,7 @@ int ieee80211_attach_ack_skb(struct ieee80211_local *local, struct sk_buff *skb,
  
         spin_lock_irqsave(&local->ack_status_lock, spin_flags);
         id = idr_alloc(&local->ack_status_frames, ack_skb,
-                      1, 0x40, GFP_ATOMIC);
+                      1, 0x2000, GFP_ATOMIC);
         spin_unlock_irqrestore(&local->ack_status_lock, spin_flags);
  
         if (id < 0) {
diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c
index 5fa13176036f4f3d9d9edeff6a903580597e11f0..88d7a692a9658137f06128c7a7c98b2a358d12b2 100644
--- a/net/mac80211/mlme.c
+++ b/net/mac80211/mlme.c
@@ -8,7 +8,7 @@
   * Copyright 2007, Michael Wu <flamingice@sourmilk.net>
   * Copyright 2013-2014  Intel Mobile Communications GmbH
   * Copyright (C) 2015 - 2017 Intel Deutschland GmbH
- * Copyright (C) 2018 - 2019 Intel Corporation
+ * Copyright (C) 2018 - 2020 Intel Corporation
   */
  
  #include <linux/delay.h>
@@ -1311,7 +1311,7 @@ ieee80211_sta_process_chanswitch(struct ieee80211_sub_if_data *sdata,
         if (!res) {
                 ch_switch.timestamp = timestamp;
                 ch_switch.device_timestamp = device_timestamp;
-               ch_switch.block_tx =  beacon ? csa_ie.mode : 0;
+               ch_switch.block_tx = csa_ie.mode;
                 ch_switch.chandef = csa_ie.chandef;
                 ch_switch.count = csa_ie.count;
                 ch_switch.delay = csa_ie.max_switch_time;
@@ -1404,7 +1404,7 @@ ieee80211_sta_process_chanswitch(struct ieee80211_sub_if_data *sdata,
  
         sdata->vif.csa_active = true;
         sdata->csa_chandef = csa_ie.chandef;
-       sdata->csa_block_tx = ch_switch.block_tx;
+       sdata->csa_block_tx = csa_ie.mode;
         ifmgd->csa_ignored_same_chan = false;
  
         if (sdata->csa_block_tx)
@@ -1438,7 +1438,7 @@ ieee80211_sta_process_chanswitch(struct ieee80211_sub_if_data *sdata,
          * reset when the disconnection worker runs.
          */
         sdata->vif.csa_active = true;
-       sdata->csa_block_tx = ch_switch.block_tx;
+       sdata->csa_block_tx = csa_ie.mode;
  
         ieee80211_queue_work(&local->hw, &ifmgd->csa_connection_drop_work);
         mutex_unlock(&local->chanctx_mtx);
@@ -2959,7 +2959,7 @@ static void ieee80211_rx_mgmt_auth(struct ieee80211_sub_if_data *sdata,
             (auth_transaction == 2 &&
              ifmgd->auth_data->expected_transaction == 2)) {
                 if (!ieee80211_mark_sta_auth(sdata, bssid))
-                       goto out_err;
+                       return; /* ignore frame -- wait for timeout */
         } else if (ifmgd->auth_data->algorithm == WLAN_AUTH_SAE &&
                    auth_transaction == 2) {
                 sdata_info(sdata, "SAE peer confirmed\n");
@@ -2967,10 +2967,6 @@ static void ieee80211_rx_mgmt_auth(struct ieee80211_sub_if_data *sdata,
         }
  
         cfg80211_rx_mlme_mgmt(sdata->dev, (u8 *)mgmt, len);
-       return;
- out_err:
-       mutex_unlock(&sdata->local->sta_mtx);
-       /* ignore frame -- wait for timeout */
  }
  
  #define case_WLAN(type) \
diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c
index 0e05ff0376726ddbaf80f05d9f90f6f086225b88..0ba98ad9bc854800fad0c90cc594bc6d98b12508 100644
--- a/net/mac80211/rx.c
+++ b/net/mac80211/rx.c
@@ -4114,7 +4114,7 @@ void __ieee80211_check_fast_rx_iface(struct ieee80211_sub_if_data *sdata)
  
         lockdep_assert_held(&local->sta_mtx);
  
-       list_for_each_entry_rcu(sta, &local->sta_list, list) {
+       list_for_each_entry(sta, &local->sta_list, list) {
                 if (sdata != sta->sdata &&
                     (!sta->sdata->bss || sta->sdata->bss != sdata->bss))
                         continue;
diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index 4bd1faf4f779fb821579f9cea77c32c8e1007a2e..87def9cb91fffb462a47bac7112a513f74a98401 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -2442,7 +2442,7 @@ static int ieee80211_store_ack_skb(struct ieee80211_local *local,
  
                 spin_lock_irqsave(&local->ack_status_lock, flags);
                 id = idr_alloc(&local->ack_status_frames, ack_skb,
-                              1, 0x40, GFP_ATOMIC);
+                              1, 0x2000, GFP_ATOMIC);
                 spin_unlock_irqrestore(&local->ack_status_lock, flags);
  
                 if (id >= 0) {
diff --git a/net/mac80211/util.c b/net/mac80211/util.c
index 32a7a53833c01d1ea760646440f54009ae9c4544..decd46b3839380d5e0897f3063ceb4e27d625fe2 100644
--- a/net/mac80211/util.c
+++ b/net/mac80211/util.c
@@ -1063,16 +1063,22 @@ _ieee802_11_parse_elems_crc(const u8 *start, size_t len, bool action,
                                 elem_parse_failed = true;
                         break;
                 case WLAN_EID_VHT_OPERATION:
-                       if (elen >= sizeof(struct ieee80211_vht_operation))
+                       if (elen >= sizeof(struct ieee80211_vht_operation)) {
                                 elems->vht_operation = (void *)pos;
-                       else
-                               elem_parse_failed = true;
+                               if (calc_crc)
+                                       crc = crc32_be(crc, pos - 2, elen + 2);
+                               break;
+                       }
+                       elem_parse_failed = true;
                         break;
                 case WLAN_EID_OPMODE_NOTIF:
-                       if (elen > 0)
+                       if (elen > 0) {
                                 elems->opmode_notif = pos;
-                       else
-                               elem_parse_failed = true;
+                               if (calc_crc)
+                                       crc = crc32_be(crc, pos - 2, elen + 2);
+                               break;
+                       }
+                       elem_parse_failed = true;
                         break;
                 case WLAN_EID_MESH_ID:
                         elems->mesh_id = pos;
@@ -2987,10 +2993,22 @@ bool ieee80211_chandef_vht_oper(struct ieee80211_hw *hw,
         int cf0, cf1;
         int ccfs0, ccfs1, ccfs2;
         int ccf0, ccf1;
+       u32 vht_cap;
+       bool support_80_80 = false;
+       bool support_160 = false;
  
         if (!oper || !htop)
                 return false;
  
+       vht_cap = hw->wiphy->bands[chandef->chan->band]->vht_cap.cap;
+       support_160 = (vht_cap & (IEEE80211_VHT_CAP_SUPP_CHAN_WIDTH_MASK |
+                                 IEEE80211_VHT_CAP_EXT_NSS_BW_MASK));
+       support_80_80 = ((vht_cap &
+                        IEEE80211_VHT_CAP_SUPP_CHAN_WIDTH_160_80PLUS80MHZ) ||
+                       (vht_cap & IEEE80211_VHT_CAP_SUPP_CHAN_WIDTH_160MHZ &&
+                        vht_cap & IEEE80211_VHT_CAP_EXT_NSS_BW_MASK) ||
+                       ((vht_cap & IEEE80211_VHT_CAP_EXT_NSS_BW_MASK) >>
+                                   IEEE80211_VHT_CAP_EXT_NSS_BW_SHIFT > 1));
         ccfs0 = oper->center_freq_seg0_idx;
         ccfs1 = oper->center_freq_seg1_idx;
         ccfs2 = (le16_to_cpu(htop->operation_mode) &
@@ -3018,10 +3036,10 @@ bool ieee80211_chandef_vht_oper(struct ieee80211_hw *hw,
                         unsigned int diff;
  
                         diff = abs(ccf1 - ccf0);
-                       if (diff == 8) {
+                       if ((diff == 8) && support_160) {
                                 new.width = NL80211_CHAN_WIDTH_160;
                                 new.center_freq1 = cf1;
-                       } else if (diff > 8) {
+                       } else if ((diff > 8) && support_80_80) {
                                 new.width = NL80211_CHAN_WIDTH_80P80;
                                 new.center_freq2 = cf1;
                         }
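The last hunk above makes the 160 MHz and 80+80 MHz upgrades in ieee80211_chandef_vht_oper() conditional on what the local hardware advertises. A minimal sketch of that decision follows. It assumes the width stays at 80 MHz when neither branch fires and that a zero second-segment index means "no upgrade" (the surrounding code is not shown in the hunk); enum width_sketch and pick_width() are hypothetical names for illustration.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the NL80211_CHAN_WIDTH_* values involved. */
enum width_sketch { WIDTH_80, WIDTH_160, WIDTH_80P80 };

/*
 * ccf0/ccf1 are the two channel centre frequency segment indices.
 * A distance of exactly 8 selects 160 MHz, a larger distance selects
 * 80+80 MHz, and each upgrade is taken only if the matching local
 * capability flag is set.
 */
static enum width_sketch pick_width(int ccf0, int ccf1,
				    bool support_160, bool support_80_80)
{
	unsigned int diff;

	if (!ccf1)	/* assumption: no second segment, stay at 80 MHz */
		return WIDTH_80;

	diff = abs(ccf1 - ccf0);
	if (diff == 8 && support_160)
		return WIDTH_160;
	if (diff > 8 && support_80_80)
		return WIDTH_80P80;
	return WIDTH_80;
}
```

Before the patch the two capability checks were absent, so the wider width was chosen from the advertised indices alone.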
diff --git a/net/mptcp/Kconfig b/net/mptcp/Kconfig
index 49f6054e7f4ebc15837ad7f0c61420063ea00104..a9ed3bf1d93faaf6ca3f2597e2bde6ff3a5f3487 100644
--- a/net/mptcp/Kconfig
+++ b/net/mptcp/Kconfig
@@ -4,6 +4,7 @@ config MPTCP
         depends on INET
         select SKB_EXTENSIONS
         select CRYPTO_LIB_SHA256
+       select CRYPTO
         help
           Multipath TCP (MPTCP) connections send and receive data over multiple
           subflows in order to utilize multiple network paths. Each subflow
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 73780b4cb10813e203ba9bebad3f6553a49dfa83..3c19a8efdceadcde707150a33bc03c31dc8a0b84 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -543,6 +543,11 @@ static void __mptcp_close_ssk(struct sock *sk, struct sock *ssk,
         }
  }
  
+static unsigned int mptcp_sync_mss(struct sock *sk, u32 pmtu)
+{
+       return 0;
+}
+
  static int __mptcp_init_sock(struct sock *sk)
  {
         struct mptcp_sock *msk = mptcp_sk(sk);
@@ -551,6 +556,7 @@ static int __mptcp_init_sock(struct sock *sk)
         __set_bit(MPTCP_SEND_SPACE, &msk->flags);
  
         msk->first = NULL;
+       inet_csk(sk)->icsk_sync_mss = mptcp_sync_mss;
  
         return 0;
  }
@@ -643,7 +649,7 @@ static struct ipv6_pinfo *mptcp_inet6_sk(const struct sock *sk)
  }
  #endif
  
-struct sock *mptcp_sk_clone_lock(const struct sock *sk)
+static struct sock *mptcp_sk_clone_lock(const struct sock *sk)
  {
         struct sock *nsk = sk_clone_lock(sk, GFP_ATOMIC);
  
@@ -755,60 +761,50 @@ static int mptcp_setsockopt(struct sock *sk, int level, int optname,
                             char __user *optval, unsigned int optlen)
  {
         struct mptcp_sock *msk = mptcp_sk(sk);
-       int ret = -EOPNOTSUPP;
         struct socket *ssock;
-       struct sock *ssk;
  
         pr_debug("msk=%p", msk);
  
         /* @@ the meaning of setsockopt() when the socket is connected and
-        * there are multiple subflows is not defined.
+        * there are multiple subflows is not yet defined. It is up to the
+        * MPTCP-level socket to configure the subflows until the subflow
+        * is in TCP fallback, when TCP socket options are passed through
+        * to the one remaining subflow.
          */
         lock_sock(sk);
-       ssock = __mptcp_socket_create(msk, MPTCP_SAME_STATE);
-       if (IS_ERR(ssock)) {
-               release_sock(sk);
-               return ret;
-       }
+       ssock = __mptcp_tcp_fallback(msk);
+       if (ssock)
+               return tcp_setsockopt(ssock->sk, level, optname, optval,
+                                     optlen);
  
-       ssk = ssock->sk;
-       sock_hold(ssk);
         release_sock(sk);
  
-       ret = tcp_setsockopt(ssk, level, optname, optval, optlen);
-       sock_put(ssk);
-
-       return ret;
+       return -EOPNOTSUPP;
  }
  
  static int mptcp_getsockopt(struct sock *sk, int level, int optname,
                             char __user *optval, int __user *option)
  {
         struct mptcp_sock *msk = mptcp_sk(sk);
-       int ret = -EOPNOTSUPP;
         struct socket *ssock;
-       struct sock *ssk;
  
         pr_debug("msk=%p", msk);
  
-       /* @@ the meaning of getsockopt() when the socket is connected and
-        * there are multiple subflows is not defined.
+       /* @@ the meaning of getsockopt() when the socket is connected and
+        * there are multiple subflows is not yet defined. It is up to the
+        * MPTCP-level socket to configure the subflows until the subflow
+        * is in TCP fallback, when socket options are passed through
+        * to the one remaining subflow.
          */
         lock_sock(sk);
-       ssock = __mptcp_socket_create(msk, MPTCP_SAME_STATE);
-       if (IS_ERR(ssock)) {
-               release_sock(sk);
-               return ret;
-       }
+       ssock = __mptcp_tcp_fallback(msk);
+       if (ssock)
+               return tcp_getsockopt(ssock->sk, level, optname, optval,
+                                     option);
  
-       ssk = ssock->sk;
-       sock_hold(ssk);
         release_sock(sk);
  
-       ret = tcp_getsockopt(ssk, level, optname, optval, option);
-       sock_put(ssk);
-
-       return ret;
+       return -EOPNOTSUPP;
  }
  
  static int mptcp_get_port(struct sock *sk, unsigned short snum)
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 8a99a29302846fdc2b21d225daffa7e44adb067c..9f8663b30456256bfdd0e8e4d89061bc90e5281a 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -56,8 +56,8 @@
  #define MPTCP_DSS_FLAG_MASK    (0x1F)
  
  /* MPTCP socket flags */
-#define MPTCP_DATA_READY       BIT(0)
-#define MPTCP_SEND_SPACE       BIT(1)
+#define MPTCP_DATA_READY       0
+#define MPTCP_SEND_SPACE       1
  
  /* MPTCP connection sock */
  struct mptcp_sock {
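The MPTCP flag macros change from BIT(n) masks to plain bit numbers because they are passed to helpers such as __set_bit(), which take a bit index rather than a mask. The difference can be seen in the userspace sketch below; set_bit_nr() and test_bit_nr() are hypothetical stand-ins for the kernel helpers, written here only for illustration.

```c
#include <stdbool.h>

#define BIT(n)	(1UL << (n))

/* Stand-in for __set_bit(): nr is a bit *number*, not a mask. */
static void set_bit_nr(unsigned int nr, unsigned long *word)
{
	*word |= 1UL << nr;
}

/* Stand-in for test_bit(): reports whether bit number nr is set. */
static bool test_bit_nr(unsigned int nr, const unsigned long *word)
{
	return (*word >> nr) & 1UL;
}
```

With the old definition, MPTCP_SEND_SPACE expanded to BIT(1), which is 2, so a call shaped like set_bit_nr(BIT(1), &flags) sets bit 2 instead of bit 1. With the new definition the value 1 is passed and bit 1 is set.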
diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c
index 69c107f9ba8db06ebb0b332a7fb91e2514d5831e..8dd17589217d71e4e38f6d442434130cadd290e0 100644
--- a/net/netfilter/ipset/ip_set_core.c
+++ b/net/netfilter/ipset/ip_set_core.c
@@ -723,6 +723,20 @@ ip_set_rcu_get(struct net *net, ip_set_id_t index)
         return set;
  }
  
+static inline void
+ip_set_lock(struct ip_set *set)
+{
+       if (!set->variant->region_lock)
+               spin_lock_bh(&set->lock);
+}
+
+static inline void
+ip_set_unlock(struct ip_set *set)
+{
+       if (!set->variant->region_lock)
+               spin_unlock_bh(&set->lock);
+}
+
  int
  ip_set_test(ip_set_id_t index, const struct sk_buff *skb,
             const struct xt_action_param *par, struct ip_set_adt_opt *opt)
@@ -744,9 +758,9 @@ ip_set_test(ip_set_id_t index, const struct sk_buff *skb,
         if (ret == -EAGAIN) {
                 /* Type requests element to be completed */
                 pr_debug("element must be completed, ADD is triggered\n");
-               spin_lock_bh(&set->lock);
+               ip_set_lock(set);
                 set->variant->kadt(set, skb, par, IPSET_ADD, opt);
-               spin_unlock_bh(&set->lock);
+               ip_set_unlock(set);
                 ret = 1;
         } else {
                 /* --return-nomatch: invert matched element */
@@ -775,9 +789,9 @@ ip_set_add(ip_set_id_t index, const struct sk_buff *skb,
             !(opt->family == set->family || set->family == NFPROTO_UNSPEC))
                 return -IPSET_ERR_TYPE_MISMATCH;
  
-       spin_lock_bh(&set->lock);
+       ip_set_lock(set);
         ret = set->variant->kadt(set, skb, par, IPSET_ADD, opt);
-       spin_unlock_bh(&set->lock);
+       ip_set_unlock(set);
  
         return ret;
  }
@@ -797,9 +811,9 @@ ip_set_del(ip_set_id_t index, const struct sk_buff *skb,
             !(opt->family == set->family || set->family == NFPROTO_UNSPEC))
                 return -IPSET_ERR_TYPE_MISMATCH;
  
-       spin_lock_bh(&set->lock);
+       ip_set_lock(set);
         ret = set->variant->kadt(set, skb, par, IPSET_DEL, opt);
-       spin_unlock_bh(&set->lock);
+       ip_set_unlock(set);
  
         return ret;
  }
@@ -1264,9 +1278,9 @@ ip_set_flush_set(struct ip_set *set)
  {
         pr_debug("set: %s\n",  set->name);
  
-       spin_lock_bh(&set->lock);
+       ip_set_lock(set);
         set->variant->flush(set);
-       spin_unlock_bh(&set->lock);
+       ip_set_unlock(set);
  }
  
  static int ip_set_flush(struct net *net, struct sock *ctnl, struct sk_buff *skb,
@@ -1713,9 +1727,9 @@ call_ad(struct sock *ctnl, struct sk_buff *skb, struct ip_set *set,
         bool eexist = flags & IPSET_FLAG_EXIST, retried = false;
  
         do {
-               spin_lock_bh(&set->lock);
+               ip_set_lock(set);
                 ret = set->variant->uadt(set, tb, adt, &lineno, flags, retried);
-               spin_unlock_bh(&set->lock);
+               ip_set_unlock(set);
                 retried = true;
         } while (ret == -EAGAIN &&
                  set->variant->resize &&
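
The ip_set_core.c hunks above route every set-wide lock acquisition through ip_set_lock()/ip_set_unlock(), which skip set->lock when the set variant reports region_lock. The following is a minimal userspace sketch of that pattern, not the kernel code: the demo_* names, the pthread mutex standing in for the spinlock, and the set_lock_taken counter are all assumptions made for illustration.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the set variant: only the flag matters here. */
struct demo_variant {
	bool region_lock;	/* true: the type does its own per-region locking */
};

/* Hypothetical stand-in for struct ip_set. */
struct demo_set {
	pthread_mutex_t lock;			/* plays the role of set->lock */
	const struct demo_variant *variant;
	int set_lock_taken;			/* demo-only counter of set-wide acquisitions */
};

/* Take the set-wide lock only if the variant does not lock by region. */
static void demo_set_lock(struct demo_set *set)
{
	if (!set->variant->region_lock) {
		pthread_mutex_lock(&set->lock);
		set->set_lock_taken++;
	}
}

/* Mirror of demo_set_lock(): release only what was actually taken. */
static void demo_set_unlock(struct demo_set *set)
{
	if (!set->variant->region_lock)
		pthread_mutex_unlock(&set->lock);
}

/* Run one guarded operation; return how many times the set-wide lock was used. */
static int demo_guarded_op(bool region_lock)
{
	struct demo_variant v = { .region_lock = region_lock };
	struct demo_set s = {
		.lock = PTHREAD_MUTEX_INITIALIZER,
		.variant = &v,
		.set_lock_taken = 0,
	};

	demo_set_lock(&s);
	/* The variant callback (add, del, flush, ...) would run here. */
	demo_set_unlock(&s);
	return s.set_lock_taken;
}
```

In this sketch, demo_guarded_op(false) goes through the set-wide mutex once, while demo_guarded_op(true) never touches it and leaves serialization to whatever finer-grained locking the variant provides. Both helpers test the same flag, so a lock that was skipped is never released.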
diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h

index 7480ce55b5c856feba27bc47341ff8158068e11f..e52d7b7597a0d8f3d6cfed5d2af6544ae1ad2e5d 100644
--- a/net/netfilter/ipset/ip_set_hash_gen.h
+++ b/net/netfilter/ipset/ip_set_hash_gen.h
@@ -7,13 +7,21 @@
  #include <linux/rcupdate.h>
  #include <linux/jhash.h>
  #include <linux/types.h>
+#include <linux/netfilter/nfnetlink.h>
  #include <linux/netfilter/ipset/ip_set.h>
  
-#define __ipset_dereference_protected(p, c)    rcu_dereference_protected(p, c)
-#define ipset_dereference_protected(p, set) \
-       __ipset_dereference_protected(p, lockdep_is_held(&(set)->lock))
-
-#define rcu_dereference_bh_nfnl(p)     rcu_dereference_bh_check(p, 1)
+#define __ipset_dereference(p)         \
+       rcu_dereference_protected(p, 1)
+#define ipset_dereference_nfnl(p)      \
+       rcu_dereference_protected(p,    \
+               lockdep_nfnl_is_held(NFNL_SUBSYS_IPSET))
+#define ipset_dereference_set(p, set)  \
+       rcu_dereference_protected(p,    \
+               lockdep_nfnl_is_held(NFNL_SUBSYS_IPSET) || \
+               lockdep_is_held(&(set)->lock))
+#define ipset_dereference_bh_nfnl(p)   \
+       rcu_dereference_bh_check(p,     \
+               lockdep_nfnl_is_held(NFNL_SUBSYS_IPSET))
  
  /* Hashing which uses arrays to resolve clashing. The hash table is resized
   * (doubled) when searching becomes too long.
@@ -72,11 +80,35 @@ struct hbucket {
                 __aligned(__alignof__(u64));
  };
  
+/* Region size for locking == 2^HTABLE_REGION_BITS */
+#define HTABLE_REGION_BITS     10
+#define ahash_numof_locks(htable_bits)         \
+       ((htable_bits) < HTABLE_REGION_BITS ? 1 \
+               : jhash_size((htable_bits) - HTABLE_REGION_BITS))
+#define ahash_sizeof_regions(htable_bits)              \
+       (ahash_numof_locks(htable_bits) * sizeof(struct ip_set_region))
+#define ahash_region(n, htable_bits)           \
+       ((n) % ahash_numof_locks(htable_bits))
+#define ahash_bucket_start(h,  htable_bits)    \
+       ((htable_bits) < HTABLE_REGION_BITS ? 0 \
+               : (h) * jhash_size(HTABLE_REGION_BITS))
+#define ahash_bucket_end(h,  htable_bits)      \
+       ((htable_bits) < HTABLE_REGION_BITS ? jhash_size(htable_bits)   \
+               : ((h) + 1) * jhash_size(HTABLE_REGION_BITS))
+
+struct htable_gc {
+       struct delayed_work dwork;
+       struct ip_set *set;     /* Set the gc belongs to */
+       u32 region;             /* Last gc run position */
+};
+
  /* The hash table: the table size stored here in order to make resizing easy */
  struct htable {
         atomic_t ref;           /* References for resizing */
-       atomic_t uref;          /* References for dumping */
+       atomic_t uref;          /* References for dumping and gc */
         u8 htable_bits;         /* size of hash table == 2^htable_bits */
+       u32 maxelem;            /* Maxelem per region */
+       struct ip_set_region *hregion;  /* Region locks and ext sizes */
         struct hbucket __rcu *bucket[0]; /* hashtable buckets */
  };
  
@@ -162,6 +194,10 @@ htable_bits(u32 hashsize)
  #define NLEN                   0
  #endif /* IP_SET_HASH_WITH_NETS */
  
+#define SET_ELEM_EXPIRED(set, d)       \
+       (SET_WITH_TIMEOUT(set) &&       \
+        ip_set_timeout_expired(ext_timeout(d, set)))
+
  #endif /* _IP_SET_HASH_GEN_H */
  
  #ifndef MTYPE
@@ -205,10 +241,12 @@ htable_bits(u32 hashsize)
  #undef mtype_test_cidrs
  #undef mtype_test
  #undef mtype_uref
-#undef mtype_expire
  #undef mtype_resize
+#undef mtype_ext_size
+#undef mtype_resize_ad
  #undef mtype_head
  #undef mtype_list
+#undef mtype_gc_do
  #undef mtype_gc
  #undef mtype_gc_init
  #undef mtype_variant
@@ -247,10 +285,12 @@ htable_bits(u32 hashsize)
  #define mtype_test_cidrs       IPSET_TOKEN(MTYPE, _test_cidrs)
  #define mtype_test             IPSET_TOKEN(MTYPE, _test)
  #define mtype_uref             IPSET_TOKEN(MTYPE, _uref)
-#define mtype_expire           IPSET_TOKEN(MTYPE, _expire)
  #define mtype_resize           IPSET_TOKEN(MTYPE, _resize)
+#define mtype_ext_size         IPSET_TOKEN(MTYPE, _ext_size)
+#define mtype_resize_ad                IPSET_TOKEN(MTYPE, _resize_ad)
  #define mtype_head             IPSET_TOKEN(MTYPE, _head)
  #define mtype_list             IPSET_TOKEN(MTYPE, _list)
+#define mtype_gc_do            IPSET_TOKEN(MTYPE, _gc_do)
  #define mtype_gc               IPSET_TOKEN(MTYPE, _gc)
  #define mtype_gc_init          IPSET_TOKEN(MTYPE, _gc_init)
  #define mtype_variant          IPSET_TOKEN(MTYPE, _variant)
@@ -275,8 +315,7 @@ htable_bits(u32 hashsize)
  /* The generic hash structure */
  struct htype {
         struct htable __rcu *table; /* the hash table */
-       struct timer_list gc;   /* garbage collection when timeout enabled */
-       struct ip_set *set;     /* attached to this ip_set */
+       struct htable_gc gc;    /* gc workqueue */
         u32 maxelem;            /* max elements in the hash */
         u32 initval;            /* random jhash init value */
  #ifdef IP_SET_HASH_WITH_MARKMASK
@@ -288,21 +327,33 @@ struct htype {
  #ifdef IP_SET_HASH_WITH_NETMASK
         u8 netmask;             /* netmask value for subnets to store */
  #endif
+       struct list_head ad;    /* Resize add|del backlist */
         struct mtype_elem next; /* temporary storage for uadd */
  #ifdef IP_SET_HASH_WITH_NETS
         struct net_prefixes nets[NLEN]; /* book-keeping of prefixes */
  #endif
  };
  
+/* ADD|DEL entries saved during resize */
+struct mtype_resize_ad {
+       struct list_head list;
+       enum ipset_adt ad;      /* ADD|DEL element */
+       struct mtype_elem d;    /* Element value */
+       struct ip_set_ext ext;  /* Extensions for ADD */
+       struct ip_set_ext mext; /* Target extensions for ADD */
+       u32 flags;              /* Flags for ADD */
+};
+
  #ifdef IP_SET_HASH_WITH_NETS
  /* Network cidr size book keeping when the hash stores different
   * sized networks. cidr == real cidr + 1 to support /0.
   */
  static void
-mtype_add_cidr(struct htype *h, u8 cidr, u8 n)
+mtype_add_cidr(struct ip_set *set, struct htype *h, u8 cidr, u8 n)
  {
         int i, j;
  
+       spin_lock_bh(&set->lock);
         /* Add in increasing prefix order, so larger cidr first */
         for (i = 0, j = -1; i < NLEN && h->nets[i].cidr[n]; i++) {
                 if (j != -1) {
@@ -311,7 +362,7 @@ mtype_add_cidr(struct htype *h, u8 cidr, u8 n)
                         j = i;
                 } else if (h->nets[i].cidr[n] == cidr) {
                         h->nets[CIDR_POS(cidr)].nets[n]++;
-                       return;
+                       goto unlock;
                 }
         }
         if (j != -1) {
@@ -320,24 +371,29 @@ mtype_add_cidr(struct htype *h, u8 cidr, u8 n)
         }
         h->nets[i].cidr[n] = cidr;
         h->nets[CIDR_POS(cidr)].nets[n] = 1;
+unlock:
+       spin_unlock_bh(&set->lock);
  }
  
  static void
-mtype_del_cidr(struct htype *h, u8 cidr, u8 n)
+mtype_del_cidr(struct ip_set *set, struct htype *h, u8 cidr, u8 n)
  {
         u8 i, j, net_end = NLEN - 1;
  
+       spin_lock_bh(&set->lock);
         for (i = 0; i < NLEN; i++) {
                 if (h->nets[i].cidr[n] != cidr)
                         continue;
                 h->nets[CIDR_POS(cidr)].nets[n]--;
                 if (h->nets[CIDR_POS(cidr)].nets[n] > 0)
-                       return;
+                       goto unlock;
                 for (j = i; j < net_end && h->nets[j].cidr[n]; j++)
                         h->nets[j].cidr[n] = h->nets[j + 1].cidr[n];
                 h->nets[j].cidr[n] = 0;
-               return;
+               goto unlock;
         }
+unlock:
+       spin_unlock_bh(&set->lock);
  }
  #endif
  
@@ -345,7 +401,7 @@ mtype_del_cidr(struct htype *h, u8 cidr, u8 n)
  static size_t
  mtype_ahash_memsize(const struct htype *h, const struct htable *t)
  {
-       return sizeof(*h) + sizeof(*t);
+       return sizeof(*h) + sizeof(*t) + ahash_sizeof_regions(t->htable_bits);
  }
  
  /* Get the ith element from the array block n */
@@ -369,24 +425,29 @@ mtype_flush(struct ip_set *set)
         struct htype *h = set->data;
         struct htable *t;
         struct hbucket *n;
-       u32 i;
-
-       t = ipset_dereference_protected(h->table, set);
-       for (i = 0; i < jhash_size(t->htable_bits); i++) {
-               n = __ipset_dereference_protected(hbucket(t, i), 1);
-               if (!n)
-                       continue;
-               if (set->extensions & IPSET_EXT_DESTROY)
-                       mtype_ext_cleanup(set, n);
-               /* FIXME: use slab cache */
-               rcu_assign_pointer(hbucket(t, i), NULL);
-               kfree_rcu(n, rcu);
+       u32 r, i;
+
+       t = ipset_dereference_nfnl(h->table);
+       for (r = 0; r < ahash_numof_locks(t->htable_bits); r++) {
+               spin_lock_bh(&t->hregion[r].lock);
+               for (i = ahash_bucket_start(r, t->htable_bits);
+                    i < ahash_bucket_end(r, t->htable_bits); i++) {
+                       n = __ipset_dereference(hbucket(t, i));
+                       if (!n)
+                               continue;
+                       if (set->extensions & IPSET_EXT_DESTROY)
+                               mtype_ext_cleanup(set, n);
+                       /* FIXME: use slab cache */
+                       rcu_assign_pointer(hbucket(t, i), NULL);
+                       kfree_rcu(n, rcu);
+               }
+               t->hregion[r].ext_size = 0;
+               t->hregion[r].elements = 0;
+               spin_unlock_bh(&t->hregion[r].lock);
         }
  #ifdef IP_SET_HASH_WITH_NETS
         memset(h->nets, 0, sizeof(h->nets));
  #endif
-       set->elements = 0;
-       set->ext_size = 0;
  }
  
  /* Destroy the hashtable part of the set */
@@ -397,7 +458,7 @@ mtype_ahash_destroy(struct ip_set *set, struct htable *t, bool ext_destroy)
         u32 i;
  
         for (i = 0; i < jhash_size(t->htable_bits); i++) {
-               n = __ipset_dereference_protected(hbucket(t, i), 1);
+               n = __ipset_dereference(hbucket(t, i));
                 if (!n)
                         continue;
                 if (set->extensions & IPSET_EXT_DESTROY && ext_destroy)
@@ -406,6 +467,7 @@ mtype_ahash_destroy(struct ip_set *set, struct htable *t, bool ext_destroy)
                 kfree(n);
         }
  
+       ip_set_free(t->hregion);
         ip_set_free(t);
  }
  
@@ -414,28 +476,21 @@ static void
  mtype_destroy(struct ip_set *set)
  {
         struct htype *h = set->data;
+       struct list_head *l, *lt;
  
         if (SET_WITH_TIMEOUT(set))
-               del_timer_sync(&h->gc);
+               cancel_delayed_work_sync(&h->gc.dwork);
  
-       mtype_ahash_destroy(set,
-                           __ipset_dereference_protected(h->table, 1), true);
+       mtype_ahash_destroy(set, ipset_dereference_nfnl(h->table), true);
+       list_for_each_safe(l, lt, &h->ad) {
+               list_del(l);
+               kfree(l);
+       }
         kfree(h);
  
         set->data = NULL;
  }
  
-static void
-mtype_gc_init(struct ip_set *set, void (*gc)(struct timer_list *t))
-{
-       struct htype *h = set->data;
-
-       timer_setup(&h->gc, gc, 0);
-       mod_timer(&h->gc, jiffies + IPSET_GC_PERIOD(set->timeout) * HZ);
-       pr_debug("gc initialized, run in every %u\n",
-                IPSET_GC_PERIOD(set->timeout));
-}
-
  static bool
  mtype_same_set(const struct ip_set *a, const struct ip_set *b)
  {
@@ -454,11 +509,9 @@ mtype_same_set(const struct ip_set *a, const struct ip_set *b)
                a->extensions == b->extensions;
  }
  
-/* Delete expired elements from the hashtable */
  static void
-mtype_expire(struct ip_set *set, struct htype *h)
+mtype_gc_do(struct ip_set *set, struct htype *h, struct htable *t, u32 r)
  {
-       struct htable *t;
         struct hbucket *n, *tmp;
         struct mtype_elem *data;
         u32 i, j, d;
@@ -466,10 +519,12 @@ mtype_expire(struct ip_set *set, struct htype *h)
  #ifdef IP_SET_HASH_WITH_NETS
         u8 k;
  #endif
+       u8 htable_bits = t->htable_bits;
  
-       t = ipset_dereference_protected(h->table, set);
-       for (i = 0; i < jhash_size(t->htable_bits); i++) {
-               n = __ipset_dereference_protected(hbucket(t, i), 1);
+       spin_lock_bh(&t->hregion[r].lock);
+       for (i = ahash_bucket_start(r, htable_bits);
+            i < ahash_bucket_end(r, htable_bits); i++) {
+               n = __ipset_dereference(hbucket(t, i));
                 if (!n)
                         continue;
                 for (j = 0, d = 0; j < n->pos; j++) {
@@ -485,58 +540,100 @@ mtype_expire(struct ip_set *set, struct htype *h)
                         smp_mb__after_atomic();
  #ifdef IP_SET_HASH_WITH_NETS
                         for (k = 0; k < IPSET_NET_COUNT; k++)
-                               mtype_del_cidr(h,
+                               mtype_del_cidr(set, h,
                                         NCIDR_PUT(DCIDR_GET(data->cidr, k)),
                                         k);
  #endif
+                       t->hregion[r].elements--;
                         ip_set_ext_destroy(set, data);
-                       set->elements--;
                         d++;
                 }
                 if (d >= AHASH_INIT_SIZE) {
                         if (d >= n->size) {
+                               t->hregion[r].ext_size -=
+                                       ext_size(n->size, dsize);
                                 rcu_assign_pointer(hbucket(t, i), NULL);
                                 kfree_rcu(n, rcu);
                                 continue;
                         }
                         tmp = kzalloc(sizeof(*tmp) +
-                                     (n->size - AHASH_INIT_SIZE) * dsize,
-                                     GFP_ATOMIC);
+                               (n->size - AHASH_INIT_SIZE) * dsize,
+                               GFP_ATOMIC);
                         if (!tmp)
-                               /* Still try to delete expired elements */
+                               /* Still try to delete expired elements. */
                                 continue;
                         tmp->size = n->size - AHASH_INIT_SIZE;
                         for (j = 0, d = 0; j < n->pos; j++) {
                                 if (!test_bit(j, n->used))
                                         continue;
                                 data = ahash_data(n, j, dsize);
-                               memcpy(tmp->value + d * dsize, data, dsize);
+                               memcpy(tmp->value + d * dsize,
+                                      data, dsize);
                                 set_bit(d, tmp->used);
                                 d++;
                         }
                         tmp->pos = d;
-                       set->ext_size -= ext_size(AHASH_INIT_SIZE, dsize);
+                       t->hregion[r].ext_size -=
+                               ext_size(AHASH_INIT_SIZE, dsize);
                         rcu_assign_pointer(hbucket(t, i), tmp);
                         kfree_rcu(n, rcu);
                 }
         }
+       spin_unlock_bh(&t->hregion[r].lock);
  }
  
  static void
-mtype_gc(struct timer_list *t)
+mtype_gc(struct work_struct *work)
  {
-       struct htype *h = from_timer(h, t, gc);
-       struct ip_set *set = h->set;
+       struct htable_gc *gc;
+       struct ip_set *set;
+       struct htype *h;
+       struct htable *t;
+       u32 r, numof_locks;
+       unsigned int next_run;
+
+       gc = container_of(work, struct htable_gc, dwork.work);
+       set = gc->set;
+       h = set->data;
  
-       pr_debug("called\n");
         spin_lock_bh(&set->lock);
-       mtype_expire(set, h);
+       t = ipset_dereference_set(h->table, set);
+       atomic_inc(&t->uref);
+       numof_locks = ahash_numof_locks(t->htable_bits);
+       r = gc->region++;
+       if (r >= numof_locks) {
+               r = gc->region = 0;
+       }
+       next_run = (IPSET_GC_PERIOD(set->timeout) * HZ) / numof_locks;
+       if (next_run < HZ/10)
+               next_run = HZ/10;
         spin_unlock_bh(&set->lock);
  
-       h->gc.expires = jiffies + IPSET_GC_PERIOD(set->timeout) * HZ;
-       add_timer(&h->gc);
+       mtype_gc_do(set, h, t, r);
+
+       if (atomic_dec_and_test(&t->uref) && atomic_read(&t->ref)) {
+               pr_debug("Table destroy after resize by expire: %p\n", t);
+               mtype_ahash_destroy(set, t, false);
+       }
+
+       queue_delayed_work(system_power_efficient_wq, &gc->dwork, next_run);
+
+}
+
+static void
+mtype_gc_init(struct htable_gc *gc)
+{
+       INIT_DEFERRABLE_WORK(&gc->dwork, mtype_gc);
+       queue_delayed_work(system_power_efficient_wq, &gc->dwork, HZ);
  }
  
+static int
+mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext,
+         struct ip_set_ext *mext, u32 flags);
+static int
+mtype_del(struct ip_set *set, void *value, const struct ip_set_ext *ext,
+         struct ip_set_ext *mext, u32 flags);
+
  /* Resize a hash: create a new hash table with doubling the hashsize
   * and inserting the elements to it. Repeat until we succeed or
   * fail due to memory pressures.
@@ -547,7 +644,7 @@ mtype_resize(struct ip_set *set, bool retried)
         struct htype *h = set->data;
         struct htable *t, *orig;
         u8 htable_bits;
-       size_t extsize, dsize = set->dsize;
+       size_t dsize = set->dsize;
  #ifdef IP_SET_HASH_WITH_NETS
         u8 flags;
         struct mtype_elem *tmp;
@@ -555,7 +652,9 @@ mtype_resize(struct ip_set *set, bool retried)
         struct mtype_elem *data;
         struct mtype_elem *d;
         struct hbucket *n, *m;
-       u32 i, j, key;
+       struct list_head *l, *lt;
+       struct mtype_resize_ad *x;
+       u32 i, j, r, nr, key;
         int ret;
  
  #ifdef IP_SET_HASH_WITH_NETS
@@ -563,10 +662,8 @@ mtype_resize(struct ip_set *set, bool retried)
         if (!tmp)
                 return -ENOMEM;
  #endif
-       rcu_read_lock_bh();
-       orig = rcu_dereference_bh_nfnl(h->table);
+       orig = ipset_dereference_bh_nfnl(h->table);
         htable_bits = orig->htable_bits;
-       rcu_read_unlock_bh();
  
  retry:
         ret = 0;
@@ -583,88 +680,124 @@ retry:
                 ret = -ENOMEM;
                 goto out;
         }
+       t->hregion = ip_set_alloc(ahash_sizeof_regions(htable_bits));
+       if (!t->hregion) {
+               kfree(t);
+               ret = -ENOMEM;
+               goto out;
+       }
         t->htable_bits = htable_bits;
+       t->maxelem = h->maxelem / ahash_numof_locks(htable_bits);
+       for (i = 0; i < ahash_numof_locks(htable_bits); i++)
+               spin_lock_init(&t->hregion[i].lock);
  
-       spin_lock_bh(&set->lock);
-       orig = __ipset_dereference_protected(h->table, 1);
-       /* There can't be another parallel resizing, but dumping is possible */
+       /* There can't be another parallel resizing,
+        * but dumping, gc, kernel side add/del are possible
+        */
+       orig = ipset_dereference_bh_nfnl(h->table);
         atomic_set(&orig->ref, 1);
         atomic_inc(&orig->uref);
-       extsize = 0;
         pr_debug("attempt to resize set %s from %u to %u, t %p\n",
                  set->name, orig->htable_bits, htable_bits, orig);
-       for (i = 0; i < jhash_size(orig->htable_bits); i++) {
-               n = __ipset_dereference_protected(hbucket(orig, i), 1);
-               if (!n)
-                       continue;
-               for (j = 0; j < n->pos; j++) {
-                       if (!test_bit(j, n->used))
+       for (r = 0; r < ahash_numof_locks(orig->htable_bits); r++) {
+               /* Expire may replace a hbucket with another one */
+               rcu_read_lock_bh();
+               for (i = ahash_bucket_start(r, orig->htable_bits);
+                    i < ahash_bucket_end(r, orig->htable_bits); i++) {
+                       n = __ipset_dereference(hbucket(orig, i));
+                       if (!n)
                                 continue;
-                       data = ahash_data(n, j, dsize);
+                       for (j = 0; j < n->pos; j++) {
+                               if (!test_bit(j, n->used))
+                                       continue;
+                               data = ahash_data(n, j, dsize);
+                               if (SET_ELEM_EXPIRED(set, data))
+                                       continue;
  #ifdef IP_SET_HASH_WITH_NETS
-                       /* We have readers running parallel with us,
-                        * so the live data cannot be modified.
-                        */
-                       flags = 0;
-                       memcpy(tmp, data, dsize);
-                       data = tmp;
-                       mtype_data_reset_flags(data, &flags);
+                               /* We have readers running parallel with us,
+                                * so the live data cannot be modified.
+                                */
+                               flags = 0;
+                               memcpy(tmp, data, dsize);
+                               data = tmp;
+                               mtype_data_reset_flags(data, &flags);
  #endif
-                       key = HKEY(data, h->initval, htable_bits);
-                       m = __ipset_dereference_protected(hbucket(t, key), 1);
-                       if (!m) {
-                               m = kzalloc(sizeof(*m) +
+                               key = HKEY(data, h->initval, htable_bits);
+                               m = __ipset_dereference(hbucket(t, key));
+                               nr = ahash_region(key, htable_bits);
+                               if (!m) {
+                                       m = kzalloc(sizeof(*m) +
                                             AHASH_INIT_SIZE * dsize,
                                             GFP_ATOMIC);
-                               if (!m) {
-                                       ret = -ENOMEM;
-                                       goto cleanup;
-                               }
-                               m->size = AHASH_INIT_SIZE;
-                               extsize += ext_size(AHASH_INIT_SIZE, dsize);
-                               RCU_INIT_POINTER(hbucket(t, key), m);
-                       } else if (m->pos >= m->size) {
-                               struct hbucket *ht;
-
-                               if (m->size >= AHASH_MAX(h)) {
-                                       ret = -EAGAIN;
-                               } else {
-                                       ht = kzalloc(sizeof(*ht) +
+                                       if (!m) {
+                                               ret = -ENOMEM;
+                                               goto cleanup;
+                                       }
+                                       m->size = AHASH_INIT_SIZE;
+                                       t->hregion[nr].ext_size +=
+                                               ext_size(AHASH_INIT_SIZE,
+                                                        dsize);
+                                       RCU_INIT_POINTER(hbucket(t, key), m);
+                               } else if (m->pos >= m->size) {
+                                       struct hbucket *ht;
+
+                                       if (m->size >= AHASH_MAX(h)) {
+                                               ret = -EAGAIN;
+                                       } else {
+                                               ht = kzalloc(sizeof(*ht) +
                                                 (m->size + AHASH_INIT_SIZE)
                                                 * dsize,
                                                 GFP_ATOMIC);
-                                       if (!ht)
-                                               ret = -ENOMEM;
+                                               if (!ht)
+                                                       ret = -ENOMEM;
+                                       }
+                                       if (ret < 0)
+                                               goto cleanup;
+                                       memcpy(ht, m, sizeof(struct hbucket) +
+                                              m->size * dsize);
+                                       ht->size = m->size + AHASH_INIT_SIZE;
+                                       t->hregion[nr].ext_size +=
+                                               ext_size(AHASH_INIT_SIZE,
+                                                        dsize);
+                                       kfree(m);
+                                       m = ht;
+                                       RCU_INIT_POINTER(hbucket(t, key), ht);
                                 }
-                               if (ret < 0)
-                                       goto cleanup;
-                               memcpy(ht, m, sizeof(struct hbucket) +
-                                             m->size * dsize);
-                               ht->size = m->size + AHASH_INIT_SIZE;
-                               extsize += ext_size(AHASH_INIT_SIZE, dsize);
-                               kfree(m);
-                               m = ht;
-                               RCU_INIT_POINTER(hbucket(t, key), ht);
-                       }
-                       d = ahash_data(m, m->pos, dsize);
-                       memcpy(d, data, dsize);
-                       set_bit(m->pos++, m->used);
+                               d = ahash_data(m, m->pos, dsize);
+                               memcpy(d, data, dsize);
+                               set_bit(m->pos++, m->used);
+                               t->hregion[nr].elements++;
  #ifdef IP_SET_HASH_WITH_NETS
-                       mtype_data_reset_flags(d, &flags);
+                               mtype_data_reset_flags(d, &flags);
  #endif
+                       }
                 }
+               rcu_read_unlock_bh();
         }
-       rcu_assign_pointer(h->table, t);
-       set->ext_size = extsize;
  
-       spin_unlock_bh(&set->lock);
+       /* There can't be any other writer. */
+       rcu_assign_pointer(h->table, t);
  
         /* Give time to other readers of the set */
         synchronize_rcu();
  
         pr_debug("set %s resized from %u (%p) to %u (%p)\n", set->name,
                  orig->htable_bits, orig, t->htable_bits, t);
-       /* If there's nobody else dumping the table, destroy it */
+       /* Add/delete elements processed by the SET target during resize.
+        * Kernel-side add cannot trigger a resize and userspace actions
+        * are serialized by the mutex.
+        */
+       list_for_each_safe(l, lt, &h->ad) {
+               x = list_entry(l, struct mtype_resize_ad, list);
+               if (x->ad == IPSET_ADD) {
+                       mtype_add(set, &x->d, &x->ext, &x->mext, x->flags);
+               } else {
+                       mtype_del(set, &x->d, NULL, NULL, 0);
+               }
+               list_del(l);
+               kfree(l);
+       }
+       /* If there's nobody else using the table, destroy it */
         if (atomic_dec_and_test(&orig->uref)) {
                 pr_debug("Table destroy by resize %p\n", orig);
                 mtype_ahash_destroy(set, orig, false);
@@ -677,15 +810,44 @@ out:
         return ret;
  
  cleanup:
+       rcu_read_unlock_bh();
         atomic_set(&orig->ref, 0);
         atomic_dec(&orig->uref);
-       spin_unlock_bh(&set->lock);
         mtype_ahash_destroy(set, t, false);
         if (ret == -EAGAIN)
                 goto retry;
         goto out;
  }
  
+/* Get the current number of elements and ext_size in the set  */
+static void
+mtype_ext_size(struct ip_set *set, u32 *elements, size_t *ext_size)
+{
+       struct htype *h = set->data;
+       const struct htable *t;
+       u32 i, j, r;
+       struct hbucket *n;
+       struct mtype_elem *data;
+
+       t = rcu_dereference_bh(h->table);
+       for (r = 0; r < ahash_numof_locks(t->htable_bits); r++) {
+               for (i = ahash_bucket_start(r, t->htable_bits);
+                    i < ahash_bucket_end(r, t->htable_bits); i++) {
+                       n = rcu_dereference_bh(hbucket(t, i));
+                       if (!n)
+                               continue;
+                       for (j = 0; j < n->pos; j++) {
+                               if (!test_bit(j, n->used))
+                                       continue;
+                               data = ahash_data(n, j, set->dsize);
+                               if (!SET_ELEM_EXPIRED(set, data))
+                                       (*elements)++;
+                       }
+               }
+               *ext_size += t->hregion[r].ext_size;
+       }
+}
+
  /* Add an element to a hash and update the internal counters when succeeded,
   * otherwise report the proper error code.
   */
@@ -698,32 +860,49 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext,
         const struct mtype_elem *d = value;
         struct mtype_elem *data;
         struct hbucket *n, *old = ERR_PTR(-ENOENT);
-       int i, j = -1;
+       int i, j = -1, ret;
         bool flag_exist = flags & IPSET_FLAG_EXIST;
         bool deleted = false, forceadd = false, reuse = false;
-       u32 key, multi = 0;
+       u32 r, key, multi = 0, elements, maxelem;
  
-       if (set->elements >= h->maxelem) {
-               if (SET_WITH_TIMEOUT(set))
-                       /* FIXME: when set is full, we slow down here */
-                       mtype_expire(set, h);
-               if (set->elements >= h->maxelem && SET_WITH_FORCEADD(set))
+       rcu_read_lock_bh();
+       t = rcu_dereference_bh(h->table);
+       key = HKEY(value, h->initval, t->htable_bits);
+       r = ahash_region(key, t->htable_bits);
+       atomic_inc(&t->uref);
+       elements = t->hregion[r].elements;
+       maxelem = t->maxelem;
+       if (elements >= maxelem) {
+               u32 e;
+               if (SET_WITH_TIMEOUT(set)) {
+                       rcu_read_unlock_bh();
+                       mtype_gc_do(set, h, t, r);
+                       rcu_read_lock_bh();
+               }
+               maxelem = h->maxelem;
+               elements = 0;
+               for (e = 0; e < ahash_numof_locks(t->htable_bits); e++)
+                       elements += t->hregion[e].elements;
+               if (elements >= maxelem && SET_WITH_FORCEADD(set))
                         forceadd = true;
         }
+       rcu_read_unlock_bh();
  
-       t = ipset_dereference_protected(h->table, set);
-       key = HKEY(value, h->initval, t->htable_bits);
-       n = __ipset_dereference_protected(hbucket(t, key), 1);
+       spin_lock_bh(&t->hregion[r].lock);
+       n = rcu_dereference_bh(hbucket(t, key));
         if (!n) {
-               if (forceadd || set->elements >= h->maxelem)
+               if (forceadd || elements >= maxelem)
                         goto set_full;
                 old = NULL;
                 n = kzalloc(sizeof(*n) + AHASH_INIT_SIZE * set->dsize,
                             GFP_ATOMIC);
-               if (!n)
-                       return -ENOMEM;
+               if (!n) {
+                       ret = -ENOMEM;
+                       goto unlock;
+               }
                 n->size = AHASH_INIT_SIZE;
-               set->ext_size += ext_size(AHASH_INIT_SIZE, set->dsize);
+               t->hregion[r].ext_size +=
+                       ext_size(AHASH_INIT_SIZE, set->dsize);
                 goto copy_elem;
         }
         for (i = 0; i < n->pos; i++) {
@@ -737,38 +916,37 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext,
                 }
                 data = ahash_data(n, i, set->dsize);
                 if (mtype_data_equal(data, d, &multi)) {
-                       if (flag_exist ||
-                           (SET_WITH_TIMEOUT(set) &&
-                            ip_set_timeout_expired(ext_timeout(data, set)))) {
+                       if (flag_exist || SET_ELEM_EXPIRED(set, data)) {
                                 /* Just the extensions could be overwritten */
                                 j = i;
                                 goto overwrite_extensions;
                         }
-                       return -IPSET_ERR_EXIST;
+                       ret = -IPSET_ERR_EXIST;
+                       goto unlock;
                 }
                 /* Reuse first timed out entry */
-               if (SET_WITH_TIMEOUT(set) &&
-                   ip_set_timeout_expired(ext_timeout(data, set)) &&
-                   j == -1) {
+               if (SET_ELEM_EXPIRED(set, data) && j == -1) {
                         j = i;
                         reuse = true;
                 }
         }
         if (reuse || forceadd) {
+               if (j == -1)
+                       j = 0;
                 data = ahash_data(n, j, set->dsize);
                 if (!deleted) {
  #ifdef IP_SET_HASH_WITH_NETS
                         for (i = 0; i < IPSET_NET_COUNT; i++)
-                               mtype_del_cidr(h,
+                               mtype_del_cidr(set, h,
                                         NCIDR_PUT(DCIDR_GET(data->cidr, i)),
                                         i);
  #endif
                         ip_set_ext_destroy(set, data);
-                       set->elements--;
+                       t->hregion[r].elements--;
                 }
                 goto copy_data;
         }
-       if (set->elements >= h->maxelem)
+       if (elements >= maxelem)
                 goto set_full;
         /* Create a new slot */
         if (n->pos >= n->size) {
@@ -776,28 +954,32 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext,
                 if (n->size >= AHASH_MAX(h)) {
                         /* Trigger rehashing */
                         mtype_data_next(&h->next, d);
-                       return -EAGAIN;
+                       ret = -EAGAIN;
+                       goto resize;
                 }
                 old = n;
                 n = kzalloc(sizeof(*n) +
                             (old->size + AHASH_INIT_SIZE) * set->dsize,
                             GFP_ATOMIC);
-               if (!n)
-                       return -ENOMEM;
+               if (!n) {
+                       ret = -ENOMEM;
+                       goto unlock;
+               }
                 memcpy(n, old, sizeof(struct hbucket) +
                        old->size * set->dsize);
                 n->size = old->size + AHASH_INIT_SIZE;
-               set->ext_size += ext_size(AHASH_INIT_SIZE, set->dsize);
+               t->hregion[r].ext_size +=
+                       ext_size(AHASH_INIT_SIZE, set->dsize);
         }
  
  copy_elem:
         j = n->pos++;
         data = ahash_data(n, j, set->dsize);
  copy_data:
-       set->elements++;
+       t->hregion[r].elements++;
  #ifdef IP_SET_HASH_WITH_NETS
         for (i = 0; i < IPSET_NET_COUNT; i++)
-               mtype_add_cidr(h, NCIDR_PUT(DCIDR_GET(d->cidr, i)), i);
+               mtype_add_cidr(set, h, NCIDR_PUT(DCIDR_GET(d->cidr, i)), i);
  #endif
         memcpy(data, d, sizeof(struct mtype_elem));
  overwrite_extensions:
@@ -820,13 +1002,41 @@ overwrite_extensions:
                 if (old)
                         kfree_rcu(old, rcu);
         }
+       ret = 0;
+resize:
+       spin_unlock_bh(&t->hregion[r].lock);
+       if (atomic_read(&t->ref) && ext->target) {
+               /* Resize is in process and kernel side add, save values */
+               struct mtype_resize_ad *x;
+
+               x = kzalloc(sizeof(struct mtype_resize_ad), GFP_ATOMIC);
+               if (!x)
+                       /* Don't bother */
+                       goto out;
+               x->ad = IPSET_ADD;
+               memcpy(&x->d, value, sizeof(struct mtype_elem));
+               memcpy(&x->ext, ext, sizeof(struct ip_set_ext));
+               memcpy(&x->mext, mext, sizeof(struct ip_set_ext));
+               x->flags = flags;
+               spin_lock_bh(&set->lock);
+               list_add_tail(&x->list, &h->ad);
+               spin_unlock_bh(&set->lock);
+       }
+       goto out;
  
-       return 0;
  set_full:
         if (net_ratelimit())
                 pr_warn("Set %s is full, maxelem %u reached\n",
-                       set->name, h->maxelem);
-       return -IPSET_ERR_HASH_FULL;
+                       set->name, maxelem);
+       ret = -IPSET_ERR_HASH_FULL;
+unlock:
+       spin_unlock_bh(&t->hregion[r].lock);
+out:
+       if (atomic_dec_and_test(&t->uref) && atomic_read(&t->ref)) {
+               pr_debug("Table destroy after resize by add: %p\n", t);
+               mtype_ahash_destroy(set, t, false);
+       }
+       return ret;
  }
  
  /* Delete an element from the hash and free up space if possible.
@@ -840,13 +1050,23 @@ mtype_del(struct ip_set *set, void *value, const struct ip_set_ext *ext,
         const struct mtype_elem *d = value;
         struct mtype_elem *data;
         struct hbucket *n;
-       int i, j, k, ret = -IPSET_ERR_EXIST;
+       struct mtype_resize_ad *x = NULL;
+       int i, j, k, r, ret = -IPSET_ERR_EXIST;
         u32 key, multi = 0;
         size_t dsize = set->dsize;
  
-       t = ipset_dereference_protected(h->table, set);
+       /* Userspace add and resize is excluded by the mutex.
+        * Kernelspace add does not trigger resize.
+        */
+       rcu_read_lock_bh();
+       t = rcu_dereference_bh(h->table);
         key = HKEY(value, h->initval, t->htable_bits);
-       n = __ipset_dereference_protected(hbucket(t, key), 1);
+       r = ahash_region(key, t->htable_bits);
+       atomic_inc(&t->uref);
+       rcu_read_unlock_bh();
+
+       spin_lock_bh(&t->hregion[r].lock);
+       n = rcu_dereference_bh(hbucket(t, key));
         if (!n)
                 goto out;
         for (i = 0, k = 0; i < n->pos; i++) {
@@ -857,8 +1077,7 @@ mtype_del(struct ip_set *set, void *value, const struct ip_set_ext *ext,
                 data = ahash_data(n, i, dsize);
                 if (!mtype_data_equal(data, d, &multi))
                         continue;
-               if (SET_WITH_TIMEOUT(set) &&
-                   ip_set_timeout_expired(ext_timeout(data, set)))
+               if (SET_ELEM_EXPIRED(set, data))
                         goto out;
  
                 ret = 0;
@@ -866,20 +1085,33 @@ mtype_del(struct ip_set *set, void *value, const struct ip_set_ext *ext,
                 smp_mb__after_atomic();
                 if (i + 1 == n->pos)
                         n->pos--;
-               set->elements--;
+               t->hregion[r].elements--;
  #ifdef IP_SET_HASH_WITH_NETS
                 for (j = 0; j < IPSET_NET_COUNT; j++)
-                       mtype_del_cidr(h, NCIDR_PUT(DCIDR_GET(d->cidr, j)),
-                                      j);
+                       mtype_del_cidr(set, h,
+                                      NCIDR_PUT(DCIDR_GET(d->cidr, j)), j);
  #endif
                 ip_set_ext_destroy(set, data);
  
+               if (atomic_read(&t->ref) && ext->target) {
+                       /* Resize is in process and kernel side del,
+                        * save values
+                        */
+                       x = kzalloc(sizeof(struct mtype_resize_ad),
+                                   GFP_ATOMIC);
+                       if (x) {
+                               x->ad = IPSET_DEL;
+                               memcpy(&x->d, value,
+                                      sizeof(struct mtype_elem));
+                               x->flags = flags;
+                       }
+               }
                 for (; i < n->pos; i++) {
                         if (!test_bit(i, n->used))
                                 k++;
                 }
                 if (n->pos == 0 && k == 0) {
-                       set->ext_size -= ext_size(n->size, dsize);
+                       t->hregion[r].ext_size -= ext_size(n->size, dsize);
                         rcu_assign_pointer(hbucket(t, key), NULL);
                         kfree_rcu(n, rcu);
                 } else if (k >= AHASH_INIT_SIZE) {
@@ -898,7 +1130,8 @@ mtype_del(struct ip_set *set, void *value, const struct ip_set_ext *ext,
                                 k++;
                         }
                         tmp->pos = k;
-                       set->ext_size -= ext_size(AHASH_INIT_SIZE, dsize);
+                       t->hregion[r].ext_size -=
+                               ext_size(AHASH_INIT_SIZE, dsize);
                         rcu_assign_pointer(hbucket(t, key), tmp);
                         kfree_rcu(n, rcu);
                 }
@@ -906,6 +1139,16 @@ mtype_del(struct ip_set *set, void *value, const struct ip_set_ext *ext,
         }
  
  out:
+       spin_unlock_bh(&t->hregion[r].lock);
+       if (x) {
+               spin_lock_bh(&set->lock);
+               list_add(&x->list, &h->ad);
+               spin_unlock_bh(&set->lock);
+       }
+       if (atomic_dec_and_test(&t->uref) && atomic_read(&t->ref)) {
+               pr_debug("Table destroy after resize by del: %p\n", t);
+               mtype_ahash_destroy(set, t, false);
+       }
         return ret;
  }
  
@@ -991,6 +1234,7 @@ mtype_test(struct ip_set *set, void *value, const struct ip_set_ext *ext,
         int i, ret = 0;
         u32 key, multi = 0;
  
+       rcu_read_lock_bh();
         t = rcu_dereference_bh(h->table);
  #ifdef IP_SET_HASH_WITH_NETS
         /* If we test an IP address and not a network address,
@@ -1022,6 +1266,7 @@ mtype_test(struct ip_set *set, void *value, const struct ip_set_ext *ext,
                         goto out;
         }
  out:
+       rcu_read_unlock_bh();
         return ret;
  }
  
@@ -1033,23 +1278,14 @@ mtype_head(struct ip_set *set, struct sk_buff *skb)
         const struct htable *t;
         struct nlattr *nested;
         size_t memsize;
+       u32 elements = 0;
+       size_t ext_size = 0;
         u8 htable_bits;
  
-       /* If any members have expired, set->elements will be wrong
-        * mytype_expire function will update it with the right count.
-        * we do not hold set->lock here, so grab it first.
-        * set->elements can still be incorrect in the case of a huge set,
-        * because elements might time out during the listing.
-        */
-       if (SET_WITH_TIMEOUT(set)) {
-               spin_lock_bh(&set->lock);
-               mtype_expire(set, h);
-               spin_unlock_bh(&set->lock);
-       }
-
         rcu_read_lock_bh();
-       t = rcu_dereference_bh_nfnl(h->table);
-       memsize = mtype_ahash_memsize(h, t) + set->ext_size;
+       t = rcu_dereference_bh(h->table);
+       mtype_ext_size(set, &elements, &ext_size);
+       memsize = mtype_ahash_memsize(h, t) + ext_size + set->ext_size;
         htable_bits = t->htable_bits;
         rcu_read_unlock_bh();
  
@@ -1071,7 +1307,7 @@ mtype_head(struct ip_set *set, struct sk_buff *skb)
  #endif
         if (nla_put_net32(skb, IPSET_ATTR_REFERENCES, htonl(set->ref)) ||
             nla_put_net32(skb, IPSET_ATTR_MEMSIZE, htonl(memsize)) ||
-           nla_put_net32(skb, IPSET_ATTR_ELEMENTS, htonl(set->elements)))
+           nla_put_net32(skb, IPSET_ATTR_ELEMENTS, htonl(elements)))
                 goto nla_put_failure;
         if (unlikely(ip_set_put_flags(skb, set)))
                 goto nla_put_failure;
@@ -1091,15 +1327,15 @@ mtype_uref(struct ip_set *set, struct netlink_callback *cb, bool start)
  
         if (start) {
                 rcu_read_lock_bh();
-               t = rcu_dereference_bh_nfnl(h->table);
+               t = ipset_dereference_bh_nfnl(h->table);
                 atomic_inc(&t->uref);
                 cb->args[IPSET_CB_PRIVATE] = (unsigned long)t;
                 rcu_read_unlock_bh();
         } else if (cb->args[IPSET_CB_PRIVATE]) {
                 t = (struct htable *)cb->args[IPSET_CB_PRIVATE];
                 if (atomic_dec_and_test(&t->uref) && atomic_read(&t->ref)) {
-                       /* Resizing didn't destroy the hash table */
-                       pr_debug("Table destroy by dump: %p\n", t);
+                       pr_debug("Table destroy after resize "
+                                " by dump: %p\n", t);
                         mtype_ahash_destroy(set, t, false);
                 }
                 cb->args[IPSET_CB_PRIVATE] = 0;
@@ -1141,8 +1377,7 @@ mtype_list(const struct ip_set *set,
                         if (!test_bit(i, n->used))
                                 continue;
                         e = ahash_data(n, i, set->dsize);
-                       if (SET_WITH_TIMEOUT(set) &&
-                           ip_set_timeout_expired(ext_timeout(e, set)))
+                       if (SET_ELEM_EXPIRED(set, e))
                                 continue;
                         pr_debug("list hash %lu hbucket %p i %u, data %p\n",
                                  cb->args[IPSET_CB_ARG0], n, i, e);
@@ -1208,6 +1443,7 @@ static const struct ip_set_type_variant mtype_variant = {
         .uref   = mtype_uref,
         .resize = mtype_resize,
         .same_set = mtype_same_set,
+       .region_lock = true,
  };
  
  #ifdef IP_SET_EMIT_CREATE
@@ -1226,6 +1462,7 @@ IPSET_TOKEN(HTYPE, _create)(struct net *net, struct ip_set *set,
         size_t hsize;
         struct htype *h;
         struct htable *t;
+       u32 i;
  
         pr_debug("Create set %s with family %s\n",
                  set->name, set->family == NFPROTO_IPV4 ? "inet" : "inet6");
@@ -1294,6 +1531,15 @@ IPSET_TOKEN(HTYPE, _create)(struct net *net, struct ip_set *set,
                 kfree(h);
                 return -ENOMEM;
         }
+       t->hregion = ip_set_alloc(ahash_sizeof_regions(hbits));
+       if (!t->hregion) {
+               kfree(t);
+               kfree(h);
+               return -ENOMEM;
+       }
+       h->gc.set = set;
+       for (i = 0; i < ahash_numof_locks(hbits); i++)
+               spin_lock_init(&t->hregion[i].lock);
         h->maxelem = maxelem;
  #ifdef IP_SET_HASH_WITH_NETMASK
         h->netmask = netmask;
@@ -1304,9 +1550,10 @@ IPSET_TOKEN(HTYPE, _create)(struct net *net, struct ip_set *set,
         get_random_bytes(&h->initval, sizeof(h->initval));
  
         t->htable_bits = hbits;
+       t->maxelem = h->maxelem / ahash_numof_locks(hbits);
         RCU_INIT_POINTER(h->table, t);
  
-       h->set = set;
+       INIT_LIST_HEAD(&h->ad);
         set->data = h;
  #ifndef IP_SET_PROTO_UNDEF
         if (set->family == NFPROTO_IPV4) {
@@ -1329,12 +1576,10 @@ IPSET_TOKEN(HTYPE, _create)(struct net *net, struct ip_set *set,
  #ifndef IP_SET_PROTO_UNDEF
                 if (set->family == NFPROTO_IPV4)
  #endif
-                       IPSET_TOKEN(HTYPE, 4_gc_init)(set,
-                               IPSET_TOKEN(HTYPE, 4_gc));
+                       IPSET_TOKEN(HTYPE, 4_gc_init)(&h->gc);
  #ifndef IP_SET_PROTO_UNDEF
                 else
-                       IPSET_TOKEN(HTYPE, 6_gc_init)(set,
-                               IPSET_TOKEN(HTYPE, 6_gc));
+                       IPSET_TOKEN(HTYPE, 6_gc_init)(&h->gc);
  #endif
         }
         pr_debug("create %s hashsize %u (%u) maxelem %u: %p(%p)\n",
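
The per-region accounting introduced in the mtype_add() hunks above can be tried outside the kernel. What follows is a minimal sketch, not the ipset implementation: it assumes a fixed number of regions and a plain modulo mapping from key to region, and NREGIONS, region_of(), table_init() and table_is_full() are hypothetical names chosen for illustration. Locking and RCU are left out; only the "check the region's share first, sum all regions only when that share is used up" logic is shown.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed region count; the kernel derives it from htable_bits. */
#define NREGIONS 4u

struct region {
	uint32_t elements;	/* entries currently stored in this region */
};

struct table {
	struct region region[NREGIONS];
	uint32_t region_max;	/* per-region share of the set-wide limit */
};

/* Assumed mapping from hash key to region (illustration only). */
static uint32_t region_of(uint32_t key)
{
	return key % NREGIONS;
}

static void table_init(struct table *t, uint32_t maxelem)
{
	for (size_t i = 0; i < NREGIONS; i++)
		t->region[i].elements = 0;
	t->region_max = maxelem / NREGIONS;
}

/*
 * Fast path: a region below its share means the set cannot be full,
 * so only one counter is read. Slow path: the region used up its share,
 * so sum every region and compare against the set-wide limit.
 */
static bool table_is_full(const struct table *t, uint32_t key,
			  uint32_t maxelem)
{
	uint32_t total = 0;

	if (t->region[region_of(key)].elements < t->region_max)
		return false;
	for (size_t i = 0; i < NREGIONS; i++)
		total += t->region[i].elements;
	return total >= maxelem;
}
```

With maxelem 8 and four regions, each region's share is 2. One region holding 2 entries does not make the set full, because the sum over all regions is still below 8; the set reports full only once the total reaches the set-wide limit.
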
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index d1305423640f3abfcaf7e3295022c765d31d82e5..1927fc296f9514bcd5866d340c6f659bea0fdb3e 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -894,32 +894,175 @@ static void nf_ct_acct_merge(struct nf_conn *ct, enum ip_conntrack_info ctinfo,
         }
  }
  
-/* Resolve race on insertion if this protocol allows this. */
+static void __nf_conntrack_insert_prepare(struct nf_conn *ct)
+{
+       struct nf_conn_tstamp *tstamp;
+
+       atomic_inc(&ct->ct_general.use);
+       ct->status |= IPS_CONFIRMED;
+
+       /* set conntrack timestamp, if enabled. */
+       tstamp = nf_conn_tstamp_find(ct);
+       if (tstamp)
+               tstamp->start = ktime_get_real_ns();
+}
+
+static int __nf_ct_resolve_clash(struct sk_buff *skb,
+                                struct nf_conntrack_tuple_hash *h)
+{
+       /* This is the conntrack entry already in hashes that won race. */
+       struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(h);
+       enum ip_conntrack_info ctinfo;
+       struct nf_conn *loser_ct;
+
+       loser_ct = nf_ct_get(skb, &ctinfo);
+
+       if (nf_ct_is_dying(ct))
+               return NF_DROP;
+
+       if (!atomic_inc_not_zero(&ct->ct_general.use))
+               return NF_DROP;
+
+       if (((ct->status & IPS_NAT_DONE_MASK) == 0) ||
+           nf_ct_match(ct, loser_ct)) {
+               struct net *net = nf_ct_net(ct);
+
+               nf_ct_acct_merge(ct, ctinfo, loser_ct);
+               nf_ct_add_to_dying_list(loser_ct);
+               nf_conntrack_put(&loser_ct->ct_general);
+               nf_ct_set(skb, ct, ctinfo);
+
+               NF_CT_STAT_INC(net, insert_failed);
+               return NF_ACCEPT;
+       }
+
+       nf_ct_put(ct);
+       return NF_DROP;
+}
+
+/**
+ * nf_ct_resolve_clash_harder - attempt to insert clashing conntrack entry
+ *
+ * @skb: skb that causes the collision
+ * @repl_idx: hash slot for reply direction
+ *
+ * Called when origin or reply direction had a clash.
+ * The skb can be handled without packet drop provided the reply direction
+ * is unique or the existing entry has the identical tuple in both
+ * directions.
+ *
+ * Caller must hold conntrack table locks to prevent concurrent updates.
+ *
+ * Returns NF_DROP if the clash could not be handled.
+ */
+static int nf_ct_resolve_clash_harder(struct sk_buff *skb, u32 repl_idx)
+{
+       struct nf_conn *loser_ct = (struct nf_conn *)skb_nfct(skb);
+       const struct nf_conntrack_zone *zone;
+       struct nf_conntrack_tuple_hash *h;
+       struct hlist_nulls_node *n;
+       struct net *net;
+
+       zone = nf_ct_zone(loser_ct);
+       net = nf_ct_net(loser_ct);
+
+       /* Reply direction must never result in a clash, unless both origin
+        * and reply tuples are identical.
+        */
+       hlist_nulls_for_each_entry(h, n, &nf_conntrack_hash[repl_idx], hnnode) {
+               if (nf_ct_key_equal(h,
+                                   &loser_ct->tuplehash[IP_CT_DIR_REPLY].tuple,
+                                   zone, net))
+                       return __nf_ct_resolve_clash(skb, h);
+       }
+
+       /* We want the clashing entry to go away real soon: 1 second timeout. */
+       loser_ct->timeout = nfct_time_stamp + HZ;
+
+       /* IPS_NAT_CLASH removes the entry automatically on the first
+        * reply.  Also prevents UDP tracker from moving the entry to
+        * ASSURED state, i.e. the entry can always be evicted under
+        * pressure.
+        */
+       loser_ct->status |= IPS_FIXED_TIMEOUT | IPS_NAT_CLASH;
+
+       __nf_conntrack_insert_prepare(loser_ct);
+
+       /* fake add for ORIGINAL dir: we want lookups to only find the entry
+        * already in the table.  This also hides the clashing entry from
+        * ctnetlink iteration, i.e. conntrack -L won't show them.
+        */
+       hlist_nulls_add_fake(&loser_ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode);
+
+       hlist_nulls_add_head_rcu(&loser_ct->tuplehash[IP_CT_DIR_REPLY].hnnode,
+                                &nf_conntrack_hash[repl_idx]);
+       return NF_ACCEPT;
+}
+
+/**
+ * nf_ct_resolve_clash - attempt to handle clash without packet drop
+ *
+ * @skb: skb that causes the clash
+ * @h: tuplehash of the clashing entry already in table
+ * @hash_reply: hash slot for reply direction
+ *
+ * A conntrack entry can be inserted to the connection tracking table
+ * if there is no existing entry with an identical tuple.
+ *
+ * If there is one, @skb (and the associated, unconfirmed conntrack) has
+ * to be dropped.  In case @skb is retransmitted, next conntrack lookup
+ * will find the already-existing entry.
+ *
+ * The major problem with such packet drop is the extra delay added by
+ * the packet loss -- it will take some time for a retransmit to occur
+ * (or the sender to time out when waiting for a reply).
+ *
+ * This function attempts to handle the situation without packet drop.
+ *
+ * If @skb has no NAT transformation or if the colliding entries are
+ * exactly the same, only the to-be-confirmed conntrack entry is discarded
+ * and @skb is associated with the conntrack entry already in the table.
+ *
+ * Failing that, the new, unconfirmed conntrack is still added to the table
+ * provided that the collision only occurs in the ORIGINAL direction.
+ * The new entry will be added after the existing one in the hash list,
+ * so packets in the ORIGINAL direction will continue to match the existing
+ * entry.  The new entry will also have a fixed timeout so it expires --
+ * due to the collision, it will not see bidirectional traffic.
+ *
+ * Returns NF_DROP if the clash could not be resolved.
+ */
  static __cold noinline int
-nf_ct_resolve_clash(struct net *net, struct sk_buff *skb,
-                   enum ip_conntrack_info ctinfo,
-                   struct nf_conntrack_tuple_hash *h)
+nf_ct_resolve_clash(struct sk_buff *skb, struct nf_conntrack_tuple_hash *h,
+                   u32 reply_hash)
  {
         /* This is the conntrack entry already in hashes that won race. */
         struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(h);
         const struct nf_conntrack_l4proto *l4proto;
-       enum ip_conntrack_info oldinfo;
-       struct nf_conn *loser_ct = nf_ct_get(skb, &oldinfo);
+       enum ip_conntrack_info ctinfo;
+       struct nf_conn *loser_ct;
+       struct net *net;
+       int ret;
+
+       loser_ct = nf_ct_get(skb, &ctinfo);
+       net = nf_ct_net(loser_ct);
  
         l4proto = nf_ct_l4proto_find(nf_ct_protonum(ct));
-       if (l4proto->allow_clash &&
-           !nf_ct_is_dying(ct) &&
-           atomic_inc_not_zero(&ct->ct_general.use)) {
-               if (((ct->status & IPS_NAT_DONE_MASK) == 0) ||
-                   nf_ct_match(ct, loser_ct)) {
-                       nf_ct_acct_merge(ct, ctinfo, loser_ct);
-                       nf_conntrack_put(&loser_ct->ct_general);
-                       nf_ct_set(skb, ct, oldinfo);
-                       return NF_ACCEPT;
-               }
-               nf_ct_put(ct);
-       }
+       if (!l4proto->allow_clash)
+               goto drop;
+
+       ret = __nf_ct_resolve_clash(skb, h);
+       if (ret == NF_ACCEPT)
+               return ret;
+
+       ret = nf_ct_resolve_clash_harder(skb, reply_hash);
+       if (ret == NF_ACCEPT)
+               return ret;
+
+drop:
+       nf_ct_add_to_dying_list(loser_ct);
         NF_CT_STAT_INC(net, drop);
+       NF_CT_STAT_INC(net, insert_failed);
         return NF_DROP;
  }
  
@@ -932,7 +1075,6 @@ __nf_conntrack_confirm(struct sk_buff *skb)
         struct nf_conntrack_tuple_hash *h;
         struct nf_conn *ct;
         struct nf_conn_help *help;
-       struct nf_conn_tstamp *tstamp;
         struct hlist_nulls_node *n;
         enum ip_conntrack_info ctinfo;
         struct net *net;
@@ -989,6 +1131,7 @@ __nf_conntrack_confirm(struct sk_buff *skb)
  
         if (unlikely(nf_ct_is_dying(ct))) {
                 nf_ct_add_to_dying_list(ct);
+               NF_CT_STAT_INC(net, insert_failed);
                 goto dying;
         }
  
@@ -1009,13 +1152,8 @@ __nf_conntrack_confirm(struct sk_buff *skb)
            setting time, otherwise we'd get timer wrap in
            weird delay cases. */
         ct->timeout += nfct_time_stamp;
-       atomic_inc(&ct->ct_general.use);
-       ct->status |= IPS_CONFIRMED;
  
-       /* set conntrack timestamp, if enabled. */
-       tstamp = nf_conn_tstamp_find(ct);
-       if (tstamp)
-               tstamp->start = ktime_get_real_ns();
+       __nf_conntrack_insert_prepare(ct);
  
         /* Since the lookup is lockless, hash insertion must be done after
          * starting the timer and setting the CONFIRMED bit. The RCU barriers
@@ -1035,11 +1173,9 @@ __nf_conntrack_confirm(struct sk_buff *skb)
         return NF_ACCEPT;
  
  out:
-       nf_ct_add_to_dying_list(ct);
-       ret = nf_ct_resolve_clash(net, skb, ctinfo, h);
+       ret = nf_ct_resolve_clash(skb, h, reply_hash);
  dying:
         nf_conntrack_double_unlock(hash, reply_hash);
-       NF_CT_STAT_INC(net, insert_failed);
         local_bh_enable();
         return ret;
  }
diff --git a/net/netfilter/nf_conntrack_proto_udp.c b/net/netfilter/nf_conntrack_proto_udp.c
index 7365b43f8f980edb267835006c8d7388ab450336..760ca242281655590ddf0a20ec25c8b73930e06f 100644
--- a/net/netfilter/nf_conntrack_proto_udp.c
+++ b/net/netfilter/nf_conntrack_proto_udp.c
@@ -81,6 +81,18 @@ static bool udp_error(struct sk_buff *skb,
         return false;
  }
  
+static void nf_conntrack_udp_refresh_unreplied(struct nf_conn *ct,
+                                              struct sk_buff *skb,
+                                              enum ip_conntrack_info ctinfo,
+                                              u32 extra_jiffies)
+{
+       if (unlikely(ctinfo == IP_CT_ESTABLISHED_REPLY &&
+                    ct->status & IPS_NAT_CLASH))
+               nf_ct_kill(ct);
+       else
+               nf_ct_refresh_acct(ct, ctinfo, skb, extra_jiffies);
+}
+
  /* Returns verdict for packet, and may modify conntracktype */
  int nf_conntrack_udp_packet(struct nf_conn *ct,
                             struct sk_buff *skb,
@@ -116,8 +128,8 @@ int nf_conntrack_udp_packet(struct nf_conn *ct,
                 if (!test_and_set_bit(IPS_ASSURED_BIT, &ct->status))
                         nf_conntrack_event_cache(IPCT_ASSURED, ct);
         } else {
-               nf_ct_refresh_acct(ct, ctinfo, skb,
-                                  timeouts[UDP_CT_UNREPLIED]);
+               nf_conntrack_udp_refresh_unreplied(ct, skb, ctinfo,
+                                                  timeouts[UDP_CT_UNREPLIED]);
         }
         return NF_ACCEPT;
  }
@@ -198,8 +210,8 @@ int nf_conntrack_udplite_packet(struct nf_conn *ct,
                 if (!test_and_set_bit(IPS_ASSURED_BIT, &ct->status))
                         nf_conntrack_event_cache(IPCT_ASSURED, ct);
         } else {
-               nf_ct_refresh_acct(ct, ctinfo, skb,
-                                  timeouts[UDP_CT_UNREPLIED]);
+               nf_conntrack_udp_refresh_unreplied(ct, skb, ctinfo,
+                                                  timeouts[UDP_CT_UNREPLIED]);
         }
         return NF_ACCEPT;
  }
diff --git a/net/netfilter/nf_flow_table_offload.c b/net/netfilter/nf_flow_table_offload.c
index 83e1db37c3b041f7872624d05dddd78a8b980404..06f00cdc389100fb197abba44e7fd208d4da23b8 100644
--- a/net/netfilter/nf_flow_table_offload.c
+++ b/net/netfilter/nf_flow_table_offload.c
@@ -847,9 +847,6 @@ static int nf_flow_table_offload_cmd(struct flow_block_offload *bo,
  {
         int err;
  
-       if (!nf_flowtable_hw_offload(flowtable))
-               return 0;
-
         if (!dev->netdev_ops->ndo_setup_tc)
                 return -EOPNOTSUPP;
  
@@ -876,6 +873,9 @@ int nf_flow_table_offload_setup(struct nf_flowtable *flowtable,
         struct flow_block_offload bo;
         int err;
  
+       if (!nf_flowtable_hw_offload(flowtable))
+               return 0;
+
         err = nf_flow_table_offload_cmd(&bo, flowtable, dev, cmd, &extack);
         if (err < 0)
                 return err;
diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c
index f0cb1e13af5081216d799057ed5da4a314e0c4c7..4fc0c924ed5da0cc7adb99d4ba1c0ec8df636f1c 100644
--- a/net/netfilter/nft_set_pipapo.c
+++ b/net/netfilter/nft_set_pipapo.c
@@ -203,7 +203,7 @@
   * ::
   *
   *       rule indices in last field:    0    1
- *       map to elements:             0x42  0x66
+ *       map to elements:             0x66  0x42
   *
   *
   * Matching
@@ -298,7 +298,7 @@
   * ::
   *
   *       rule indices in last field:    0    1
- *       map to elements:             0x42  0x66
+ *       map to elements:             0x66  0x42
   *
   *      the matching element is at 0x42.
   *
@@ -503,7 +503,7 @@ static int pipapo_refill(unsigned long *map, int len, int rules,
                                 return -1;
                         }
  
-                       if (unlikely(match_only)) {
+                       if (match_only) {
                                 bitmap_clear(map, i, 1);
                                 return i;
                         }
@@ -1766,11 +1766,13 @@ static bool pipapo_match_field(struct nft_pipapo_field *f,
  static void nft_pipapo_remove(const struct net *net, const struct nft_set *set,
                               const struct nft_set_elem *elem)
  {
-       const u8 *data = (const u8 *)elem->key.val.data;
         struct nft_pipapo *priv = nft_set_priv(set);
         struct nft_pipapo_match *m = priv->clone;
+       struct nft_pipapo_elem *e = elem->priv;
         int rules_f0, first_rule = 0;
-       struct nft_pipapo_elem *e;
+       const u8 *data;
+
+       data = (const u8 *)nft_set_ext_key(&e->ext);
  
         e = pipapo_get(net, set, data, 0);
         if (IS_ERR(e))
diff --git a/net/netfilter/xt_hashlimit.c b/net/netfilter/xt_hashlimit.c
index bccd47cd7190810af06dea672842be7f58f72a4e..8c835ad637290efc7505bcfd64507c731a94e364 100644
--- a/net/netfilter/xt_hashlimit.c
+++ b/net/netfilter/xt_hashlimit.c
@@ -36,6 +36,7 @@
  #include <linux/netfilter_ipv6/ip6_tables.h>
  #include <linux/mutex.h>
  #include <linux/kernel.h>
+#include <linux/refcount.h>
  #include <uapi/linux/netfilter/xt_hashlimit.h>
  
  #define XT_HASHLIMIT_ALL (XT_HASHLIMIT_HASH_DIP | XT_HASHLIMIT_HASH_DPT | \
@@ -114,7 +115,7 @@ struct dsthash_ent {
  
  struct xt_hashlimit_htable {
         struct hlist_node node;         /* global list of all htables */
-       int use;
+       refcount_t use;
         u_int8_t family;
         bool rnd_initialized;
  
@@ -315,7 +316,7 @@ static int htable_create(struct net *net, struct hashlimit_cfg3 *cfg,
         for (i = 0; i < hinfo->cfg.size; i++)
                 INIT_HLIST_HEAD(&hinfo->hash[i]);
  
-       hinfo->use = 1;
+       refcount_set(&hinfo->use, 1);
         hinfo->count = 0;
         hinfo->family = family;
         hinfo->rnd_initialized = false;
@@ -401,15 +402,6 @@ static void htable_remove_proc_entry(struct xt_hashlimit_htable *hinfo)
                 remove_proc_entry(hinfo->name, parent);
  }
  
-static void htable_destroy(struct xt_hashlimit_htable *hinfo)
-{
-       cancel_delayed_work_sync(&hinfo->gc_work);
-       htable_remove_proc_entry(hinfo);
-       htable_selective_cleanup(hinfo, true);
-       kfree(hinfo->name);
-       vfree(hinfo);
-}
-
  static struct xt_hashlimit_htable *htable_find_get(struct net *net,
                                                    const char *name,
                                                    u_int8_t family)
@@ -420,7 +412,7 @@ static struct xt_hashlimit_htable *htable_find_get(struct net *net,
         hlist_for_each_entry(hinfo, &hashlimit_net->htables, node) {
                 if (!strcmp(name, hinfo->name) &&
                     hinfo->family == family) {
-                       hinfo->use++;
+                       refcount_inc(&hinfo->use);
                         return hinfo;
                 }
         }
@@ -429,12 +421,16 @@ static struct xt_hashlimit_htable *htable_find_get(struct net *net,
  
  static void htable_put(struct xt_hashlimit_htable *hinfo)
  {
-       mutex_lock(&hashlimit_mutex);
-       if (--hinfo->use == 0) {
+       if (refcount_dec_and_mutex_lock(&hinfo->use, &hashlimit_mutex)) {
                 hlist_del(&hinfo->node);
-               htable_destroy(hinfo);
+               htable_remove_proc_entry(hinfo);
+               mutex_unlock(&hashlimit_mutex);
+
+               cancel_delayed_work_sync(&hinfo->gc_work);
+               htable_selective_cleanup(hinfo, true);
+               kfree(hinfo->name);
+               vfree(hinfo);
         }
-       mutex_unlock(&hashlimit_mutex);
  }
  
  /* The algorithm used is the Simple Token Bucket Filter (TBF)
@@ -837,6 +833,8 @@ hashlimit_mt(const struct sk_buff *skb, struct xt_action_param *par)
         return hashlimit_mt_common(skb, par, hinfo, &info->cfg, 3);
  }
  
+#define HASHLIMIT_MAX_SIZE 1048576
+
  static int hashlimit_mt_check_common(const struct xt_mtchk_param *par,
                                      struct xt_hashlimit_htable **hinfo,
                                      struct hashlimit_cfg3 *cfg,
@@ -847,6 +845,14 @@ static int hashlimit_mt_check_common(const struct xt_mtchk_param *par,
  
         if (cfg->gc_interval == 0 || cfg->expire == 0)
                 return -EINVAL;
+       if (cfg->size > HASHLIMIT_MAX_SIZE) {
+               cfg->size = HASHLIMIT_MAX_SIZE;
+               pr_info_ratelimited("size too large, truncated to %u\n", cfg->size);
+       }
+       if (cfg->max > HASHLIMIT_MAX_SIZE) {
+               cfg->max = HASHLIMIT_MAX_SIZE;
+               pr_info_ratelimited("max too large, truncated to %u\n", cfg->max);
+       }
         if (par->family == NFPROTO_IPV4) {
                 if (cfg->srcmask > 32 || cfg->dstmask > 32)
                         return -EINVAL;
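
Note on the htable_put() hunk above: the change relies on a "decrement, and take the mutex only if this was the last reference" primitive. The following is a minimal user-space sketch of that idea, assuming C11 atomics and pthreads. The names `struct ref`, `ref_dec_not_one()` and `ref_dec_and_mutex_lock()` are hypothetical stand-ins, not the kernel's implementation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the kernel's refcount_t. */
struct ref {
	atomic_int count;
};

/* Decrement unless the count is 1; returns true if it decremented. */
static bool ref_dec_not_one(struct ref *r)
{
	int old = atomic_load(&r->count);

	while (old > 1) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&r->count, &old, old - 1))
			return true;
	}
	return false;
}

/*
 * Drop one reference. Returns true with *lock held only when the count
 * reached zero; otherwise returns false without holding the lock.
 */
static bool ref_dec_and_mutex_lock(struct ref *r, pthread_mutex_t *lock)
{
	if (ref_dec_not_one(r))
		return false;	/* fast path: not the last reference */

	pthread_mutex_lock(lock);
	if (atomic_fetch_sub(&r->count, 1) != 1) {
		/* Someone took a new reference before we got the lock. */
		pthread_mutex_unlock(lock);
		return false;
	}
	return true;
}
```

In this sketch the caller that sees `true` unlinks the object while holding the lock, releases the lock, and only then performs the slow teardown, which is the ordering the hunk introduces.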
diff --git a/net/netlabel/netlabel_domainhash.c b/net/netlabel/netlabel_domainhash.c

index f5d34da0646eda1647a39300252ff1249c8b01eb..a1f2320ecc16d1450525123fe6f217ddb8203b76 100644
--- a/net/netlabel/netlabel_domainhash.c
+++ b/net/netlabel/netlabel_domainhash.c
@@ -143,7 +143,8 @@ static struct netlbl_dom_map *netlbl_domhsh_search(const char *domain,
         if (domain != NULL) {
                 bkt = netlbl_domhsh_hash(domain);
                 bkt_list = &netlbl_domhsh_rcu_deref(netlbl_domhsh)->tbl[bkt];
-               list_for_each_entry_rcu(iter, bkt_list, list)
+               list_for_each_entry_rcu(iter, bkt_list, list,
+                                       lockdep_is_held(&netlbl_domhsh_lock))
                         if (iter->valid &&
                             netlbl_family_match(iter->family, family) &&
                             strcmp(iter->domain, domain) == 0)
diff --git a/net/netlabel/netlabel_unlabeled.c b/net/netlabel/netlabel_unlabeled.c

index d2e4ab8d1cb1008da0a9577667d1206d4c0bb309..77bb1bb22c3bfced3a6302d4241f5a0cdbd5d069 100644
--- a/net/netlabel/netlabel_unlabeled.c
+++ b/net/netlabel/netlabel_unlabeled.c
@@ -207,7 +207,8 @@ static struct netlbl_unlhsh_iface *netlbl_unlhsh_search_iface(int ifindex)
  
         bkt = netlbl_unlhsh_hash(ifindex);
         bkt_list = &netlbl_unlhsh_rcu_deref(netlbl_unlhsh)->tbl[bkt];
-       list_for_each_entry_rcu(iter, bkt_list, list)
+       list_for_each_entry_rcu(iter, bkt_list, list,
+                               lockdep_is_held(&netlbl_unlhsh_lock))
                 if (iter->valid && iter->ifindex == ifindex)
                         return iter;
  
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c

index 4e31721e729360c8bf555186ab6d4aa67cb00280..edf3e285e242877d78b044bac89b4a41804b56cb 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1014,7 +1014,8 @@ static int netlink_bind(struct socket *sock, struct sockaddr *addr,
         if (nlk->netlink_bind && groups) {
                 int group;
  
-               for (group = 0; group < nlk->ngroups; group++) {
+               /* nl_groups is a u32, so cap the maximum groups we can bind */
+               for (group = 0; group < BITS_PER_TYPE(u32); group++) {
                         if (!test_bit(group, &groups))
                                 continue;
                         err = nlk->netlink_bind(net, group + 1);
@@ -1033,7 +1034,7 @@ static int netlink_bind(struct socket *sock, struct sockaddr *addr,
                         netlink_insert(sk, nladdr->nl_pid) :
                         netlink_autobind(sock);
                 if (err) {
-                       netlink_undo_bind(nlk->ngroups, groups, sk);
+                       netlink_undo_bind(BITS_PER_TYPE(u32), groups, sk);
                         goto unlock;
                 }
         }
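
Note on the netlink_bind() hunks above: the loop bound is now the bit width of the 32-bit groups mask. A minimal sketch of iterating such a mask follows; `groups_from_mask()` is a hypothetical helper, and the `BITS_PER_TYPE()` definition here is a local stand-in for the kernel macro.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Local stand-in for the kernel's BITS_PER_TYPE() macro. */
#define BITS_PER_TYPE(type) (sizeof(type) * CHAR_BIT)

/*
 * Collect the 1-based group numbers set in a 32-bit mask.
 * out must have room for BITS_PER_TYPE(uint32_t) entries.
 */
static unsigned int groups_from_mask(uint32_t mask, unsigned int *out)
{
	unsigned int group, n = 0;

	for (group = 0; group < BITS_PER_TYPE(uint32_t); group++) {
		if (!(mask & (UINT32_C(1) << group)))
			continue;
		out[n++] = group + 1;	/* group numbers start at 1 */
	}
	return n;
}
```

Bounding the loop by the mask width means no bit index beyond the mask is ever tested, whatever the configured group count is.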
diff --git a/net/netlink/genetlink.c b/net/netlink/genetlink.c

index 0522b2b1fd958ce17de51163df8505d2b754d988..9f357aa22b9452d17f5c391a159d263c4f3df552 100644
--- a/net/netlink/genetlink.c
+++ b/net/netlink/genetlink.c
@@ -497,8 +497,9 @@ genl_family_rcv_msg_attrs_parse(const struct genl_family *family,
  
         err = __nlmsg_parse(nlh, hdrlen, attrbuf, family->maxattr,
                             family->policy, validate, extack);
-       if (err && parallel) {
-               kfree(attrbuf);
+       if (err) {
+               if (parallel)
+                       kfree(attrbuf);
                 return ERR_PTR(err);
         }
         return attrbuf;
diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c

index 659c2a790fe7cf919d3b5b6af3bd41ffde1211c7..c047afd121160f4fd66d4fcd135eb15dd2dbc6e7 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -179,7 +179,8 @@ struct vport *ovs_lookup_vport(const struct datapath *dp, u16 port_no)
         struct hlist_head *head;
  
         head = vport_hash_bucket(dp, port_no);
-       hlist_for_each_entry_rcu(vport, head, dp_hash_node) {
+       hlist_for_each_entry_rcu(vport, head, dp_hash_node,
+                               lockdep_ovsl_is_held()) {
                 if (vport->port_no == port_no)
                         return vport;
         }
@@ -2042,7 +2043,8 @@ static unsigned int ovs_get_max_headroom(struct datapath *dp)
         int i;
  
         for (i = 0; i < DP_VPORT_HASH_BUCKETS; i++) {
-               hlist_for_each_entry_rcu(vport, &dp->ports[i], dp_hash_node) {
+               hlist_for_each_entry_rcu(vport, &dp->ports[i], dp_hash_node,
+                                       lockdep_ovsl_is_held()) {
                         dev = vport->dev;
                         dev_headroom = netdev_get_fwd_headroom(dev);
                         if (dev_headroom > max_headroom)
@@ -2061,7 +2063,8 @@ static void ovs_update_headroom(struct datapath *dp, unsigned int new_headroom)
  
         dp->max_headroom = new_headroom;
         for (i = 0; i < DP_VPORT_HASH_BUCKETS; i++)
-               hlist_for_each_entry_rcu(vport, &dp->ports[i], dp_hash_node)
+               hlist_for_each_entry_rcu(vport, &dp->ports[i], dp_hash_node,
+                                       lockdep_ovsl_is_held())
                         netdev_set_rx_headroom(vport->dev, new_headroom);
  }
  
diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c

index 7da4230627f51489707fc199c4333702c344282e..288122eec7c838abcadcd0c106a719675c84471e 100644
--- a/net/openvswitch/flow_netlink.c
+++ b/net/openvswitch/flow_netlink.c
@@ -2708,10 +2708,6 @@ static int validate_set(const struct nlattr *a,
                 return -EINVAL;
  
         switch (key_type) {
-       const struct ovs_key_ipv4 *ipv4_key;
-       const struct ovs_key_ipv6 *ipv6_key;
-       int err;
-
         case OVS_KEY_ATTR_PRIORITY:
         case OVS_KEY_ATTR_SKB_MARK:
         case OVS_KEY_ATTR_CT_MARK:
@@ -2723,7 +2719,9 @@ static int validate_set(const struct nlattr *a,
                         return -EINVAL;
                 break;
  
-       case OVS_KEY_ATTR_TUNNEL:
+       case OVS_KEY_ATTR_TUNNEL: {
+               int err;
+
                 if (masked)
                         return -EINVAL; /* Masked tunnel set not supported. */
  
@@ -2732,8 +2730,10 @@ static int validate_set(const struct nlattr *a,
                 if (err)
                         return err;
                 break;
+       }
+       case OVS_KEY_ATTR_IPV4: {
+               const struct ovs_key_ipv4 *ipv4_key;
  
-       case OVS_KEY_ATTR_IPV4:
                 if (eth_type != htons(ETH_P_IP))
                         return -EINVAL;
  
@@ -2753,8 +2753,10 @@ static int validate_set(const struct nlattr *a,
                                 return -EINVAL;
                 }
                 break;
+       }
+       case OVS_KEY_ATTR_IPV6: {
+               const struct ovs_key_ipv6 *ipv6_key;
  
-       case OVS_KEY_ATTR_IPV6:
                 if (eth_type != htons(ETH_P_IPV6))
                         return -EINVAL;
  
@@ -2781,7 +2783,7 @@ static int validate_set(const struct nlattr *a,
                         return -EINVAL;
  
                 break;
-
+       }
         case OVS_KEY_ATTR_TCP:
                 if ((eth_type != htons(ETH_P_IP) &&
                      eth_type != htons(ETH_P_IPV6)) ||
diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c

index 5904e93e57656d80632b00e288292706e8597ef9..fd8a01ca7a2d53d72766d5234e1f133caf4be68e 100644
--- a/net/openvswitch/flow_table.c
+++ b/net/openvswitch/flow_table.c
@@ -585,7 +585,8 @@ static struct sw_flow *masked_flow_lookup(struct table_instance *ti,
         head = find_bucket(ti, hash);
         (*n_mask_hit)++;
  
-       hlist_for_each_entry_rcu(flow, head, flow_table.node[ti->node_ver]) {
+       hlist_for_each_entry_rcu(flow, head, flow_table.node[ti->node_ver],
+                               lockdep_ovsl_is_held()) {
                 if (flow->mask == mask && flow->flow_table.hash == hash &&
                     flow_cmp_masked_key(flow, &masked_key, &mask->range))
                         return flow;
@@ -769,7 +770,8 @@ struct sw_flow *ovs_flow_tbl_lookup_ufid(struct flow_table *tbl,
  
         hash = ufid_hash(ufid);
         head = find_bucket(ti, hash);
-       hlist_for_each_entry_rcu(flow, head, ufid_table.node[ti->node_ver]) {
+       hlist_for_each_entry_rcu(flow, head, ufid_table.node[ti->node_ver],
+                               lockdep_ovsl_is_held()) {
                 if (flow->ufid_table.hash == hash &&
                     ovs_flow_cmp_ufid(flow, ufid))
                         return flow;
diff --git a/net/openvswitch/meter.c b/net/openvswitch/meter.c

index 3323b79ff548dfbf7842b3ac3b825b7d0f7dcfde..5010d1ddd4bdb1f8b5fd34dbc97c7b3803552563 100644
--- a/net/openvswitch/meter.c
+++ b/net/openvswitch/meter.c
@@ -61,7 +61,8 @@ static struct dp_meter *lookup_meter(const struct datapath *dp,
         struct hlist_head *head;
  
         head = meter_hash_bucket(dp, meter_id);
-       hlist_for_each_entry_rcu(meter, head, dp_hash_node) {
+       hlist_for_each_entry_rcu(meter, head, dp_hash_node,
+                               lockdep_ovsl_is_held()) {
                 if (meter->id == meter_id)
                         return meter;
         }
diff --git a/net/openvswitch/vport.c b/net/openvswitch/vport.c

index 5da9392b03d624c2ae56b67366d3922e0000b079..47febb4504f098d2be91a1072fdd1daac71a9462 100644
--- a/net/openvswitch/vport.c
+++ b/net/openvswitch/vport.c
@@ -96,7 +96,8 @@ struct vport *ovs_vport_locate(const struct net *net, const char *name)
         struct hlist_head *bucket = hash_bucket(net, name);
         struct vport *vport;
  
-       hlist_for_each_entry_rcu(vport, bucket, hash_node)
+       hlist_for_each_entry_rcu(vport, bucket, hash_node,
+                               lockdep_ovsl_is_held())
                 if (!strcmp(name, ovs_vport_name(vport)) &&
                     net_eq(ovs_dp_get_net(vport->dp), net))
                         return vport;
diff --git a/net/rds/rdma.c b/net/rds/rdma.c

index 3341eee87bf9b3c8bad89b88ca0e8fcd74186539..585e6b3b69ce4c32ec9404ee5ad88065b0979f75 100644
--- a/net/rds/rdma.c
+++ b/net/rds/rdma.c
@@ -162,10 +162,9 @@ static int rds_pin_pages(unsigned long user_addr, unsigned int nr_pages,
         if (write)
                 gup_flags |= FOLL_WRITE;
  
-       ret = get_user_pages_fast(user_addr, nr_pages, gup_flags, pages);
+       ret = pin_user_pages_fast(user_addr, nr_pages, gup_flags, pages);
         if (ret >= 0 && ret < nr_pages) {
-               while (ret--)
-                       put_page(pages[ret]);
+               unpin_user_pages(pages, ret);
                 ret = -EFAULT;
         }
  
@@ -300,8 +299,7 @@ static int __rds_rdma_map(struct rds_sock *rs, struct rds_get_mr_args *args,
                  * to release anything.
                  */
                 if (!need_odp) {
-                       for (i = 0 ; i < nents; i++)
-                               put_page(sg_page(&sg[i]));
+                       unpin_user_pages(pages, nr_pages);
                         kfree(sg);
                 }
                 ret = PTR_ERR(trans_private);
@@ -325,7 +323,12 @@ static int __rds_rdma_map(struct rds_sock *rs, struct rds_get_mr_args *args,
         if (cookie_ret)
                 *cookie_ret = cookie;
  
-       if (args->cookie_addr && put_user(cookie, (u64 __user *)(unsigned long) args->cookie_addr)) {
+       if (args->cookie_addr &&
+           put_user(cookie, (u64 __user *)(unsigned long)args->cookie_addr)) {
+               if (!need_odp) {
+                       unpin_user_pages(pages, nr_pages);
+                       kfree(sg);
+               }
                 ret = -EFAULT;
                 goto out;
         }
@@ -496,9 +499,7 @@ void rds_rdma_free_op(struct rm_rdma_op *ro)
                          * is the case for a RDMA_READ which copies from remote
                          * to local memory
                          */
-                       if (!ro->op_write)
-                               set_page_dirty(page);
-                       put_page(page);
+                       unpin_user_pages_dirty_lock(&page, 1, !ro->op_write);
                 }
         }
  
@@ -515,8 +516,7 @@ void rds_atomic_free_op(struct rm_atomic_op *ao)
         /* Mark page dirty if it was possibly modified, which
          * is the case for a RDMA_READ which copies from remote
          * to local memory */
-       set_page_dirty(page);
-       put_page(page);
+       unpin_user_pages_dirty_lock(&page, 1, true);
  
         kfree(ao->op_notifier);
         ao->op_notifier = NULL;
@@ -944,7 +944,7 @@ int rds_cmsg_atomic(struct rds_sock *rs, struct rds_message *rm,
         return ret;
  err:
         if (page)
-               put_page(page);
+               unpin_user_page(page);
         rm->atomic.op_active = 0;
         kfree(rm->atomic.op_notifier);
  
diff --git a/net/sched/act_api.c b/net/sched/act_api.c

index 90a31b15585f61a5a3c406eb3e3985679164e044..8c466a712cda047001973652603c4f0e0c4e2b4d 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -186,6 +186,7 @@ static size_t tcf_action_shared_attrs_size(const struct tc_action *act)
                 + nla_total_size(IFNAMSIZ) /* TCA_ACT_KIND */
                 + cookie_len /* TCA_ACT_COOKIE */
                 + nla_total_size(0) /* TCA_ACT_STATS nested */
+               + nla_total_size(sizeof(struct nla_bitfield32)) /* TCA_ACT_FLAGS */
                 /* TCA_STATS_BASIC */
                 + nla_total_size_64bit(sizeof(struct gnet_stats_basic))
                 /* TCA_STATS_PKT64 */
diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c

index f9c0d1e8d380152e4900e78e98c3edfca5deda29..d32d4233d33748c302600a229d207dcab3644266 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -305,6 +305,7 @@ static int fl_classify(struct sk_buff *skb, const struct tcf_proto *tp,
         struct cls_fl_filter *f;
  
         list_for_each_entry_rcu(mask, &head->masks, list) {
+               flow_dissector_init_keys(&skb_key.control, &skb_key.basic);
                 fl_clear_masked_range(&skb_key, mask);
  
                 skb_flow_dissect_meta(skb, &mask->dissector, &skb_key);
@@ -691,6 +692,7 @@ static const struct nla_policy fl_policy[TCA_FLOWER_MAX + 1] = {
                                             .len = 128 / BITS_PER_BYTE },
         [TCA_FLOWER_KEY_CT_LABELS_MASK] = { .type = NLA_BINARY,
                                             .len = 128 / BITS_PER_BYTE },
+       [TCA_FLOWER_FLAGS]              = { .type = NLA_U32 },
  };
  
  static const struct nla_policy
diff --git a/net/sched/cls_matchall.c b/net/sched/cls_matchall.c

index 039cc86974f4583472583c0260031ace609501b2..610a0b728161a32385465aff9e35903c302341a7 100644
--- a/net/sched/cls_matchall.c
+++ b/net/sched/cls_matchall.c
@@ -157,6 +157,7 @@ static void *mall_get(struct tcf_proto *tp, u32 handle)
  static const struct nla_policy mall_policy[TCA_MATCHALL_MAX + 1] = {
         [TCA_MATCHALL_UNSPEC]           = { .type = NLA_UNSPEC },
         [TCA_MATCHALL_CLASSID]          = { .type = NLA_U32 },
+       [TCA_MATCHALL_FLAGS]            = { .type = NLA_U32 },
  };
  
  static int mall_set_parms(struct net *net, struct tcf_proto *tp,
diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c

index 748e3b19ec1dc2c7d8ed1dbc83193b7f077c7eb2..6a16af4b1ef61810fed8a04169109f311b070575 100644
--- a/net/sctp/sm_statefuns.c
+++ b/net/sctp/sm_statefuns.c
@@ -170,6 +170,16 @@ static inline bool sctp_chunk_length_valid(struct sctp_chunk *chunk,
         return true;
  }
  
+/* Check for format error in an ABORT chunk */
+static inline bool sctp_err_chunk_valid(struct sctp_chunk *chunk)
+{
+       struct sctp_errhdr *err;
+
+       sctp_walk_errors(err, chunk->chunk_hdr);
+
+       return (void *)err == (void *)chunk->chunk_end;
+}
+
  /**********************************************************
   * These are the state functions for handling chunk events.
   **********************************************************/
@@ -2255,6 +2265,9 @@ enum sctp_disposition sctp_sf_shutdown_pending_abort(
                     sctp_bind_addr_state(&asoc->base.bind_addr, &chunk->dest))
                 return sctp_sf_discard_chunk(net, ep, asoc, type, arg, commands);
  
+       if (!sctp_err_chunk_valid(chunk))
+               return sctp_sf_pdiscard(net, ep, asoc, type, arg, commands);
+
         return __sctp_sf_do_9_1_abort(net, ep, asoc, type, arg, commands);
  }
  
@@ -2298,6 +2311,9 @@ enum sctp_disposition sctp_sf_shutdown_sent_abort(
                     sctp_bind_addr_state(&asoc->base.bind_addr, &chunk->dest))
                 return sctp_sf_discard_chunk(net, ep, asoc, type, arg, commands);
  
+       if (!sctp_err_chunk_valid(chunk))
+               return sctp_sf_pdiscard(net, ep, asoc, type, arg, commands);
+
         /* Stop the T2-shutdown timer. */
         sctp_add_cmd_sf(commands, SCTP_CMD_TIMER_STOP,
                         SCTP_TO(SCTP_EVENT_TIMEOUT_T2_SHUTDOWN));
@@ -2565,6 +2581,9 @@ enum sctp_disposition sctp_sf_do_9_1_abort(
                     sctp_bind_addr_state(&asoc->base.bind_addr, &chunk->dest))
                 return sctp_sf_discard_chunk(net, ep, asoc, type, arg, commands);
  
+       if (!sctp_err_chunk_valid(chunk))
+               return sctp_sf_pdiscard(net, ep, asoc, type, arg, commands);
+
         return __sctp_sf_do_9_1_abort(net, ep, asoc, type, arg, commands);
  }
  
@@ -2582,16 +2601,8 @@ static enum sctp_disposition __sctp_sf_do_9_1_abort(
  
         /* See if we have an error cause code in the chunk.  */
         len = ntohs(chunk->chunk_hdr->length);
-       if (len >= sizeof(struct sctp_chunkhdr) + sizeof(struct sctp_errhdr)) {
-               struct sctp_errhdr *err;
-
-               sctp_walk_errors(err, chunk->chunk_hdr);
-               if ((void *)err != (void *)chunk->chunk_end)
-                       return sctp_sf_pdiscard(net, ep, asoc, type, arg,
-                                               commands);
-
+       if (len >= sizeof(struct sctp_chunkhdr) + sizeof(struct sctp_errhdr))
                 error = ((struct sctp_errhdr *)chunk->skb->data)->cause;
-       }
  
         sctp_add_cmd_sf(commands, SCTP_CMD_SET_SK_ERR, SCTP_ERROR(ECONNRESET));
         /* ASSOC_FAILED will DELETE_TCB. */
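
Note on the sctp_err_chunk_valid() helper above: it accepts an ABORT chunk only when walking its error causes ends exactly at the end of the chunk. A simplified sketch of that check on a raw byte buffer follows. It assumes a 4-byte record header (2-byte cause code, 2-byte big-endian length that includes the header) and 4-byte padding; `causes_well_formed()` is a hypothetical function, not the kernel's `sctp_walk_errors()` macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TLV_HDR_LEN 4u	/* 2-byte cause code + 2-byte length */

/* Read a big-endian 16-bit value. */
static uint16_t get_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Walk the cause records in buf[0..len). Each record's length field
 * covers its own header; records are padded to a 4-byte boundary.
 * Returns true only if the walk stops exactly at the end of the buffer.
 */
static bool causes_well_formed(const uint8_t *buf, size_t len)
{
	size_t off = 0;

	while (len - off >= TLV_HDR_LEN) {
		size_t rec = get_be16(buf + off + 2);

		if (rec < TLV_HDR_LEN || rec > len - off)
			break;	/* bogus length: stop short of the end */
		off += (rec + 3) & ~(size_t)3;
		if (off > len)
			return false;	/* padding runs past the buffer */
	}
	return off == len;
}
```

An empty cause list passes the check, while a record with an oversized length or any trailing bytes leaves the walk short of the end and fails it.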
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c

index cee5bf4a9bb95a517861ddebc0c4deaf603da245..6fd44bdb0fc3ebb8d9b32e5aebbb494267d71b27 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -470,6 +470,8 @@ static void smc_switch_to_fallback(struct smc_sock *smc)
         if (smc->sk.sk_socket && smc->sk.sk_socket->file) {
                 smc->clcsock->file = smc->sk.sk_socket->file;
                 smc->clcsock->file->private_data = smc->clcsock;
+               smc->clcsock->wq.fasync_list =
+                       smc->sk.sk_socket->wq.fasync_list;
         }
  }
  
@@ -510,15 +512,18 @@ static int smc_connect_decline_fallback(struct smc_sock *smc, int reason_code)
  static int smc_connect_abort(struct smc_sock *smc, int reason_code,
                              int local_contact)
  {
+       bool is_smcd = smc->conn.lgr->is_smcd;
+
         if (local_contact == SMC_FIRST_CONTACT)
-               smc_lgr_forget(smc->conn.lgr);
-       if (smc->conn.lgr->is_smcd)
+               smc_lgr_cleanup_early(&smc->conn);
+       else
+               smc_conn_free(&smc->conn);
+       if (is_smcd)
                 /* there is only one lgr role for SMC-D; use server lock */
                 mutex_unlock(&smc_server_lgr_pending);
         else
                 mutex_unlock(&smc_client_lgr_pending);
  
-       smc_conn_free(&smc->conn);
         smc->connect_nonblock = 0;
         return reason_code;
  }
@@ -1089,7 +1094,6 @@ static void smc_listen_out_err(struct smc_sock *new_smc)
         if (newsmcsk->sk_state == SMC_INIT)
                 sock_put(&new_smc->sk); /* passive closing */
         newsmcsk->sk_state = SMC_CLOSED;
-       smc_conn_free(&new_smc->conn);
  
         smc_listen_out(new_smc);
  }
@@ -1100,12 +1104,13 @@ static void smc_listen_decline(struct smc_sock *new_smc, int reason_code,
  {
         /* RDMA setup failed, switch back to TCP */
         if (local_contact == SMC_FIRST_CONTACT)
-               smc_lgr_forget(new_smc->conn.lgr);
+               smc_lgr_cleanup_early(&new_smc->conn);
+       else
+               smc_conn_free(&new_smc->conn);
         if (reason_code < 0) { /* error, no fallback possible */
                 smc_listen_out_err(new_smc);
                 return;
         }
-       smc_conn_free(&new_smc->conn);
         smc_switch_to_fallback(new_smc);
         new_smc->fallback_rsn = reason_code;
         if (reason_code && reason_code != SMC_CLC_DECL_PEERDECL) {
@@ -1168,16 +1173,18 @@ static int smc_listen_ism_init(struct smc_sock *new_smc,
                             new_smc->conn.lgr->vlan_id,
                             new_smc->conn.lgr->smcd)) {
                 if (ini->cln_first_contact == SMC_FIRST_CONTACT)
-                       smc_lgr_forget(new_smc->conn.lgr);
-               smc_conn_free(&new_smc->conn);
+                       smc_lgr_cleanup_early(&new_smc->conn);
+               else
+                       smc_conn_free(&new_smc->conn);
                 return SMC_CLC_DECL_SMCDNOTALK;
         }
  
         /* Create send and receive buffers */
         if (smc_buf_create(new_smc, true)) {
                 if (ini->cln_first_contact == SMC_FIRST_CONTACT)
-                       smc_lgr_forget(new_smc->conn.lgr);
-               smc_conn_free(&new_smc->conn);
+                       smc_lgr_cleanup_early(&new_smc->conn);
+               else
+                       smc_conn_free(&new_smc->conn);
                 return SMC_CLC_DECL_MEM;
         }
  
diff --git a/net/smc/smc_clc.c b/net/smc/smc_clc.c

index 0879f7bed96752c6832d1777a31ff633ae4b1aec..86cccc24e52e2d047c5617af4421902e8dd6c747 100644
--- a/net/smc/smc_clc.c
+++ b/net/smc/smc_clc.c
@@ -372,7 +372,9 @@ int smc_clc_send_decline(struct smc_sock *smc, u32 peer_diag_info)
         dclc.hdr.length = htons(sizeof(struct smc_clc_msg_decline));
         dclc.hdr.version = SMC_CLC_V1;
         dclc.hdr.flag = (peer_diag_info == SMC_CLC_DECL_SYNCERR) ? 1 : 0;
-       memcpy(dclc.id_for_peer, local_systemid, sizeof(local_systemid));
+       if (smc->conn.lgr && !smc->conn.lgr->is_smcd)
+               memcpy(dclc.id_for_peer, local_systemid,
+                      sizeof(local_systemid));
         dclc.peer_diagnosis = htonl(peer_diag_info);
         memcpy(dclc.trl.eyecatcher, SMC_EYECATCHER, sizeof(SMC_EYECATCHER));
  
diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c

index 2249de5379ee900301cc9fc34cf84b51085399f5..5b085efa3bce49e4cc10185c90d61ef1ece7c5c0 100644
--- a/net/smc/smc_core.c
+++ b/net/smc/smc_core.c
@@ -162,6 +162,18 @@ static void smc_lgr_unregister_conn(struct smc_connection *conn)
         conn->lgr = NULL;
  }
  
+void smc_lgr_cleanup_early(struct smc_connection *conn)
+{
+       struct smc_link_group *lgr = conn->lgr;
+
+       if (!lgr)
+               return;
+
+       smc_conn_free(conn);
+       smc_lgr_forget(lgr);
+       smc_lgr_schedule_free_work_fast(lgr);
+}
+
  /* Send delete link, either as client to request the initiation
   * of the DELETE LINK sequence from server; or as server to
   * initiate the delete processing. See smc_llc_rx_delete_link().
diff --git a/net/smc/smc_core.h b/net/smc/smc_core.h

index c472e12951d1abbf52c8ab44f041b3042605b5e6..234ae25f0025f1ea641ce20868d0dd9e9bf8e814 100644
--- a/net/smc/smc_core.h
+++ b/net/smc/smc_core.h
@@ -296,6 +296,7 @@ struct smc_clc_msg_accept_confirm;
  struct smc_clc_msg_local;
  
  void smc_lgr_forget(struct smc_link_group *lgr);
+void smc_lgr_cleanup_early(struct smc_connection *conn);
  void smc_lgr_terminate(struct smc_link_group *lgr, bool soft);
  void smc_port_terminate(struct smc_ib_device *smcibdev, u8 ibport);
  void smc_smcd_terminate(struct smcd_dev *dev, u64 peer_gid,
@@ -316,7 +317,6 @@ int smc_vlan_by_tcpsk(struct socket *clcsock, struct smc_init_info *ini);
  
  void smc_conn_free(struct smc_connection *conn);
  int smc_conn_create(struct smc_sock *smc, struct smc_init_info *ini);
-void smcd_conn_free(struct smc_connection *conn);
  void smc_lgr_schedule_free_work_fast(struct smc_link_group *lgr);
  int smc_core_init(void);
  void smc_core_exit(void);
diff --git a/net/smc/smc_diag.c b/net/smc/smc_diag.c

index f38727ecf8b220b398f3ef622df1eccd03da1c56..e1f64f4ba2361432a58ce1d19ec0ecd41462971e 100644
--- a/net/smc/smc_diag.c
+++ b/net/smc/smc_diag.c
@@ -39,16 +39,15 @@ static void smc_diag_msg_common_fill(struct smc_diag_msg *r, struct sock *sk)
  {
         struct smc_sock *smc = smc_sk(sk);
  
+       memset(r, 0, sizeof(*r));
         r->diag_family = sk->sk_family;
+       sock_diag_save_cookie(sk, r->id.idiag_cookie);
         if (!smc->clcsock)
                 return;
         r->id.idiag_sport = htons(smc->clcsock->sk->sk_num);
         r->id.idiag_dport = smc->clcsock->sk->sk_dport;
         r->id.idiag_if = smc->clcsock->sk->sk_bound_dev_if;
-       sock_diag_save_cookie(sk, r->id.idiag_cookie);
         if (sk->sk_protocol == SMCPROTO_SMC) {
-               memset(&r->id.idiag_src, 0, sizeof(r->id.idiag_src));
-               memset(&r->id.idiag_dst, 0, sizeof(r->id.idiag_dst));
                 r->id.idiag_src[0] = smc->clcsock->sk->sk_rcv_saddr;
                 r->id.idiag_dst[0] = smc->clcsock->sk->sk_daddr;
  #if IS_ENABLED(CONFIG_IPV6)
diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c

index 548632621f4bc952b973c920af68f6477f0acc2b..d6ba186f67e2aa16d8f3822bea0950f0cb6612cc 100644
--- a/net/smc/smc_ib.c
+++ b/net/smc/smc_ib.c
@@ -573,6 +573,8 @@ static void smc_ib_remove_dev(struct ib_device *ibdev, void *client_data)
         struct smc_ib_device *smcibdev;
  
         smcibdev = ib_get_client_data(ibdev, &smc_ib_client);
+       if (!smcibdev || smcibdev->ibdev != ibdev)
+               return;
         ib_set_client_data(ibdev, &smc_ib_client, NULL);
         spin_lock(&smc_ib_devices.lock);
         list_del_init(&smcibdev->list); /* remove from smc_ib_devices */
diff --git a/net/sunrpc/xprtrdma/frwr_ops.c b/net/sunrpc/xprtrdma/frwr_ops.c

index 095be887753e026035ee25381b47528cf41d40ac..125297c9aa3e761185b3ddee8d681a5841ede944 100644
--- a/net/sunrpc/xprtrdma/frwr_ops.c
+++ b/net/sunrpc/xprtrdma/frwr_ops.c
@@ -288,8 +288,8 @@ struct rpcrdma_mr_seg *frwr_map(struct rpcrdma_xprt *r_xprt,
  {
         struct rpcrdma_ia *ia = &r_xprt->rx_ia;
         struct ib_reg_wr *reg_wr;
+       int i, n, dma_nents;
         struct ib_mr *ibmr;
-       int i, n;
         u8 key;
  
         if (nsegs > ia->ri_max_frwr_depth)
@@ -313,15 +313,16 @@ struct rpcrdma_mr_seg *frwr_map(struct rpcrdma_xprt *r_xprt,
                         break;
         }
         mr->mr_dir = rpcrdma_data_dir(writing);
+       mr->mr_nents = i;
  
-       mr->mr_nents =
-               ib_dma_map_sg(ia->ri_id->device, mr->mr_sg, i, mr->mr_dir);
-       if (!mr->mr_nents)
+       dma_nents = ib_dma_map_sg(ia->ri_id->device, mr->mr_sg, mr->mr_nents,
+                                 mr->mr_dir);
+       if (!dma_nents)
                 goto out_dmamap_err;
  
         ibmr = mr->frwr.fr_mr;
-       n = ib_map_mr_sg(ibmr, mr->mr_sg, mr->mr_nents, NULL, PAGE_SIZE);
-       if (unlikely(n != mr->mr_nents))
+       n = ib_map_mr_sg(ibmr, mr->mr_sg, dma_nents, NULL, PAGE_SIZE);
+       if (n != dma_nents)
                 goto out_mapmr_err;
  
         ibmr->iova &= 0x00000000ffffffff;
diff --git a/net/tipc/node.c b/net/tipc/node.c

index 99b28b69fc174a09cd6a73db84e7149ba673d51a..0c88778c88b5f142772330123998978fc50284a8 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -278,7 +278,7 @@ struct tipc_crypto *tipc_node_crypto_rx_by_list(struct list_head *pos)
  }
  #endif
  
-void tipc_node_free(struct rcu_head *rp)
+static void tipc_node_free(struct rcu_head *rp)
  {
         struct tipc_node *n = container_of(rp, struct tipc_node, rcu);
  
@@ -2798,7 +2798,7 @@ static int tipc_nl_retrieve_nodeid(struct nlattr **attrs, u8 **node_id)
         return 0;
  }
  
-int __tipc_nl_node_set_key(struct sk_buff *skb, struct genl_info *info)
+static int __tipc_nl_node_set_key(struct sk_buff *skb, struct genl_info *info)
  {
         struct nlattr *attrs[TIPC_NLA_NODE_MAX + 1];
         struct net *net = sock_net(skb->sk);
@@ -2875,7 +2875,8 @@ int tipc_nl_node_set_key(struct sk_buff *skb, struct genl_info *info)
         return err;
  }
  
-int __tipc_nl_node_flush_key(struct sk_buff *skb, struct genl_info *info)
+static int __tipc_nl_node_flush_key(struct sk_buff *skb,
+                                   struct genl_info *info)
  {
         struct net *net = sock_net(skb->sk);
         struct tipc_net *tn = tipc_net(net);
diff --git a/net/tipc/socket.c b/net/tipc/socket.c

index f9b4fb92c0b1c98f373bc2346e5d5e77c86d48f8..693e8902161efaa35f918dae4731a5c3a081c5fd 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -2441,6 +2441,8 @@ static int tipc_wait_for_connect(struct socket *sock, long *timeo_p)
                         return -ETIMEDOUT;
                 if (signal_pending(current))
                         return sock_intr_errno(*timeo_p);
+               if (sk->sk_state == TIPC_DISCONNECTING)
+                       break;
  
                 add_wait_queue(sk_sleep(sk), &wait);
                 done = sk_wait_event(sk, timeo_p, tipc_sk_connected(sk),
diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c

index 1ba5a92832bb0e065e0e9b8ac6f742b24d1d13c5..1c5574e2e05825140fd4b2ffc2ecd636147aba63 100644 (file)
--- a/net/tls/tls_device.c
+++ b/net/tls/tls_device.c
@@ -593,7 +593,7 @@ struct tls_record_info *tls_get_record(struct tls_offload_context_tx *context,
                                        u32 seq, u64 *p_record_sn)
  {
         u64 record_sn = context->hint_record_sn;
-       struct tls_record_info *info;
+       struct tls_record_info *info, *last;
  
         info = context->retransmit_hint;
         if (!info ||
@@ -605,6 +605,24 @@ struct tls_record_info *tls_get_record(struct tls_offload_context_tx *context,
                                                 struct tls_record_info, list);
                 if (!info)
                         return NULL;
+               /* send the start_marker record if seq number is before the
+                * tls offload start marker sequence number. This record is
+                * required to handle TCP packets which are before TLS offload
+                * started.
+                *  And if it's not start marker, look if this seq number
+                * belongs to the list.
+                */
+               if (likely(!tls_record_is_start_marker(info))) {
+                       /* we have the first record, get the last record to see
+                        * if this seq number belongs to the list.
+                        */
+                       last = list_last_entry(&context->records_list,
+                                              struct tls_record_info, list);
+
+                       if (!between(seq, tls_record_start_seq(info),
+                                    last->end_seq))
+                               return NULL;
+               }
                 record_sn = context->unacked_record_sn;
         }
  
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c

index 62c12cb5763e6d0ec4ef7daefd6836b6588b7185..68debcb28fa4c46eb2b75b0ec262958299eda960 100644 (file)
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -682,6 +682,7 @@ static int unix_set_peek_off(struct sock *sk, int val)
         return 0;
  }
  
+#ifdef CONFIG_PROC_FS
  static void unix_show_fdinfo(struct seq_file *m, struct socket *sock)
  {
         struct sock *sk = sock->sk;
@@ -692,6 +693,9 @@ static void unix_show_fdinfo(struct seq_file *m, struct socket *sock)
                 seq_printf(m, "scm_fds: %u\n", READ_ONCE(u->scm_stat.nr_fds));
         }
  }
+#else
+#define unix_show_fdinfo NULL
+#endif
  
  static const struct proto_ops unix_stream_ops = {
         .family =       PF_UNIX,
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c

index 9c5b2a91baad60945cd06881c57d1b0c56810a58..a5f28708e0e75402e595a38ee91c57e9637e0289 100644 (file)
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -451,6 +451,12 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk)
                 if (vsk->transport == new_transport)
                         return 0;
  
+               /* transport->release() must be called with sock lock acquired.
+                * This path can only be taken during vsock_stream_connect(),
+                * where we have already held the sock lock.
+                * In the other cases, this function is called on a new socket
+                * which is not assigned to any transport.
+                */
                 vsk->transport->release(vsk);
                 vsock_deassign_transport(vsk);
         }
@@ -753,20 +759,18 @@ static void __vsock_release(struct sock *sk, int level)
                 vsk = vsock_sk(sk);
                 pending = NULL; /* Compiler warning. */
  
-               /* The release call is supposed to use lock_sock_nested()
-                * rather than lock_sock(), if a sock lock should be acquired.
-                */
-               if (vsk->transport)
-                       vsk->transport->release(vsk);
-               else if (sk->sk_type == SOCK_STREAM)
-                       vsock_remove_sock(vsk);
-
                 /* When "level" is SINGLE_DEPTH_NESTING, use the nested
                  * version to avoid the warning "possible recursive locking
                  * detected". When "level" is 0, lock_sock_nested(sk, level)
                  * is the same as lock_sock(sk).
                  */
                 lock_sock_nested(sk, level);
+
+               if (vsk->transport)
+                       vsk->transport->release(vsk);
+               else if (sk->sk_type == SOCK_STREAM)
+                       vsock_remove_sock(vsk);
+
                 sock_orphan(sk);
                 sk->sk_shutdown = SHUTDOWN_MASK;
  
diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c

index 3492c021925f4b2163ff5aff46d334ae64bbdc41..630b851f8150fd2e67eac5187d75a1771b82f068 100644 (file)
--- a/net/vmw_vsock/hyperv_transport.c
+++ b/net/vmw_vsock/hyperv_transport.c
@@ -526,12 +526,9 @@ static bool hvs_close_lock_held(struct vsock_sock *vsk)
  
  static void hvs_release(struct vsock_sock *vsk)
  {
-       struct sock *sk = sk_vsock(vsk);
         bool remove_sock;
  
-       lock_sock_nested(sk, SINGLE_DEPTH_NESTING);
         remove_sock = hvs_close_lock_held(vsk);
-       release_sock(sk);
         if (remove_sock)
                 vsock_remove_sock(vsk);
  }
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c

index d9f0c9c5425a424c2c54e7794cd97764a9735ee1..f3c4bab2f737c97c08e38237b8108209f45eccad 100644 (file)
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -829,7 +829,6 @@ void virtio_transport_release(struct vsock_sock *vsk)
         struct sock *sk = &vsk->sk;
         bool remove_sock = true;
  
-       lock_sock_nested(sk, SINGLE_DEPTH_NESTING);
         if (sk->sk_type == SOCK_STREAM)
                 remove_sock = virtio_transport_close(vsk);
  
@@ -837,7 +836,6 @@ void virtio_transport_release(struct vsock_sock *vsk)
                 list_del(&pkt->list);
                 virtio_transport_free_pkt(pkt);
         }
-       release_sock(sk);
  
         if (remove_sock)
                 vsock_remove_sock(vsk);
diff --git a/net/wireless/ethtool.c b/net/wireless/ethtool.c

index a9c0f368db5d27ed72159a2395a4a4ec60234ed0..24e18405cdb48fff6090831f656b4f883c926109 100644 (file)
--- a/net/wireless/ethtool.c
+++ b/net/wireless/ethtool.c
@@ -7,9 +7,13 @@
  void cfg80211_get_drvinfo(struct net_device *dev, struct ethtool_drvinfo *info)
  {
         struct wireless_dev *wdev = dev->ieee80211_ptr;
+       struct device *pdev = wiphy_dev(wdev->wiphy);
  
-       strlcpy(info->driver, wiphy_dev(wdev->wiphy)->driver->name,
-               sizeof(info->driver));
+       if (pdev->driver)
+               strlcpy(info->driver, pdev->driver->name,
+                       sizeof(info->driver));
+       else
+               strlcpy(info->driver, "N/A", sizeof(info->driver));
  
         strlcpy(info->version, init_utsname()->release, sizeof(info->version));
  
diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c

index 123b8d720a596d31cf7abff3b450cc1c64ee8441..5b19e9fac4aac68e9ce222610fa8a75f9b1a9c29 100644 (file)
--- a/net/wireless/nl80211.c
+++ b/net/wireless/nl80211.c
@@ -20,6 +20,7 @@
  #include <linux/netlink.h>
  #include <linux/nospec.h>
  #include <linux/etherdevice.h>
+#include <linux/if_vlan.h>
  #include <net/net_namespace.h>
  #include <net/genetlink.h>
  #include <net/cfg80211.h>
@@ -437,6 +438,7 @@ const struct nla_policy nl80211_policy[NUM_NL80211_ATTR] = {
         [NL80211_ATTR_CONTROL_PORT_NO_ENCRYPT] = { .type = NLA_FLAG },
         [NL80211_ATTR_CONTROL_PORT_OVER_NL80211] = { .type = NLA_FLAG },
         [NL80211_ATTR_PRIVACY] = { .type = NLA_FLAG },
+       [NL80211_ATTR_STATUS_CODE] = { .type = NLA_U16 },
         [NL80211_ATTR_CIPHER_SUITE_GROUP] = { .type = NLA_U32 },
         [NL80211_ATTR_WPA_VERSIONS] = { .type = NLA_U32 },
         [NL80211_ATTR_PID] = { .type = NLA_U32 },
@@ -4799,8 +4801,7 @@ static int nl80211_start_ap(struct sk_buff *skb, struct genl_info *info)
                 err = nl80211_parse_he_obss_pd(
                                         info->attrs[NL80211_ATTR_HE_OBSS_PD],
                                         &params.he_obss_pd);
-               if (err)
-                       return err;
+               goto out;
         }
  
         nl80211_calculate_ap_params(&params);
@@ -4822,6 +4823,7 @@ static int nl80211_start_ap(struct sk_buff *skb, struct genl_info *info)
         }
         wdev_unlock(wdev);
  
+out:
         kfree(params.acl);
  
         return err;
diff --git a/net/wireless/reg.c b/net/wireless/reg.c

index fff9a74891fc433e5ac05970b23aa7f13adaad78..1a8218f1bbe075457441e39da8e14220e375701d 100644 (file)
--- a/net/wireless/reg.c
+++ b/net/wireless/reg.c
@@ -2276,7 +2276,7 @@ static void handle_channel_custom(struct wiphy *wiphy,
                         break;
         }
  
-       if (IS_ERR(reg_rule)) {
+       if (IS_ERR_OR_NULL(reg_rule)) {
                 pr_debug("Disabling freq %d MHz as custom regd has no rule that fits it\n",
                          chan->center_freq);
                 if (wiphy->regulatory_flags & REGULATORY_WIPHY_SELF_MANAGED) {
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c

index df600487a68d513c3f7abcb1b584d2f4c229a7dd..356f90e4522b4cc39bf05ae4765d0517c43e5dc4 100644 (file)
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -217,6 +217,7 @@ static int xsk_rcv(struct xdp_sock *xs, struct xdp_buff *xdp)
  static void xsk_flush(struct xdp_sock *xs)
  {
         xskq_prod_submit(xs->rx);
+       __xskq_cons_release(xs->umem->fq);
         sock_def_readable(&xs->sk);
  }
  
@@ -304,6 +305,7 @@ void xsk_umem_consume_tx_done(struct xdp_umem *umem)
  
         rcu_read_lock();
         list_for_each_entry_rcu(xs, &umem->xsk_list, list) {
+               __xskq_cons_release(xs->tx);
                 xs->sk.sk_write_space(&xs->sk);
         }
         rcu_read_unlock();
diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h

index bec2af11853addb48c658b7e046383ef8025169f..89a01ac4e07965cfafc342d205fbf21edc366c0b 100644 (file)
--- a/net/xdp/xsk_queue.h
+++ b/net/xdp/xsk_queue.h
@@ -271,7 +271,8 @@ static inline void xskq_cons_release(struct xsk_queue *q)
  {
         /* To improve performance, only update local state here.
          * Reflect this to global state when we get new entries
-        * from the ring in xskq_cons_get_entries().
+        * from the ring in xskq_cons_get_entries() and whenever
+        * Rx or Tx processing are completed in the NAPI loop.
          */
         q->cached_cons++;
  }
diff --git a/net/xfrm/xfrm_interface.c b/net/xfrm/xfrm_interface.c

index dc651a628dcf07df229ff85878c744a257f01f49..3361e3ac5714cc6c751afe3eed996f8956180696 100644 (file)
--- a/net/xfrm/xfrm_interface.c
+++ b/net/xfrm/xfrm_interface.c
@@ -300,10 +300,10 @@ xfrmi_xmit2(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
                         if (mtu < IPV6_MIN_MTU)
                                 mtu = IPV6_MIN_MTU;
  
-                       icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
+                       icmpv6_ndo_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
                 } else {
-                       icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
-                                 htonl(mtu));
+                       icmp_ndo_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
+                                     htonl(mtu));
                 }
  
                 dst_release(dst);
diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib

index bae62549e3d2a3308b1683f375fd4d9004cbe6f2..752ff0a225a9d3d4caaf824ac7a2272743234451 100644 (file)
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -300,15 +300,15 @@ DT_BINDING_DIR := Documentation/devicetree/bindings
  DT_TMP_SCHEMA := $(objtree)/$(DT_BINDING_DIR)/processed-schema.yaml
  
  quiet_cmd_dtb_check =  CHECK   $@
-      cmd_dtb_check =  $(DT_CHECKER) -u $(srctree)/$(DT_BINDING_DIR) -p $(DT_TMP_SCHEMA) $@ ;
+      cmd_dtb_check =  $(DT_CHECKER) -u $(srctree)/$(DT_BINDING_DIR) -p $(DT_TMP_SCHEMA) $@
  
-define rule_dtc_dt_yaml
+define rule_dtc
         $(call cmd_and_fixdep,dtc,yaml)
         $(call cmd,dtb_check)
  endef
  
  $(obj)/%.dt.yaml: $(src)/%.dts $(DTC) $(DT_TMP_SCHEMA) FORCE
-       $(call if_changed_rule,dtc_dt_yaml)
+       $(call if_changed_rule,dtc)
  
  dtc-tmp = $(subst $(comma),_,$(dot-target).dts.tmp)
  
diff --git a/scripts/get_maintainer.pl b/scripts/get_maintainer.pl

index 34085d146fa2cadb0a92353c0f6cc0d89303af29..6cbcd1a3e113c1f5866fc02ae4f8b40f16276be0 100755 (executable)
--- a/scripts/get_maintainer.pl
+++ b/scripts/get_maintainer.pl
@@ -932,10 +932,6 @@ sub get_maintainers {
         }
      }
  
-    foreach my $fix (@fixes) {
-       vcs_add_commit_signers($fix, "blamed_fixes");
-    }
-
      foreach my $email (@email_to, @list_to) {
         $email->[0] = deduplicate_email($email->[0]);
      }
@@ -974,6 +970,10 @@ sub get_maintainers {
         }
      }
  
+    foreach my $fix (@fixes) {
+       vcs_add_commit_signers($fix, "blamed_fixes");
+    }
+
      my @to = ();
      if ($email || $email_list) {
         if ($email) {
@@ -1341,35 +1341,11 @@ sub add_categories {
                     }
                 }
             } elsif ($ptype eq "M") {
-               my ($name, $address) = parse_email($pvalue);
-               if ($name eq "") {
-                   if ($i > 0) {
-                       my $tv = $typevalue[$i - 1];
-                       if ($tv =~ m/^([A-Z]):\s*(.*)/) {
-                           if ($1 eq "P") {
-                               $name = $2;
-                               $pvalue = format_email($name, $address, $email_usename);
-                           }
-                       }
-                   }
-               }
                 if ($email_maintainer) {
                     my $role = get_maintainer_role($i);
                     push_email_addresses($pvalue, $role);
                 }
             } elsif ($ptype eq "R") {
-               my ($name, $address) = parse_email($pvalue);
-               if ($name eq "") {
-                   if ($i > 0) {
-                       my $tv = $typevalue[$i - 1];
-                       if ($tv =~ m/^([A-Z]):\s*(.*)/) {
-                           if ($1 eq "P") {
-                               $name = $2;
-                               $pvalue = format_email($name, $address, $email_usename);
-                           }
-                       }
-                   }
-               }
                 if ($email_reviewer) {
                     my $subsystem = get_subsystem_name($i);
                     push_email_addresses($pvalue, "reviewer:$subsystem");
diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c

index a566d8201b56c530d93cee048d9a45a6a3ab26a9..0133dfaaf3529c83c542c06756361224f6d464ca 100644 (file)
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -210,7 +210,7 @@ static struct sym_entry *read_symbol(FILE *in)
  
         len = strlen(name) + 1;
  
-       sym = malloc(sizeof(*sym) + len);
+       sym = malloc(sizeof(*sym) + len + 1);
         if (!sym) {
                 fprintf(stderr, "kallsyms failure: "
                         "unable to allocate required amount of memory\n");
@@ -219,7 +219,7 @@ static struct sym_entry *read_symbol(FILE *in)
         sym->addr = addr;
         sym->len = len;
         sym->sym[0] = type;
-       memcpy(sym_name(sym), name, len);
+       strcpy(sym_name(sym), name);
         sym->percpu_absolute = 0;
  
         return sym;
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh

index 1919c311c1491af4027ebdccc3b998308da4043f..dd484e92752edf3509e2e8b74ff0e395e46e4470 100755 (executable)
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -239,7 +239,7 @@ else
  fi;
  
  # final build of init/
-${MAKE} -f "${srctree}/scripts/Makefile.build" obj=init
+${MAKE} -f "${srctree}/scripts/Makefile.build" obj=init need-builtin=1
  
  #link vmlinux.o
  info LD vmlinux.o
diff --git a/security/integrity/ima/Kconfig b/security/integrity/ima/Kconfig

index 711ff10fa36eca6ad201deca04abc41ff2858339..3f3ee4e2eb0d1c739ca57f93ad1dd7fe421157ff 100644 (file)
--- a/security/integrity/ima/Kconfig
+++ b/security/integrity/ima/Kconfig
@@ -112,6 +112,10 @@ choice
         config IMA_DEFAULT_HASH_WP512
                 bool "WP512"
                 depends on CRYPTO_WP512=y && !IMA_TEMPLATE
+
+       config IMA_DEFAULT_HASH_SM3
+               bool "SM3"
+               depends on CRYPTO_SM3=y && !IMA_TEMPLATE
  endchoice
  
  config IMA_DEFAULT_HASH
@@ -121,6 +125,7 @@ config IMA_DEFAULT_HASH
         default "sha256" if IMA_DEFAULT_HASH_SHA256
         default "sha512" if IMA_DEFAULT_HASH_SHA512
         default "wp512" if IMA_DEFAULT_HASH_WP512
+       default "sm3" if IMA_DEFAULT_HASH_SM3
  
  config IMA_WRITE_POLICY
         bool "Enable multiple writes to the IMA policy"
diff --git a/security/integrity/platform_certs/load_uefi.c b/security/integrity/platform_certs/load_uefi.c

index 111898aad56e48b99d93d55411adedbc82ad8670..f0c908241966ae3f387a55d4ddc813150135b91b 100644 (file)
--- a/security/integrity/platform_certs/load_uefi.c
+++ b/security/integrity/platform_certs/load_uefi.c
@@ -35,16 +35,18 @@ static __init bool uefi_check_ignore_db(void)
   * Get a certificate list blob from the named EFI variable.
   */
  static __init void *get_cert_list(efi_char16_t *name, efi_guid_t *guid,
-                                 unsigned long *size)
+                                 unsigned long *size, efi_status_t *status)
  {
-       efi_status_t status;
         unsigned long lsize = 4;
         unsigned long tmpdb[4];
         void *db;
  
-       status = efi.get_variable(name, guid, NULL, &lsize, &tmpdb);
-       if (status != EFI_BUFFER_TOO_SMALL) {
-               pr_err("Couldn't get size: 0x%lx\n", status);
+       *status = efi.get_variable(name, guid, NULL, &lsize, &tmpdb);
+       if (*status == EFI_NOT_FOUND)
+               return NULL;
+
+       if (*status != EFI_BUFFER_TOO_SMALL) {
+               pr_err("Couldn't get size: 0x%lx\n", *status);
                 return NULL;
         }
  
@@ -52,10 +54,10 @@ static __init void *get_cert_list(efi_char16_t *name, efi_guid_t *guid,
         if (!db)
                 return NULL;
  
-       status = efi.get_variable(name, guid, NULL, &lsize, db);
-       if (status != EFI_SUCCESS) {
+       *status = efi.get_variable(name, guid, NULL, &lsize, db);
+       if (*status != EFI_SUCCESS) {
                 kfree(db);
-               pr_err("Error reading db var: 0x%lx\n", status);
+               pr_err("Error reading db var: 0x%lx\n", *status);
                 return NULL;
         }
  
@@ -74,6 +76,7 @@ static int __init load_uefi_certs(void)
         efi_guid_t mok_var = EFI_SHIM_LOCK_GUID;
         void *db = NULL, *dbx = NULL, *mok = NULL;
         unsigned long dbsize = 0, dbxsize = 0, moksize = 0;
+       efi_status_t status;
         int rc = 0;
  
         if (!efi.get_variable)
@@ -83,9 +86,12 @@ static int __init load_uefi_certs(void)
          * an error if we can't get them.
          */
         if (!uefi_check_ignore_db()) {
-               db = get_cert_list(L"db", &secure_var, &dbsize);
+               db = get_cert_list(L"db", &secure_var, &dbsize, &status);
                 if (!db) {
-                       pr_err("MODSIGN: Couldn't get UEFI db list\n");
+                       if (status == EFI_NOT_FOUND)
+                               pr_debug("MODSIGN: db variable wasn't found\n");
+                       else
+                               pr_err("MODSIGN: Couldn't get UEFI db list\n");
                 } else {
                         rc = parse_efi_signature_list("UEFI:db",
                                         db, dbsize, get_handler_for_db);
@@ -96,9 +102,12 @@ static int __init load_uefi_certs(void)
                 }
         }
  
-       mok = get_cert_list(L"MokListRT", &mok_var, &moksize);
+       mok = get_cert_list(L"MokListRT", &mok_var, &moksize, &status);
         if (!mok) {
-               pr_info("Couldn't get UEFI MokListRT\n");
+               if (status == EFI_NOT_FOUND)
+                       pr_debug("MokListRT variable wasn't found\n");
+               else
+                       pr_info("Couldn't get UEFI MokListRT\n");
         } else {
                 rc = parse_efi_signature_list("UEFI:MokListRT",
                                               mok, moksize, get_handler_for_db);
@@ -107,9 +116,12 @@ static int __init load_uefi_certs(void)
                 kfree(mok);
         }
  
-       dbx = get_cert_list(L"dbx", &secure_var, &dbxsize);
+       dbx = get_cert_list(L"dbx", &secure_var, &dbxsize, &status);
         if (!dbx) {
-               pr_info("Couldn't get UEFI dbx list\n");
+               if (status == EFI_NOT_FOUND)
+                       pr_debug("dbx variable wasn't found\n");
+               else
+                       pr_info("Couldn't get UEFI dbx list\n");
         } else {
                 rc = parse_efi_signature_list("UEFI:dbx",
                                               dbx, dbxsize,
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c

index 336406bcb59e2cd3ded45db70449d390deac2850..d5443eeb8b6338a2a51d7a1c158e4d87c0184e16 100644 (file)
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -2594,7 +2594,8 @@ void snd_pcm_release_substream(struct snd_pcm_substream *substream)
  
         snd_pcm_drop(substream);
         if (substream->hw_opened) {
-               do_hw_free(substream);
+               if (substream->runtime->status->state != SNDRV_PCM_STATE_OPEN)
+                       do_hw_free(substream);
                 substream->ops->close(substream);
                 substream->hw_opened = 0;
         }
diff --git a/sound/core/seq/seq_clientmgr.c b/sound/core/seq/seq_clientmgr.c

index 6d9592f0ae1d53c42b3d7d59bfdc462f39a34f77..cc93157fa950045b8d74e2fc4a6b1dc4e7753a5d 100644 (file)
--- a/sound/core/seq/seq_clientmgr.c
+++ b/sound/core/seq/seq_clientmgr.c
@@ -580,7 +580,7 @@ static int update_timestamp_of_queue(struct snd_seq_event *event,
         event->queue = queue;
         event->flags &= ~SNDRV_SEQ_TIME_STAMP_MASK;
         if (real_time) {
-               event->time.time = snd_seq_timer_get_cur_time(q->timer);
+               event->time.time = snd_seq_timer_get_cur_time(q->timer, true);
                 event->flags |= SNDRV_SEQ_TIME_STAMP_REAL;
         } else {
                 event->time.tick = snd_seq_timer_get_cur_tick(q->timer);
@@ -1659,7 +1659,7 @@ static int snd_seq_ioctl_get_queue_status(struct snd_seq_client *client,
         tmr = queue->timer;
         status->events = queue->tickq->cells + queue->timeq->cells;
  
-       status->time = snd_seq_timer_get_cur_time(tmr);
+       status->time = snd_seq_timer_get_cur_time(tmr, true);
         status->tick = snd_seq_timer_get_cur_tick(tmr);
  
         status->running = tmr->running;
diff --git a/sound/core/seq/seq_queue.c b/sound/core/seq/seq_queue.c

index caf68bf42f134be3e802c976b6b824f38d64c82d..71a6ea62c3be7c8a0a1885da0b32b2b527ef8644 100644 (file)
--- a/sound/core/seq/seq_queue.c
+++ b/sound/core/seq/seq_queue.c
@@ -238,6 +238,8 @@ void snd_seq_check_queue(struct snd_seq_queue *q, int atomic, int hop)
  {
         unsigned long flags;
         struct snd_seq_event_cell *cell;
+       snd_seq_tick_time_t cur_tick;
+       snd_seq_real_time_t cur_time;
  
         if (q == NULL)
                 return;
@@ -254,17 +256,18 @@ void snd_seq_check_queue(struct snd_seq_queue *q, int atomic, int hop)
  
        __again:
         /* Process tick queue... */
+       cur_tick = snd_seq_timer_get_cur_tick(q->timer);
         for (;;) {
-               cell = snd_seq_prioq_cell_out(q->tickq,
-                                             &q->timer->tick.cur_tick);
+               cell = snd_seq_prioq_cell_out(q->tickq, &cur_tick);
                 if (!cell)
                         break;
                 snd_seq_dispatch_event(cell, atomic, hop);
         }
  
         /* Process time queue... */
+       cur_time = snd_seq_timer_get_cur_time(q->timer, false);
         for (;;) {
-               cell = snd_seq_prioq_cell_out(q->timeq, &q->timer->cur_time);
+               cell = snd_seq_prioq_cell_out(q->timeq, &cur_time);
                 if (!cell)
                         break;
                 snd_seq_dispatch_event(cell, atomic, hop);
@@ -392,6 +395,7 @@ int snd_seq_queue_check_access(int queueid, int client)
  int snd_seq_queue_set_owner(int queueid, int client, int locked)
  {
         struct snd_seq_queue *q = queueptr(queueid);
+       unsigned long flags;
  
         if (q == NULL)
                 return -EINVAL;
@@ -401,8 +405,10 @@ int snd_seq_queue_set_owner(int queueid, int client, int locked)
                 return -EPERM;
         }
  
+       spin_lock_irqsave(&q->owner_lock, flags);
         q->locked = locked ? 1 : 0;
         q->owner = client;
+       spin_unlock_irqrestore(&q->owner_lock, flags);
         queue_access_unlock(q);
         queuefree(q);
  
@@ -539,15 +545,17 @@ void snd_seq_queue_client_termination(int client)
         unsigned long flags;
         int i;
         struct snd_seq_queue *q;
+       bool matched;
  
         for (i = 0; i < SNDRV_SEQ_MAX_QUEUES; i++) {
                 if ((q = queueptr(i)) == NULL)
                         continue;
                 spin_lock_irqsave(&q->owner_lock, flags);
-               if (q->owner == client)
+               matched = (q->owner == client);
+               if (matched)
                         q->klocked = 1;
                 spin_unlock_irqrestore(&q->owner_lock, flags);
-               if (q->owner == client) {
+               if (matched) {
                         if (q->timer->running)
                                 snd_seq_timer_stop(q->timer);
                         snd_seq_timer_reset(q->timer);
@@ -739,6 +747,8 @@ void snd_seq_info_queues_read(struct snd_info_entry *entry,
         int i, bpm;
         struct snd_seq_queue *q;
         struct snd_seq_timer *tmr;
+       bool locked;
+       int owner;
  
         for (i = 0; i < SNDRV_SEQ_MAX_QUEUES; i++) {
                 if ((q = queueptr(i)) == NULL)
@@ -750,9 +760,14 @@ void snd_seq_info_queues_read(struct snd_info_entry *entry,
                 else
                         bpm = 0;
  
+               spin_lock_irq(&q->owner_lock);
+               locked = q->locked;
+               owner = q->owner;
+               spin_unlock_irq(&q->owner_lock);
+
                 snd_iprintf(buffer, "queue %d: [%s]\n", q->queue, q->name);
-               snd_iprintf(buffer, "owned by client    : %d\n", q->owner);
-               snd_iprintf(buffer, "lock status        : %s\n", q->locked ? "Locked" : "Free");
+               snd_iprintf(buffer, "owned by client    : %d\n", owner);
+               snd_iprintf(buffer, "lock status        : %s\n", locked ? "Locked" : "Free");
                 snd_iprintf(buffer, "queued time events : %d\n", snd_seq_prioq_avail(q->timeq));
                 snd_iprintf(buffer, "queued tick events : %d\n", snd_seq_prioq_avail(q->tickq));
                 snd_iprintf(buffer, "timer state        : %s\n", tmr->running ? "Running" : "Stopped");
diff --git a/sound/core/seq/seq_timer.c b/sound/core/seq/seq_timer.c

index be59b59c9be40c233995817f3ddb3e5d71fbb6d7..1645e4142e30246dc30af69857d1dc8e9f736219 100644 (file)
--- a/sound/core/seq/seq_timer.c
+++ b/sound/core/seq/seq_timer.c
@@ -428,14 +428,15 @@ int snd_seq_timer_continue(struct snd_seq_timer *tmr)
  }
  
  /* return current 'real' time. use timeofday() to get better granularity. */
-snd_seq_real_time_t snd_seq_timer_get_cur_time(struct snd_seq_timer *tmr)
+snd_seq_real_time_t snd_seq_timer_get_cur_time(struct snd_seq_timer *tmr,
+                                              bool adjust_ktime)
  {
         snd_seq_real_time_t cur_time;
         unsigned long flags;
  
         spin_lock_irqsave(&tmr->lock, flags);
         cur_time = tmr->cur_time;
-       if (tmr->running) { 
+       if (adjust_ktime && tmr->running) {
                 struct timespec64 tm;
  
                 ktime_get_ts64(&tm);
@@ -452,7 +453,13 @@ snd_seq_real_time_t snd_seq_timer_get_cur_time(struct snd_seq_timer *tmr)
   high PPQ values) */
  snd_seq_tick_time_t snd_seq_timer_get_cur_tick(struct snd_seq_timer *tmr)
  {
-       return tmr->tick.cur_tick;
+       snd_seq_tick_time_t cur_tick;
+       unsigned long flags;
+
+       spin_lock_irqsave(&tmr->lock, flags);
+       cur_tick = tmr->tick.cur_tick;
+       spin_unlock_irqrestore(&tmr->lock, flags);
+       return cur_tick;
  }
  
  
diff --git a/sound/core/seq/seq_timer.h b/sound/core/seq/seq_timer.h

index 66c3e344eae37f9285e90ab3b8d607f6d1f4bb02..4bec57df8158caf81e6cdba3248d7e252bd34643 100644 (file)
--- a/sound/core/seq/seq_timer.h
+++ b/sound/core/seq/seq_timer.h
@@ -120,7 +120,8 @@ int snd_seq_timer_set_tempo_ppq(struct snd_seq_timer *tmr, int tempo, int ppq);
  int snd_seq_timer_set_position_tick(struct snd_seq_timer *tmr, snd_seq_tick_time_t position);
  int snd_seq_timer_set_position_time(struct snd_seq_timer *tmr, snd_seq_real_time_t position);
  int snd_seq_timer_set_skew(struct snd_seq_timer *tmr, unsigned int skew, unsigned int base);
-snd_seq_real_time_t snd_seq_timer_get_cur_time(struct snd_seq_timer *tmr);
+snd_seq_real_time_t snd_seq_timer_get_cur_time(struct snd_seq_timer *tmr,
+                                              bool adjust_ktime);
  snd_seq_tick_time_t snd_seq_timer_get_cur_tick(struct snd_seq_timer *tmr);
  
  extern int seq_default_timer_class;
diff --git a/sound/hda/ext/hdac_ext_controller.c b/sound/hda/ext/hdac_ext_controller.c

index a684f0520b4b982f71897967641d05ce0dca5128..4d060d5b1db6dc7eabfd5d44c3ef965792925537 100644 (file)
--- a/sound/hda/ext/hdac_ext_controller.c
+++ b/sound/hda/ext/hdac_ext_controller.c
@@ -254,6 +254,7 @@ EXPORT_SYMBOL_GPL(snd_hdac_ext_bus_link_power_down_all);
  int snd_hdac_ext_bus_link_get(struct hdac_bus *bus,
                                 struct hdac_ext_link *link)
  {
+       unsigned long codec_mask;
         int ret = 0;
  
         mutex_lock(&bus->lock);
@@ -280,9 +281,11 @@ int snd_hdac_ext_bus_link_get(struct hdac_bus *bus,
                  *  HDA spec section 4.3 - Codec Discovery
                  */
                 udelay(521);
-               bus->codec_mask = snd_hdac_chip_readw(bus, STATESTS);
-               dev_dbg(bus->dev, "codec_mask = 0x%lx\n", bus->codec_mask);
-               snd_hdac_chip_writew(bus, STATESTS, bus->codec_mask);
+               codec_mask = snd_hdac_chip_readw(bus, STATESTS);
+               dev_dbg(bus->dev, "codec_mask = 0x%lx\n", codec_mask);
+               snd_hdac_chip_writew(bus, STATESTS, codec_mask);
+               if (!bus->codec_mask)
+                       bus->codec_mask = codec_mask;
         }
  
         mutex_unlock(&bus->lock);
diff --git a/sound/hda/hdmi_chmap.c b/sound/hda/hdmi_chmap.c

index 5fd6d575e123b0e4cb20a986ecf8010d1795023c..aad5c4bf4d3441754b0ec1d5d046b438b1ee6c5e 100644 (file)
--- a/sound/hda/hdmi_chmap.c
+++ b/sound/hda/hdmi_chmap.c
@@ -250,7 +250,7 @@ void snd_hdac_print_channel_allocation(int spk_alloc, char *buf, int buflen)
  
         for (i = 0, j = 0; i < ARRAY_SIZE(cea_speaker_allocation_names); i++) {
                 if (spk_alloc & (1 << i))
-                       j += snprintf(buf + j, buflen - j,  " %s",
+                       j += scnprintf(buf + j, buflen - j,  " %s",
                                         cea_speaker_allocation_names[i]);
         }
         buf[j] = '\0';  /* necessary when j == 0 */
diff --git a/sound/pci/hda/hda_codec.c b/sound/pci/hda/hda_codec.c

index 5dc42f932739f3b385f826cb4431e0ed85731a0f..53e7732ef75209b63617ecc700c55cf994064bf8 100644 (file)
--- a/sound/pci/hda/hda_codec.c
+++ b/sound/pci/hda/hda_codec.c
@@ -4022,7 +4022,7 @@ void snd_print_pcm_bits(int pcm, char *buf, int buflen)
  
         for (i = 0, j = 0; i < ARRAY_SIZE(bits); i++)
                 if (pcm & (AC_SUPPCM_BITS_8 << i))
-                       j += snprintf(buf + j, buflen - j,  " %d", bits[i]);
+                       j += scnprintf(buf + j, buflen - j,  " %d", bits[i]);
  
         buf[j] = '\0'; /* necessary when j == 0 */
  }
diff --git a/sound/pci/hda/hda_eld.c b/sound/pci/hda/hda_eld.c
index bb46c89b7f63d9c06e8500065170bbabbc8205e7..136477ed46ae2673db07fcd200b61871063dab07 100644
--- a/sound/pci/hda/hda_eld.c
+++ b/sound/pci/hda/hda_eld.c
@@ -360,7 +360,7 @@ static void hdmi_print_pcm_rates(int pcm, char *buf, int buflen)
  
         for (i = 0, j = 0; i < ARRAY_SIZE(alsa_rates); i++)
                 if (pcm & (1 << i))
-                       j += snprintf(buf + j, buflen - j,  " %d",
+                       j += scnprintf(buf + j, buflen - j,  " %d",
                                 alsa_rates[i]);
  
         buf[j] = '\0'; /* necessary when j == 0 */
diff --git a/sound/pci/hda/hda_sysfs.c b/sound/pci/hda/hda_sysfs.c
index 0607ed5d1959958da12647ba6b4b165c5683d4da..eb8ec109d7adb5ce187567b2a5a8c10bdc3eec05 100644
--- a/sound/pci/hda/hda_sysfs.c
+++ b/sound/pci/hda/hda_sysfs.c
@@ -222,7 +222,7 @@ static ssize_t init_verbs_show(struct device *dev,
         int i, len = 0;
         mutex_lock(&codec->user_mutex);
         snd_array_for_each(&codec->init_verbs, i, v) {
-               len += snprintf(buf + len, PAGE_SIZE - len,
+               len += scnprintf(buf + len, PAGE_SIZE - len,
                                 "0x%02x 0x%03x 0x%04x\n",
                                 v->nid, v->verb, v->param);
         }
@@ -272,7 +272,7 @@ static ssize_t hints_show(struct device *dev,
         int i, len = 0;
         mutex_lock(&codec->user_mutex);
         snd_array_for_each(&codec->hints, i, hint) {
-               len += snprintf(buf + len, PAGE_SIZE - len,
+               len += scnprintf(buf + len, PAGE_SIZE - len,
                                 "%s = %s\n", hint->key, hint->val);
         }
         mutex_unlock(&codec->user_mutex);
diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c
index 4770fb3f51fb4c641dd5cdab254420c733154174..477589e7ec1d90b8346542c3dd8cf8bd57b9063f 100644
--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -2447,6 +2447,9 @@ static const struct snd_pci_quirk alc882_fixup_tbl[] = {
         SND_PCI_QUIRK(0x1071, 0x8258, "Evesham Voyaeger", ALC882_FIXUP_EAPD),
         SND_PCI_QUIRK(0x1458, 0xa002, "Gigabyte EP45-DS3/Z87X-UD3H", ALC889_FIXUP_FRONT_HP_NO_PRESENCE),
         SND_PCI_QUIRK(0x1458, 0xa0b8, "Gigabyte AZ370-Gaming", ALC1220_FIXUP_GB_DUAL_CODECS),
+       SND_PCI_QUIRK(0x1462, 0x1228, "MSI-GP63", ALC1220_FIXUP_CLEVO_P950),
+       SND_PCI_QUIRK(0x1462, 0x1276, "MSI-GL73", ALC1220_FIXUP_CLEVO_P950),
+       SND_PCI_QUIRK(0x1462, 0x1293, "MSI-GP65", ALC1220_FIXUP_CLEVO_P950),
         SND_PCI_QUIRK(0x1462, 0x7350, "MSI-7350", ALC889_FIXUP_CD),
         SND_PCI_QUIRK(0x1462, 0xda57, "MSI Z270-Gaming", ALC1220_FIXUP_GB_DUAL_CODECS),
         SND_PCI_QUIRK_VENDOR(0x1462, "MSI", ALC882_FIXUP_GPIO3),
@@ -5701,8 +5704,11 @@ static void alc_fixup_headset_jack(struct hda_codec *codec,
                 break;
         case HDA_FIXUP_ACT_INIT:
                 switch (codec->core.vendor_id) {
+               case 0x10ec0215:
                 case 0x10ec0225:
+               case 0x10ec0285:
                 case 0x10ec0295:
+               case 0x10ec0289:
                 case 0x10ec0299:
                         alc_write_coef_idx(codec, 0x48, 0xd011);
                         alc_update_coef_idx(codec, 0x49, 0x007f, 0x0045);
diff --git a/sound/soc/amd/raven/acp3x-i2s.c b/sound/soc/amd/raven/acp3x-i2s.c
index 31cd4008e33ff88d9b61085133d12755a60f1724..91a388184e525d66f0953effc36f0b6204a04972 100644
--- a/sound/soc/amd/raven/acp3x-i2s.c
+++ b/sound/soc/amd/raven/acp3x-i2s.c
@@ -170,6 +170,7 @@ static int acp3x_i2s_trigger(struct snd_pcm_substream *substream,
         struct snd_soc_card *card;
         struct acp3x_platform_info *pinfo;
         u32 ret, val, period_bytes, reg_val, ier_val, water_val;
+       u32 buf_size, buf_reg;
  
         prtd = substream->private_data;
         rtd = substream->runtime->private_data;
@@ -183,6 +184,8 @@ static int acp3x_i2s_trigger(struct snd_pcm_substream *substream,
         }
         period_bytes = frames_to_bytes(substream->runtime,
                         substream->runtime->period_size);
+       buf_size = frames_to_bytes(substream->runtime,
+                       substream->runtime->buffer_size);
         switch (cmd) {
         case SNDRV_PCM_TRIGGER_START:
         case SNDRV_PCM_TRIGGER_RESUME:
@@ -196,6 +199,7 @@ static int acp3x_i2s_trigger(struct snd_pcm_substream *substream,
                                         mmACP_BT_TX_INTR_WATERMARK_SIZE;
                                 reg_val = mmACP_BTTDM_ITER;
                                 ier_val = mmACP_BTTDM_IER;
+                               buf_reg = mmACP_BT_TX_RINGBUFSIZE;
                                 break;
                         case I2S_SP_INSTANCE:
                         default:
@@ -203,6 +207,7 @@ static int acp3x_i2s_trigger(struct snd_pcm_substream *substream,
                                         mmACP_I2S_TX_INTR_WATERMARK_SIZE;
                                 reg_val = mmACP_I2STDM_ITER;
                                 ier_val = mmACP_I2STDM_IER;
+                               buf_reg = mmACP_I2S_TX_RINGBUFSIZE;
                         }
                 } else {
                         switch (rtd->i2s_instance) {
@@ -211,6 +216,7 @@ static int acp3x_i2s_trigger(struct snd_pcm_substream *substream,
                                         mmACP_BT_RX_INTR_WATERMARK_SIZE;
                                 reg_val = mmACP_BTTDM_IRER;
                                 ier_val = mmACP_BTTDM_IER;
+                               buf_reg = mmACP_BT_RX_RINGBUFSIZE;
                                 break;
                         case I2S_SP_INSTANCE:
                         default:
@@ -218,9 +224,11 @@ static int acp3x_i2s_trigger(struct snd_pcm_substream *substream,
                                         mmACP_I2S_RX_INTR_WATERMARK_SIZE;
                                 reg_val = mmACP_I2STDM_IRER;
                                 ier_val = mmACP_I2STDM_IER;
+                               buf_reg = mmACP_I2S_RX_RINGBUFSIZE;
                         }
                 }
                 rv_writel(period_bytes, rtd->acp3x_base + water_val);
+               rv_writel(buf_size, rtd->acp3x_base + buf_reg);
                 val = rv_readl(rtd->acp3x_base + reg_val);
                 val = val | BIT(0);
                 rv_writel(val, rtd->acp3x_base + reg_val);
diff --git a/sound/soc/amd/raven/acp3x-pcm-dma.c b/sound/soc/amd/raven/acp3x-pcm-dma.c
index aecc3c06167907a00634559a25dea93c04fa0f8a..d62c0d90c41e34d99421f840312c8c77c04c793b 100644
--- a/sound/soc/amd/raven/acp3x-pcm-dma.c
+++ b/sound/soc/amd/raven/acp3x-pcm-dma.c
@@ -110,7 +110,7 @@ static void config_acp3x_dma(struct i2s_stream_instance *rtd, int direction)
  {
         u16 page_idx;
         u32 low, high, val, acp_fifo_addr, reg_fifo_addr;
-       u32 reg_ringbuf_size, reg_dma_size, reg_fifo_size;
+       u32 reg_dma_size, reg_fifo_size;
         dma_addr_t addr;
  
         addr = rtd->dma_addr;
@@ -157,7 +157,6 @@ static void config_acp3x_dma(struct i2s_stream_instance *rtd, int direction)
         if (direction == SNDRV_PCM_STREAM_PLAYBACK) {
                 switch (rtd->i2s_instance) {
                 case I2S_BT_INSTANCE:
-                       reg_ringbuf_size = mmACP_BT_TX_RINGBUFSIZE;
                         reg_dma_size = mmACP_BT_TX_DMA_SIZE;
                         acp_fifo_addr = ACP_SRAM_PTE_OFFSET +
                                                 BT_PB_FIFO_ADDR_OFFSET;
@@ -169,7 +168,6 @@ static void config_acp3x_dma(struct i2s_stream_instance *rtd, int direction)
  
                 case I2S_SP_INSTANCE:
                 default:
-                       reg_ringbuf_size = mmACP_I2S_TX_RINGBUFSIZE;
                         reg_dma_size = mmACP_I2S_TX_DMA_SIZE;
                         acp_fifo_addr = ACP_SRAM_PTE_OFFSET +
                                                 SP_PB_FIFO_ADDR_OFFSET;
@@ -181,7 +179,6 @@ static void config_acp3x_dma(struct i2s_stream_instance *rtd, int direction)
         } else {
                 switch (rtd->i2s_instance) {
                 case I2S_BT_INSTANCE:
-                       reg_ringbuf_size = mmACP_BT_RX_RINGBUFSIZE;
                         reg_dma_size = mmACP_BT_RX_DMA_SIZE;
                         acp_fifo_addr = ACP_SRAM_PTE_OFFSET +
                                                 BT_CAPT_FIFO_ADDR_OFFSET;
@@ -193,7 +190,6 @@ static void config_acp3x_dma(struct i2s_stream_instance *rtd, int direction)
  
                 case I2S_SP_INSTANCE:
                 default:
-                       reg_ringbuf_size = mmACP_I2S_RX_RINGBUFSIZE;
                         reg_dma_size = mmACP_I2S_RX_DMA_SIZE;
                         acp_fifo_addr = ACP_SRAM_PTE_OFFSET +
                                                 SP_CAPT_FIFO_ADDR_OFFSET;
@@ -203,7 +199,6 @@ static void config_acp3x_dma(struct i2s_stream_instance *rtd, int direction)
                                 rtd->acp3x_base + mmACP_I2S_RX_RINGBUFADDR);
                 }
         }
-       rv_writel(MAX_BUFFER, rtd->acp3x_base + reg_ringbuf_size);
         rv_writel(DMA_SIZE, rtd->acp3x_base + reg_dma_size);
         rv_writel(acp_fifo_addr, rtd->acp3x_base + reg_fifo_addr);
         rv_writel(FIFO_SIZE, rtd->acp3x_base + reg_fifo_size);
diff --git a/sound/soc/amd/raven/pci-acp3x.c b/sound/soc/amd/raven/pci-acp3x.c
index 65330bb50e74c991e8b0abed3d7e584a10785e15..da60e2ec5535172710229db1d461bff899a965b1 100644
--- a/sound/soc/amd/raven/pci-acp3x.c
+++ b/sound/soc/amd/raven/pci-acp3x.c
@@ -45,23 +45,6 @@ static int acp3x_power_on(void __iomem *acp3x_base)
         return -ETIMEDOUT;
  }
  
-static int acp3x_power_off(void __iomem *acp3x_base)
-{
-       u32 val;
-       int timeout;
-
-       rv_writel(ACP_PGFSM_CNTL_POWER_OFF_MASK,
-                       acp3x_base + mmACP_PGFSM_CONTROL);
-       timeout = 0;
-       while (++timeout < 500) {
-               val = rv_readl(acp3x_base + mmACP_PGFSM_STATUS);
-               if ((val & ACP_PGFSM_STATUS_MASK) == ACP_POWERED_OFF)
-                       return 0;
-               udelay(1);
-       }
-       return -ETIMEDOUT;
-}
-
  static int acp3x_reset(void __iomem *acp3x_base)
  {
         u32 val;
@@ -115,12 +98,6 @@ static int acp3x_deinit(void __iomem *acp3x_base)
                 pr_err("ACP3x reset failed\n");
                 return ret;
         }
-       /* power off */
-       ret = acp3x_power_off(acp3x_base);
-       if (ret) {
-               pr_err("ACP3x power off failed\n");
-               return ret;
-       }
         return 0;
  }
  
diff --git a/sound/soc/atmel/Kconfig b/sound/soc/atmel/Kconfig
index d1dc8e6366dcbd3356cb39dafb2cc41753bd4896..71f2d42188c461a4169af85cac84511851f122ac 100644
--- a/sound/soc/atmel/Kconfig
+++ b/sound/soc/atmel/Kconfig
@@ -10,11 +10,11 @@ config SND_ATMEL_SOC
  if SND_ATMEL_SOC
  
  config SND_ATMEL_SOC_PDC
-       tristate
+       bool
         depends on HAS_DMA
  
  config SND_ATMEL_SOC_DMA
-       tristate
+       bool
         select SND_SOC_GENERIC_DMAENGINE_PCM
  
  config SND_ATMEL_SOC_SSC
diff --git a/sound/soc/atmel/Makefile b/sound/soc/atmel/Makefile
index 1f6890ed373826fbad9432df6e6c14eebb2924b7..c7d2989791be1a2db9f4100246d8414ecea67d68 100644
--- a/sound/soc/atmel/Makefile
+++ b/sound/soc/atmel/Makefile
@@ -6,8 +6,14 @@ snd-soc-atmel_ssc_dai-objs := atmel_ssc_dai.o
  snd-soc-atmel-i2s-objs := atmel-i2s.o
  snd-soc-mchp-i2s-mcc-objs := mchp-i2s-mcc.o
  
-obj-$(CONFIG_SND_ATMEL_SOC_PDC) += snd-soc-atmel-pcm-pdc.o
-obj-$(CONFIG_SND_ATMEL_SOC_DMA) += snd-soc-atmel-pcm-dma.o
+# pdc and dma need to both be built-in if any user of
+# ssc is built-in.
+ifdef CONFIG_SND_ATMEL_SOC_PDC
+obj-$(CONFIG_SND_ATMEL_SOC_SSC) += snd-soc-atmel-pcm-pdc.o
+endif
+ifdef CONFIG_SND_ATMEL_SOC_DMA
+obj-$(CONFIG_SND_ATMEL_SOC_SSC) += snd-soc-atmel-pcm-dma.o
+endif
  obj-$(CONFIG_SND_ATMEL_SOC_SSC) += snd-soc-atmel_ssc_dai.o
  obj-$(CONFIG_SND_ATMEL_SOC_I2S) += snd-soc-atmel-i2s.o
  obj-$(CONFIG_SND_MCHP_SOC_I2S_MCC) += snd-soc-mchp-i2s-mcc.o
diff --git a/sound/soc/codecs/hdmi-codec.c b/sound/soc/codecs/hdmi-codec.c
index 444cc4e3374e17eb8b0dc988664d0d04914b3025..f005751da2ccb39abd42adb029c0f5fa0e4e3eac 100644
--- a/sound/soc/codecs/hdmi-codec.c
+++ b/sound/soc/codecs/hdmi-codec.c
@@ -779,7 +779,17 @@ static int hdmi_of_xlate_dai_id(struct snd_soc_component *component,
         return ret;
  }
  
+static void hdmi_remove(struct snd_soc_component *component)
+{
+       struct hdmi_codec_priv *hcp = snd_soc_component_get_drvdata(component);
+
+       if (hcp->hcd.ops->hook_plugged_cb)
+               hcp->hcd.ops->hook_plugged_cb(component->dev->parent,
+                                             hcp->hcd.data, NULL, NULL);
+}
+
  static const struct snd_soc_component_driver hdmi_driver = {
+       .remove                 = hdmi_remove,
         .dapm_widgets           = hdmi_widgets,
         .num_dapm_widgets       = ARRAY_SIZE(hdmi_widgets),
         .of_xlate_dai_id        = hdmi_of_xlate_dai_id,
diff --git a/sound/soc/codecs/max98090.c b/sound/soc/codecs/max98090.c
index 5bc2c6411b33bfbd4747a6b836375039bfdc1885..032adc14562d24c54b4933c0c4818f0537ebb1f4 100644
--- a/sound/soc/codecs/max98090.c
+++ b/sound/soc/codecs/max98090.c
@@ -5,150 +5,24 @@
   * Copyright 2011-2012 Maxim Integrated Products
   */
  
-#include <linux/acpi.h>
-#include <linux/clk.h>
  #include <linux/delay.h>
  #include <linux/i2c.h>
  #include <linux/module.h>
-#include <linux/mutex.h>
  #include <linux/of.h>
  #include <linux/pm.h>
  #include <linux/pm_runtime.h>
  #include <linux/regmap.h>
  #include <linux/slab.h>
+#include <linux/acpi.h>
+#include <linux/clk.h>
  #include <sound/jack.h>
-#include <sound/max98090.h>
  #include <sound/pcm.h>
  #include <sound/pcm_params.h>
  #include <sound/soc.h>
  #include <sound/tlv.h>
+#include <sound/max98090.h>
  #include "max98090.h"
  
-static void max98090_shdn_save_locked(struct max98090_priv *max98090)
-{
-       int shdn = 0;
-
-       /* saved_shdn, saved_count, SHDN are protected by card->dapm_mutex */
-       regmap_read(max98090->regmap, M98090_REG_DEVICE_SHUTDOWN, &shdn);
-       max98090->saved_shdn |= shdn;
-       ++max98090->saved_count;
-
-       if (shdn)
-               regmap_write(max98090->regmap, M98090_REG_DEVICE_SHUTDOWN, 0x0);
-}
-
-static void max98090_shdn_restore_locked(struct max98090_priv *max98090)
-{
-       /* saved_shdn, saved_count, SHDN are protected by card->dapm_mutex */
-       if (--max98090->saved_count == 0) {
-               if (max98090->saved_shdn) {
-                       regmap_write(max98090->regmap,
-                                    M98090_REG_DEVICE_SHUTDOWN,
-                                    M98090_SHDNN_MASK);
-                       max98090->saved_shdn = 0;
-               }
-       }
-}
-
-static void max98090_shdn_save(struct max98090_priv *max98090)
-{
-       mutex_lock_nested(&max98090->component->card->dapm_mutex,
-                         SND_SOC_DAPM_CLASS_RUNTIME);
-       max98090_shdn_save_locked(max98090);
-}
-
-static void max98090_shdn_restore(struct max98090_priv *max98090)
-{
-       max98090_shdn_restore_locked(max98090);
-       mutex_unlock(&max98090->component->card->dapm_mutex);
-}
-
-static int max98090_put_volsw(struct snd_kcontrol *kcontrol,
-       struct snd_ctl_elem_value *ucontrol)
-{
-       struct snd_soc_component *component =
-               snd_soc_kcontrol_component(kcontrol);
-       struct max98090_priv *max98090 =
-               snd_soc_component_get_drvdata(component);
-       int ret;
-
-       max98090_shdn_save(max98090);
-       ret = snd_soc_put_volsw(kcontrol, ucontrol);
-       max98090_shdn_restore(max98090);
-
-       return ret;
-}
-
-static int max98090_dapm_put_enum_double(struct snd_kcontrol *kcontrol,
-       struct snd_ctl_elem_value *ucontrol)
-{
-       struct snd_soc_component *component =
-               snd_soc_dapm_kcontrol_component(kcontrol);
-       struct max98090_priv *max98090 =
-               snd_soc_component_get_drvdata(component);
-       int ret;
-
-       max98090_shdn_save(max98090);
-       ret = snd_soc_dapm_put_enum_double_locked(kcontrol, ucontrol);
-       max98090_shdn_restore(max98090);
-
-       return ret;
-}
-
-static int max98090_put_enum_double(struct snd_kcontrol *kcontrol,
-       struct snd_ctl_elem_value *ucontrol)
-{
-       struct snd_soc_component *component =
-               snd_soc_kcontrol_component(kcontrol);
-       struct max98090_priv *max98090 =
-               snd_soc_component_get_drvdata(component);
-       int ret;
-
-       max98090_shdn_save(max98090);
-       ret = snd_soc_put_enum_double(kcontrol, ucontrol);
-       max98090_shdn_restore(max98090);
-
-       return ret;
-}
-
-static int max98090_bytes_put(struct snd_kcontrol *kcontrol,
-       struct snd_ctl_elem_value *ucontrol)
-{
-       struct snd_soc_component *component =
-               snd_soc_kcontrol_component(kcontrol);
-       struct max98090_priv *max98090 =
-               snd_soc_component_get_drvdata(component);
-       int ret;
-
-       max98090_shdn_save(max98090);
-       ret = snd_soc_bytes_put(kcontrol, ucontrol);
-       max98090_shdn_restore(max98090);
-
-       return ret;
-}
-
-static int max98090_dapm_event(struct snd_soc_dapm_widget *w,
-       struct snd_kcontrol *kcontrol, int event)
-{
-       struct snd_soc_component *component =
-               snd_soc_dapm_to_component(w->dapm);
-       struct max98090_priv *max98090 =
-               snd_soc_component_get_drvdata(component);
-
-       switch (event) {
-       case SND_SOC_DAPM_PRE_PMU:
-       case SND_SOC_DAPM_PRE_PMD:
-               max98090_shdn_save_locked(max98090);
-               break;
-       case SND_SOC_DAPM_POST_PMU:
-       case SND_SOC_DAPM_POST_PMD:
-               max98090_shdn_restore_locked(max98090);
-               break;
-       }
-
-       return 0;
-}
-
  /* Allows for sparsely populated register maps */
  static const struct reg_default max98090_reg[] = {
         { 0x00, 0x00 }, /* 00 Software Reset */
@@ -632,13 +506,10 @@ static SOC_ENUM_SINGLE_DECL(max98090_adchp_enum,
                             max98090_pwr_perf_text);
  
  static const struct snd_kcontrol_new max98090_snd_controls[] = {
-       SOC_ENUM_EXT("MIC Bias VCM Bandgap", max98090_vcmbandgap_enum,
-               snd_soc_get_enum_double, max98090_put_enum_double),
+       SOC_ENUM("MIC Bias VCM Bandgap", max98090_vcmbandgap_enum),
  
-       SOC_SINGLE_EXT("DMIC MIC Comp Filter Config",
-               M98090_REG_DIGITAL_MIC_CONFIG,
-               M98090_DMIC_COMP_SHIFT, M98090_DMIC_COMP_NUM - 1, 0,
-               snd_soc_get_volsw, max98090_put_volsw),
+       SOC_SINGLE("DMIC MIC Comp Filter Config", M98090_REG_DIGITAL_MIC_CONFIG,
+               M98090_DMIC_COMP_SHIFT, M98090_DMIC_COMP_NUM - 1, 0),
  
         SOC_SINGLE_EXT_TLV("MIC1 Boost Volume",
                 M98090_REG_MIC1_INPUT_LEVEL, M98090_MIC_PA1EN_SHIFT,
@@ -693,34 +564,24 @@ static const struct snd_kcontrol_new max98090_snd_controls[] = {
                 M98090_AVR_SHIFT, M98090_AVR_NUM - 1, 1,
                 max98090_av_tlv),
  
-       SOC_ENUM_EXT("ADC Oversampling Rate", max98090_osr128_enum,
-               snd_soc_get_enum_double, max98090_put_enum_double),
-       SOC_SINGLE_EXT("ADC Quantizer Dither", M98090_REG_ADC_CONTROL,
-               M98090_ADCDITHER_SHIFT, M98090_ADCDITHER_NUM - 1, 0,
-               snd_soc_get_volsw, max98090_put_volsw),
-       SOC_ENUM_EXT("ADC High Performance Mode", max98090_adchp_enum,
-               snd_soc_get_enum_double, max98090_put_enum_double),
-
-       SOC_SINGLE_EXT("DAC Mono Mode", M98090_REG_IO_CONFIGURATION,
-               M98090_DMONO_SHIFT, M98090_DMONO_NUM - 1, 0,
-               snd_soc_get_volsw, max98090_put_volsw),
-       SOC_SINGLE_EXT("SDIN Mode", M98090_REG_IO_CONFIGURATION,
-               M98090_SDIEN_SHIFT, M98090_SDIEN_NUM - 1, 0,
-               snd_soc_get_volsw, max98090_put_volsw),
-       SOC_SINGLE_EXT("SDOUT Mode", M98090_REG_IO_CONFIGURATION,
-               M98090_SDOEN_SHIFT, M98090_SDOEN_NUM - 1, 0,
-               snd_soc_get_volsw, max98090_put_volsw),
-       SOC_SINGLE_EXT("SDOUT Hi-Z Mode", M98090_REG_IO_CONFIGURATION,
-               M98090_HIZOFF_SHIFT, M98090_HIZOFF_NUM - 1, 1,
-               snd_soc_get_volsw, max98090_put_volsw),
-       SOC_ENUM_EXT("Filter Mode", max98090_mode_enum,
-               snd_soc_get_enum_double, max98090_put_enum_double),
-       SOC_SINGLE_EXT("Record Path DC Blocking", M98090_REG_FILTER_CONFIG,
-               M98090_AHPF_SHIFT, M98090_AHPF_NUM - 1, 0,
-               snd_soc_get_volsw, max98090_put_volsw),
-       SOC_SINGLE_EXT("Playback Path DC Blocking", M98090_REG_FILTER_CONFIG,
-               M98090_DHPF_SHIFT, M98090_DHPF_NUM - 1, 0,
-               snd_soc_get_volsw, max98090_put_volsw),
+       SOC_ENUM("ADC Oversampling Rate", max98090_osr128_enum),
+       SOC_SINGLE("ADC Quantizer Dither", M98090_REG_ADC_CONTROL,
+               M98090_ADCDITHER_SHIFT, M98090_ADCDITHER_NUM - 1, 0),
+       SOC_ENUM("ADC High Performance Mode", max98090_adchp_enum),
+
+       SOC_SINGLE("DAC Mono Mode", M98090_REG_IO_CONFIGURATION,
+               M98090_DMONO_SHIFT, M98090_DMONO_NUM - 1, 0),
+       SOC_SINGLE("SDIN Mode", M98090_REG_IO_CONFIGURATION,
+               M98090_SDIEN_SHIFT, M98090_SDIEN_NUM - 1, 0),
+       SOC_SINGLE("SDOUT Mode", M98090_REG_IO_CONFIGURATION,
+               M98090_SDOEN_SHIFT, M98090_SDOEN_NUM - 1, 0),
+       SOC_SINGLE("SDOUT Hi-Z Mode", M98090_REG_IO_CONFIGURATION,
+               M98090_HIZOFF_SHIFT, M98090_HIZOFF_NUM - 1, 1),
+       SOC_ENUM("Filter Mode", max98090_mode_enum),
+       SOC_SINGLE("Record Path DC Blocking", M98090_REG_FILTER_CONFIG,
+               M98090_AHPF_SHIFT, M98090_AHPF_NUM - 1, 0),
+       SOC_SINGLE("Playback Path DC Blocking", M98090_REG_FILTER_CONFIG,
+               M98090_DHPF_SHIFT, M98090_DHPF_NUM - 1, 0),
         SOC_SINGLE_TLV("Digital BQ Volume", M98090_REG_ADC_BIQUAD_LEVEL,
                 M98090_AVBQ_SHIFT, M98090_AVBQ_NUM - 1, 1, max98090_dv_tlv),
         SOC_SINGLE_EXT_TLV("Digital Sidetone Volume",
@@ -733,17 +594,13 @@ static const struct snd_kcontrol_new max98090_snd_controls[] = {
         SOC_SINGLE_TLV("Digital Volume", M98090_REG_DAI_PLAYBACK_LEVEL,
                 M98090_DV_SHIFT, M98090_DV_NUM - 1, 1,
                 max98090_dv_tlv),
-       SND_SOC_BYTES_E("EQ Coefficients", M98090_REG_EQUALIZER_BASE, 105,
-               snd_soc_bytes_get, max98090_bytes_put),
-       SOC_SINGLE_EXT("Digital EQ 3 Band Switch", M98090_REG_DSP_FILTER_ENABLE,
-               M98090_EQ3BANDEN_SHIFT, M98090_EQ3BANDEN_NUM - 1, 0,
-               snd_soc_get_volsw, max98090_put_volsw),
-       SOC_SINGLE_EXT("Digital EQ 5 Band Switch", M98090_REG_DSP_FILTER_ENABLE,
-               M98090_EQ5BANDEN_SHIFT, M98090_EQ5BANDEN_NUM - 1, 0,
-               snd_soc_get_volsw, max98090_put_volsw),
-       SOC_SINGLE_EXT("Digital EQ 7 Band Switch", M98090_REG_DSP_FILTER_ENABLE,
-               M98090_EQ7BANDEN_SHIFT, M98090_EQ7BANDEN_NUM - 1, 0,
-               snd_soc_get_volsw, max98090_put_volsw),
+       SND_SOC_BYTES("EQ Coefficients", M98090_REG_EQUALIZER_BASE, 105),
+       SOC_SINGLE("Digital EQ 3 Band Switch", M98090_REG_DSP_FILTER_ENABLE,
+               M98090_EQ3BANDEN_SHIFT, M98090_EQ3BANDEN_NUM - 1, 0),
+       SOC_SINGLE("Digital EQ 5 Band Switch", M98090_REG_DSP_FILTER_ENABLE,
+               M98090_EQ5BANDEN_SHIFT, M98090_EQ5BANDEN_NUM - 1, 0),
+       SOC_SINGLE("Digital EQ 7 Band Switch", M98090_REG_DSP_FILTER_ENABLE,
+               M98090_EQ7BANDEN_SHIFT, M98090_EQ7BANDEN_NUM - 1, 0),
         SOC_SINGLE("Digital EQ Clipping Detection", M98090_REG_DAI_PLAYBACK_LEVEL_EQ,
                 M98090_EQCLPN_SHIFT, M98090_EQCLPN_NUM - 1,
                 1),
@@ -751,34 +608,25 @@ static const struct snd_kcontrol_new max98090_snd_controls[] = {
                 M98090_DVEQ_SHIFT, M98090_DVEQ_NUM - 1, 1,
                 max98090_dv_tlv),
  
-       SOC_SINGLE_EXT("ALC Enable", M98090_REG_DRC_TIMING,
-               M98090_DRCEN_SHIFT, M98090_DRCEN_NUM - 1, 0,
-               snd_soc_get_volsw, max98090_put_volsw),
-       SOC_ENUM_EXT("ALC Attack Time", max98090_drcatk_enum,
-               snd_soc_get_enum_double, max98090_put_enum_double),
-       SOC_ENUM_EXT("ALC Release Time", max98090_drcrls_enum,
-               snd_soc_get_enum_double, max98090_put_enum_double),
+       SOC_SINGLE("ALC Enable", M98090_REG_DRC_TIMING,
+               M98090_DRCEN_SHIFT, M98090_DRCEN_NUM - 1, 0),
+       SOC_ENUM("ALC Attack Time", max98090_drcatk_enum),
+       SOC_ENUM("ALC Release Time", max98090_drcrls_enum),
         SOC_SINGLE_TLV("ALC Make Up Volume", M98090_REG_DRC_GAIN,
                 M98090_DRCG_SHIFT, M98090_DRCG_NUM - 1, 0,
                 max98090_alcmakeup_tlv),
-       SOC_ENUM_EXT("ALC Compression Ratio", max98090_alccmp_enum,
-               snd_soc_get_enum_double, max98090_put_enum_double),
-       SOC_ENUM_EXT("ALC Expansion Ratio", max98090_drcexp_enum,
-               snd_soc_get_enum_double, max98090_put_enum_double),
-       SOC_SINGLE_EXT_TLV("ALC Compression Threshold Volume",
+       SOC_ENUM("ALC Compression Ratio", max98090_alccmp_enum),
+       SOC_ENUM("ALC Expansion Ratio", max98090_drcexp_enum),
+       SOC_SINGLE_TLV("ALC Compression Threshold Volume",
                 M98090_REG_DRC_COMPRESSOR, M98090_DRCTHC_SHIFT,
-               M98090_DRCTHC_NUM - 1, 1,
-               snd_soc_get_volsw, max98090_put_volsw, max98090_alccomp_tlv),
-       SOC_SINGLE_EXT_TLV("ALC Expansion Threshold Volume",
+               M98090_DRCTHC_NUM - 1, 1, max98090_alccomp_tlv),
+       SOC_SINGLE_TLV("ALC Expansion Threshold Volume",
                 M98090_REG_DRC_EXPANDER, M98090_DRCTHE_SHIFT,
-               M98090_DRCTHE_NUM - 1, 1,
-               snd_soc_get_volsw, max98090_put_volsw, max98090_drcexp_tlv),
+               M98090_DRCTHE_NUM - 1, 1, max98090_drcexp_tlv),
  
-       SOC_ENUM_EXT("DAC HP Playback Performance Mode",
-               max98090_dac_perfmode_enum,
-               snd_soc_get_enum_double, max98090_put_enum_double),
-       SOC_ENUM_EXT("DAC High Performance Mode", max98090_dachp_enum,
-               snd_soc_get_enum_double, max98090_put_enum_double),
+       SOC_ENUM("DAC HP Playback Performance Mode",
+               max98090_dac_perfmode_enum),
+       SOC_ENUM("DAC High Performance Mode", max98090_dachp_enum),
  
         SOC_SINGLE_TLV("Headphone Left Mixer Volume",
                 M98090_REG_HP_CONTROL, M98090_MIXHPLG_SHIFT,
@@ -836,12 +684,9 @@ static const struct snd_kcontrol_new max98090_snd_controls[] = {
         SOC_SINGLE("Volume Adjustment Smoothing", M98090_REG_LEVEL_CONTROL,
                 M98090_VSENN_SHIFT, M98090_VSENN_NUM - 1, 1),
  
-       SND_SOC_BYTES_E("Biquad Coefficients",
-               M98090_REG_RECORD_BIQUAD_BASE, 15,
-               snd_soc_bytes_get, max98090_bytes_put),
-       SOC_SINGLE_EXT("Biquad Switch", M98090_REG_DSP_FILTER_ENABLE,
-               M98090_ADCBQEN_SHIFT, M98090_ADCBQEN_NUM - 1, 0,
-               snd_soc_get_volsw, max98090_put_volsw),
+       SND_SOC_BYTES("Biquad Coefficients", M98090_REG_RECORD_BIQUAD_BASE, 15),
+       SOC_SINGLE("Biquad Switch", M98090_REG_DSP_FILTER_ENABLE,
+               M98090_ADCBQEN_SHIFT, M98090_ADCBQEN_NUM - 1, 0),
  };
  
  static const struct snd_kcontrol_new max98091_snd_controls[] = {
@@ -850,12 +695,10 @@ static const struct snd_kcontrol_new max98091_snd_controls[] = {
                 M98090_DMIC34_ZEROPAD_SHIFT,
                 M98090_DMIC34_ZEROPAD_NUM - 1, 0),
  
-       SOC_ENUM_EXT("Filter DMIC34 Mode", max98090_filter_dmic34mode_enum,
-               snd_soc_get_enum_double, max98090_put_enum_double),
-       SOC_SINGLE_EXT("DMIC34 DC Blocking", M98090_REG_FILTER_CONFIG,
+       SOC_ENUM("Filter DMIC34 Mode", max98090_filter_dmic34mode_enum),
+       SOC_SINGLE("DMIC34 DC Blocking", M98090_REG_FILTER_CONFIG,
                 M98090_FLT_DMIC34HPF_SHIFT,
-               M98090_FLT_DMIC34HPF_NUM - 1, 0,
-               snd_soc_get_volsw, max98090_put_volsw),
+               M98090_FLT_DMIC34HPF_NUM - 1, 0),
  
         SOC_SINGLE_TLV("DMIC3 Boost Volume", M98090_REG_DMIC3_VOLUME,
                 M98090_DMIC_AV3G_SHIFT, M98090_DMIC_AV3G_NUM - 1, 0,
@@ -873,9 +716,8 @@ static const struct snd_kcontrol_new max98091_snd_controls[] = {
  
         SND_SOC_BYTES("DMIC34 Biquad Coefficients",
                 M98090_REG_DMIC34_BIQUAD_BASE, 15),
-       SOC_SINGLE_EXT("DMIC34 Biquad Switch", M98090_REG_DSP_FILTER_ENABLE,
-               M98090_DMIC34BQEN_SHIFT, M98090_DMIC34BQEN_NUM - 1, 0,
-               snd_soc_get_volsw, max98090_put_volsw),
+       SOC_SINGLE("DMIC34 Biquad Switch", M98090_REG_DSP_FILTER_ENABLE,
+               M98090_DMIC34BQEN_SHIFT, M98090_DMIC34BQEN_NUM - 1, 0),
  
         SOC_SINGLE_TLV("DMIC34 BQ PreAttenuation Volume",
                 M98090_REG_DMIC34_BQ_PREATTEN, M98090_AV34BQ_SHIFT,
@@ -929,6 +771,19 @@ static int max98090_micinput_event(struct snd_soc_dapm_widget *w,
         return 0;
  }
  
+static int max98090_shdn_event(struct snd_soc_dapm_widget *w,
+                                struct snd_kcontrol *kcontrol, int event)
+{
+       struct snd_soc_component *component = snd_soc_dapm_to_component(w->dapm);
+       struct max98090_priv *max98090 = snd_soc_component_get_drvdata(component);
+
+       if (event & SND_SOC_DAPM_POST_PMU)
+               max98090->shdn_pending = true;
+
+       return 0;
+
+}
+
  static const char *mic1_mux_text[] = { "IN12", "IN56" };
  
  static SOC_ENUM_SINGLE_DECL(mic1_mux_enum,
@@ -1029,14 +884,10 @@ static SOC_ENUM_SINGLE_DECL(ltenr_mux_enum,
                             lten_mux_text);
  
  static const struct snd_kcontrol_new max98090_ltenl_mux =
-       SOC_DAPM_ENUM_EXT("LTENL Mux", ltenl_mux_enum,
-                         snd_soc_dapm_get_enum_double,
-                         max98090_dapm_put_enum_double);
+       SOC_DAPM_ENUM("LTENL Mux", ltenl_mux_enum);
  
  static const struct snd_kcontrol_new max98090_ltenr_mux =
-       SOC_DAPM_ENUM_EXT("LTENR Mux", ltenr_mux_enum,
-                         snd_soc_dapm_get_enum_double,
-                         max98090_dapm_put_enum_double);
+       SOC_DAPM_ENUM("LTENR Mux", ltenr_mux_enum);
  
  static const char *lben_mux_text[] = { "Normal", "Loopback" };
  
@@ -1051,14 +902,10 @@ static SOC_ENUM_SINGLE_DECL(lbenr_mux_enum,
                             lben_mux_text);
  
  static const struct snd_kcontrol_new max98090_lbenl_mux =
-       SOC_DAPM_ENUM_EXT("LBENL Mux", lbenl_mux_enum,
-                         snd_soc_dapm_get_enum_double,
-                         max98090_dapm_put_enum_double);
+       SOC_DAPM_ENUM("LBENL Mux", lbenl_mux_enum);
  
  static const struct snd_kcontrol_new max98090_lbenr_mux =
-       SOC_DAPM_ENUM_EXT("LBENR Mux", lbenr_mux_enum,
-                         snd_soc_dapm_get_enum_double,
-                         max98090_dapm_put_enum_double);
+       SOC_DAPM_ENUM("LBENR Mux", lbenr_mux_enum);
  
  static const char *stenl_mux_text[] = { "Normal", "Sidetone Left" };
  
@@ -1225,25 +1072,21 @@ static const struct snd_soc_dapm_widget max98090_dapm_widgets[] = {
         SND_SOC_DAPM_INPUT("IN56"),
  
         SND_SOC_DAPM_SUPPLY("MICBIAS", M98090_REG_INPUT_ENABLE,
-               M98090_MBEN_SHIFT, 0, max98090_dapm_event,
-               SND_SOC_DAPM_PRE_POST_PMU | SND_SOC_DAPM_PRE_POST_PMD),
+               M98090_MBEN_SHIFT, 0, NULL, 0),
         SND_SOC_DAPM_SUPPLY("SHDN", M98090_REG_DEVICE_SHUTDOWN,
                 M98090_SHDNN_SHIFT, 0, NULL, 0),
         SND_SOC_DAPM_SUPPLY("SDIEN", M98090_REG_IO_CONFIGURATION,
-               M98090_SDIEN_SHIFT, 0, max98090_dapm_event,
-               SND_SOC_DAPM_PRE_POST_PMU | SND_SOC_DAPM_PRE_POST_PMD),
+               M98090_SDIEN_SHIFT, 0, NULL, 0),
         SND_SOC_DAPM_SUPPLY("SDOEN", M98090_REG_IO_CONFIGURATION,
-               M98090_SDOEN_SHIFT, 0, max98090_dapm_event,
-               SND_SOC_DAPM_PRE_POST_PMU | SND_SOC_DAPM_PRE_POST_PMD),
+               M98090_SDOEN_SHIFT, 0, NULL, 0),
         SND_SOC_DAPM_SUPPLY("DMICL_ENA", M98090_REG_DIGITAL_MIC_ENABLE,
-               M98090_DIGMICL_SHIFT, 0, max98090_dapm_event,
-               SND_SOC_DAPM_PRE_POST_PMU | SND_SOC_DAPM_PRE_POST_PMD),
+                M98090_DIGMICL_SHIFT, 0, max98090_shdn_event,
+                       SND_SOC_DAPM_POST_PMU),
         SND_SOC_DAPM_SUPPLY("DMICR_ENA", M98090_REG_DIGITAL_MIC_ENABLE,
-               M98090_DIGMICR_SHIFT, 0, max98090_dapm_event,
-               SND_SOC_DAPM_PRE_POST_PMU | SND_SOC_DAPM_PRE_POST_PMD),
+                M98090_DIGMICR_SHIFT, 0, max98090_shdn_event,
+                        SND_SOC_DAPM_POST_PMU),
         SND_SOC_DAPM_SUPPLY("AHPF", M98090_REG_FILTER_CONFIG,
-               M98090_AHPF_SHIFT, 0, max98090_dapm_event,
-               SND_SOC_DAPM_PRE_POST_PMU | SND_SOC_DAPM_PRE_POST_PMD),
+               M98090_AHPF_SHIFT, 0, NULL, 0),
  
  /*
   * Note: Sysclk and misc power supplies are taken care of by SHDN
@@ -1273,12 +1116,10 @@ static const struct snd_soc_dapm_widget max98090_dapm_widgets[] = {
                 &max98090_lineb_mixer_controls[0],
                 ARRAY_SIZE(max98090_lineb_mixer_controls)),
  
-       SND_SOC_DAPM_PGA_E("LINEA Input", M98090_REG_INPUT_ENABLE,
-               M98090_LINEAEN_SHIFT, 0, NULL, 0, max98090_dapm_event,
-               SND_SOC_DAPM_PRE_POST_PMU | SND_SOC_DAPM_PRE_POST_PMD),
-       SND_SOC_DAPM_PGA_E("LINEB Input", M98090_REG_INPUT_ENABLE,
-               M98090_LINEBEN_SHIFT, 0, NULL, 0, max98090_dapm_event,
-               SND_SOC_DAPM_PRE_POST_PMU | SND_SOC_DAPM_PRE_POST_PMD),
+       SND_SOC_DAPM_PGA("LINEA Input", M98090_REG_INPUT_ENABLE,
+               M98090_LINEAEN_SHIFT, 0, NULL, 0),
+       SND_SOC_DAPM_PGA("LINEB Input", M98090_REG_INPUT_ENABLE,
+               M98090_LINEBEN_SHIFT, 0, NULL, 0),
  
         SND_SOC_DAPM_MIXER("Left ADC Mixer", SND_SOC_NOPM, 0, 0,
                 &max98090_left_adc_mixer_controls[0],
@@ -1289,11 +1130,11 @@ static const struct snd_soc_dapm_widget max98090_dapm_widgets[] = {
                 ARRAY_SIZE(max98090_right_adc_mixer_controls)),
  
         SND_SOC_DAPM_ADC_E("ADCL", NULL, M98090_REG_INPUT_ENABLE,
-               M98090_ADLEN_SHIFT, 0, max98090_dapm_event,
-               SND_SOC_DAPM_PRE_POST_PMU | SND_SOC_DAPM_PRE_POST_PMD),
+               M98090_ADLEN_SHIFT, 0, max98090_shdn_event,
+               SND_SOC_DAPM_POST_PMU),
         SND_SOC_DAPM_ADC_E("ADCR", NULL, M98090_REG_INPUT_ENABLE,
-               M98090_ADREN_SHIFT, 0, max98090_dapm_event,
-               SND_SOC_DAPM_PRE_POST_PMU | SND_SOC_DAPM_PRE_POST_PMD),
+               M98090_ADREN_SHIFT, 0, max98090_shdn_event,
+               SND_SOC_DAPM_POST_PMU),
  
         SND_SOC_DAPM_AIF_OUT("AIFOUTL", "HiFi Capture", 0,
                 SND_SOC_NOPM, 0, 0),
@@ -1321,12 +1162,10 @@ static const struct snd_soc_dapm_widget max98090_dapm_widgets[] = {
         SND_SOC_DAPM_AIF_IN("AIFINL", "HiFi Playback", 0, SND_SOC_NOPM, 0, 0),
         SND_SOC_DAPM_AIF_IN("AIFINR", "HiFi Playback", 1, SND_SOC_NOPM, 0, 0),
  
-       SND_SOC_DAPM_DAC_E("DACL", NULL, M98090_REG_OUTPUT_ENABLE,
-               M98090_DALEN_SHIFT, 0, max98090_dapm_event,
-               SND_SOC_DAPM_PRE_POST_PMU | SND_SOC_DAPM_PRE_POST_PMD),
-       SND_SOC_DAPM_DAC_E("DACR", NULL, M98090_REG_OUTPUT_ENABLE,
-               M98090_DAREN_SHIFT, 0, max98090_dapm_event,
-               SND_SOC_DAPM_PRE_POST_PMU | SND_SOC_DAPM_PRE_POST_PMD),
+       SND_SOC_DAPM_DAC("DACL", NULL, M98090_REG_OUTPUT_ENABLE,
+               M98090_DALEN_SHIFT, 0),
+       SND_SOC_DAPM_DAC("DACR", NULL, M98090_REG_OUTPUT_ENABLE,
+               M98090_DAREN_SHIFT, 0),
  
         SND_SOC_DAPM_MIXER("Left Headphone Mixer", SND_SOC_NOPM, 0, 0,
                 &max98090_left_hp_mixer_controls[0],
@@ -1361,26 +1200,20 @@ static const struct snd_soc_dapm_widget max98090_dapm_widgets[] = {
         SND_SOC_DAPM_MUX("MIXHPRSEL Mux", SND_SOC_NOPM, 0, 0,
                 &max98090_mixhprsel_mux),
  
-       SND_SOC_DAPM_PGA_E("HP Left Out", M98090_REG_OUTPUT_ENABLE,
-               M98090_HPLEN_SHIFT, 0, NULL, 0, max98090_dapm_event,
-               SND_SOC_DAPM_PRE_POST_PMU | SND_SOC_DAPM_PRE_POST_PMD),
-       SND_SOC_DAPM_PGA_E("HP Right Out", M98090_REG_OUTPUT_ENABLE,
-               M98090_HPREN_SHIFT, 0, NULL, 0, max98090_dapm_event,
-               SND_SOC_DAPM_PRE_POST_PMU | SND_SOC_DAPM_PRE_POST_PMD),
-
-       SND_SOC_DAPM_PGA_E("SPK Left Out", M98090_REG_OUTPUT_ENABLE,
-               M98090_SPLEN_SHIFT, 0, NULL, 0, max98090_dapm_event,
-               SND_SOC_DAPM_PRE_POST_PMU | SND_SOC_DAPM_PRE_POST_PMD),
-       SND_SOC_DAPM_PGA_E("SPK Right Out", M98090_REG_OUTPUT_ENABLE,
-               M98090_SPREN_SHIFT, 0, NULL, 0, max98090_dapm_event,
-               SND_SOC_DAPM_PRE_POST_PMU | SND_SOC_DAPM_PRE_POST_PMD),
-
-       SND_SOC_DAPM_PGA_E("RCV Left Out", M98090_REG_OUTPUT_ENABLE,
-               M98090_RCVLEN_SHIFT, 0, NULL, 0, max98090_dapm_event,
-               SND_SOC_DAPM_PRE_POST_PMU | SND_SOC_DAPM_PRE_POST_PMD),
-       SND_SOC_DAPM_PGA_E("RCV Right Out", M98090_REG_OUTPUT_ENABLE,
-               M98090_RCVREN_SHIFT, 0, NULL, 0, max98090_dapm_event,
-               SND_SOC_DAPM_PRE_POST_PMU | SND_SOC_DAPM_PRE_POST_PMD),
+       SND_SOC_DAPM_PGA("HP Left Out", M98090_REG_OUTPUT_ENABLE,
+               M98090_HPLEN_SHIFT, 0, NULL, 0),
+       SND_SOC_DAPM_PGA("HP Right Out", M98090_REG_OUTPUT_ENABLE,
+               M98090_HPREN_SHIFT, 0, NULL, 0),
+
+       SND_SOC_DAPM_PGA("SPK Left Out", M98090_REG_OUTPUT_ENABLE,
+               M98090_SPLEN_SHIFT, 0, NULL, 0),
+       SND_SOC_DAPM_PGA("SPK Right Out", M98090_REG_OUTPUT_ENABLE,
+               M98090_SPREN_SHIFT, 0, NULL, 0),
+
+       SND_SOC_DAPM_PGA("RCV Left Out", M98090_REG_OUTPUT_ENABLE,
+               M98090_RCVLEN_SHIFT, 0, NULL, 0),
+       SND_SOC_DAPM_PGA("RCV Right Out", M98090_REG_OUTPUT_ENABLE,
+               M98090_RCVREN_SHIFT, 0, NULL, 0),
  
         SND_SOC_DAPM_OUTPUT("HPL"),
         SND_SOC_DAPM_OUTPUT("HPR"),
@@ -1395,11 +1228,9 @@ static const struct snd_soc_dapm_widget max98091_dapm_widgets[] = {
         SND_SOC_DAPM_INPUT("DMIC4"),
  
         SND_SOC_DAPM_SUPPLY("DMIC3_ENA", M98090_REG_DIGITAL_MIC_ENABLE,
-               M98090_DIGMIC3_SHIFT, 0, max98090_dapm_event,
-               SND_SOC_DAPM_PRE_POST_PMU | SND_SOC_DAPM_PRE_POST_PMD),
+                M98090_DIGMIC3_SHIFT, 0, NULL, 0),
         SND_SOC_DAPM_SUPPLY("DMIC4_ENA", M98090_REG_DIGITAL_MIC_ENABLE,
-               M98090_DIGMIC4_SHIFT, 0, max98090_dapm_event,
-               SND_SOC_DAPM_PRE_POST_PMU | SND_SOC_DAPM_PRE_POST_PMD),
+                M98090_DIGMIC4_SHIFT, 0, NULL, 0),
  };
  
  static const struct snd_soc_dapm_route max98090_dapm_routes[] = {
@@ -1670,11 +1501,6 @@ static void max98090_configure_bclk(struct snd_soc_component *component)
                 return;
         }
  
-       /*
-        * Master mode: no need to save and restore SHDN for the following
-        * sensitive registers.
-        */
-
         /* Check for supported PCLK to LRCLK ratios */
         for (i = 0; i < ARRAY_SIZE(pclk_rates); i++) {
                 if ((pclk_rates[i] == max98090->sysclk) &&
@@ -1761,14 +1587,12 @@ static int max98090_dai_set_fmt(struct snd_soc_dai *codec_dai,
                 switch (fmt & SND_SOC_DAIFMT_MASTER_MASK) {
                 case SND_SOC_DAIFMT_CBS_CFS:
                         /* Set to slave mode PLL - MAS mode off */
-                       max98090_shdn_save(max98090);
                         snd_soc_component_write(component,
                                 M98090_REG_CLOCK_RATIO_NI_MSB, 0x00);
                         snd_soc_component_write(component,
                                 M98090_REG_CLOCK_RATIO_NI_LSB, 0x00);
                         snd_soc_component_update_bits(component, M98090_REG_CLOCK_MODE,
                                 M98090_USE_M1_MASK, 0);
-                       max98090_shdn_restore(max98090);
                         max98090->master = false;
                         break;
                 case SND_SOC_DAIFMT_CBM_CFM:
@@ -1794,9 +1618,7 @@ static int max98090_dai_set_fmt(struct snd_soc_dai *codec_dai,
                         dev_err(component->dev, "DAI clock mode unsupported");
                         return -EINVAL;
                 }
-               max98090_shdn_save(max98090);
                 snd_soc_component_write(component, M98090_REG_MASTER_MODE, regval);
-               max98090_shdn_restore(max98090);
  
                 regval = 0;
                 switch (fmt & SND_SOC_DAIFMT_FORMAT_MASK) {
@@ -1841,10 +1663,8 @@ static int max98090_dai_set_fmt(struct snd_soc_dai *codec_dai,
                 if (max98090->tdm_slots > 1)
                         regval ^= M98090_BCI_MASK;
  
-               max98090_shdn_save(max98090);
                 snd_soc_component_write(component,
                         M98090_REG_INTERFACE_FORMAT, regval);
-               max98090_shdn_restore(max98090);
         }
  
         return 0;
@@ -1856,7 +1676,6 @@ static int max98090_set_tdm_slot(struct snd_soc_dai *codec_dai,
         struct snd_soc_component *component = codec_dai->component;
         struct max98090_priv *max98090 = snd_soc_component_get_drvdata(component);
         struct max98090_cdata *cdata;
-
         cdata = &max98090->dai[0];
  
         if (slots < 0 || slots > 4)
@@ -1866,7 +1685,6 @@ static int max98090_set_tdm_slot(struct snd_soc_dai *codec_dai,
         max98090->tdm_width = slot_width;
  
         if (max98090->tdm_slots > 1) {
-               max98090_shdn_save(max98090);
                 /* SLOTL SLOTR SLOTDLY */
                 snd_soc_component_write(component, M98090_REG_TDM_FORMAT,
                         0 << M98090_TDM_SLOTL_SHIFT |
@@ -1877,7 +1695,6 @@ static int max98090_set_tdm_slot(struct snd_soc_dai *codec_dai,
                 snd_soc_component_update_bits(component, M98090_REG_TDM_CONTROL,
                         M98090_TDM_MASK,
                         M98090_TDM_MASK);
-               max98090_shdn_restore(max98090);
         }
  
         /*
@@ -2077,7 +1894,6 @@ static int max98090_configure_dmic(struct max98090_priv *max98090,
         dmic_freq = dmic_table[pclk_index].settings[micclk_index].freq;
         dmic_comp = dmic_table[pclk_index].settings[micclk_index].comp[i];
  
-       max98090_shdn_save(max98090);
         regmap_update_bits(max98090->regmap, M98090_REG_DIGITAL_MIC_ENABLE,
                            M98090_MICCLK_MASK,
                            micclk_index << M98090_MICCLK_SHIFT);
@@ -2086,7 +1902,6 @@ static int max98090_configure_dmic(struct max98090_priv *max98090,
                            M98090_DMIC_COMP_MASK | M98090_DMIC_FREQ_MASK,
                            dmic_comp << M98090_DMIC_COMP_SHIFT |
                            dmic_freq << M98090_DMIC_FREQ_SHIFT);
-       max98090_shdn_restore(max98090);
  
         return 0;
  }
@@ -2123,10 +1938,8 @@ static int max98090_dai_hw_params(struct snd_pcm_substream *substream,
  
         switch (params_width(params)) {
         case 16:
-               max98090_shdn_save(max98090);
                 snd_soc_component_update_bits(component, M98090_REG_INTERFACE_FORMAT,
                         M98090_WS_MASK, 0);
-               max98090_shdn_restore(max98090);
                 break;
         default:
                 return -EINVAL;
@@ -2137,7 +1950,6 @@ static int max98090_dai_hw_params(struct snd_pcm_substream *substream,
  
         cdata->rate = max98090->lrclk;
  
-       max98090_shdn_save(max98090);
         /* Update filter mode */
         if (max98090->lrclk < 24000)
                 snd_soc_component_update_bits(component, M98090_REG_FILTER_CONFIG,
@@ -2153,7 +1965,6 @@ static int max98090_dai_hw_params(struct snd_pcm_substream *substream,
         else
                 snd_soc_component_update_bits(component, M98090_REG_FILTER_CONFIG,
                         M98090_DHF_MASK, M98090_DHF_MASK);
-       max98090_shdn_restore(max98090);
  
         max98090_configure_dmic(max98090, max98090->dmic_freq, max98090->pclk,
                                 max98090->lrclk);
@@ -2184,7 +1995,6 @@ static int max98090_dai_set_sysclk(struct snd_soc_dai *dai,
          *               0x02 (when master clk is 20MHz to 40MHz)..
          *               0x03 (when master clk is 40MHz to 60MHz)..
          */
-       max98090_shdn_save(max98090);
         if ((freq >= 10000000) && (freq <= 20000000)) {
                 snd_soc_component_write(component, M98090_REG_SYSTEM_CLOCK,
                         M98090_PSCLK_DIV1);
@@ -2199,10 +2009,8 @@ static int max98090_dai_set_sysclk(struct snd_soc_dai *dai,
                 max98090->pclk = freq >> 2;
         } else {
                 dev_err(component->dev, "Invalid master clock frequency\n");
-               max98090_shdn_restore(max98090);
                 return -EINVAL;
         }
-       max98090_shdn_restore(max98090);
  
         max98090->sysclk = freq;
  
@@ -2314,12 +2122,10 @@ static void max98090_pll_work(struct max98090_priv *max98090)
          */
  
         /* Toggle shutdown OFF then ON */
-       mutex_lock(&component->card->dapm_mutex);
         snd_soc_component_update_bits(component, M98090_REG_DEVICE_SHUTDOWN,
                             M98090_SHDNN_MASK, 0);
         snd_soc_component_update_bits(component, M98090_REG_DEVICE_SHUTDOWN,
                             M98090_SHDNN_MASK, M98090_SHDNN_MASK);
-       mutex_unlock(&component->card->dapm_mutex);
  
         for (i = 0; i < 10; ++i) {
                 /* Give PLL time to lock */
@@ -2642,12 +2448,7 @@ static int max98090_probe(struct snd_soc_component *component)
          */
         snd_soc_component_read32(component, M98090_REG_DEVICE_STATUS);
  
-       /*
-        * SHDN should be 0 at the point, no need to save/restore for the
-        * following registers.
-        *
-        * High Performance is default
-        */
+       /* High Performance is default */
         snd_soc_component_update_bits(component, M98090_REG_DAC_CONTROL,
                 M98090_DACHP_MASK,
                 1 << M98090_DACHP_SHIFT);
@@ -2658,12 +2459,7 @@ static int max98090_probe(struct snd_soc_component *component)
                 M98090_ADCHP_MASK,
                 1 << M98090_ADCHP_SHIFT);
  
-       /*
-        * SHDN should be 0 at the point, no need to save/restore for the
-        * following registers.
-        *
-        * Turn on VCM bandgap reference
-        */
+       /* Turn on VCM bandgap reference */
         snd_soc_component_write(component, M98090_REG_BIAS_CONTROL,
                 M98090_VCM_MODE_MASK);
  
@@ -2695,9 +2491,25 @@ static void max98090_remove(struct snd_soc_component *component)
         max98090->component = NULL;
  }
  
+static void max98090_seq_notifier(struct snd_soc_component *component,
+       enum snd_soc_dapm_type event, int subseq)
+{
+       struct max98090_priv *max98090 = snd_soc_component_get_drvdata(component);
+
+       if (max98090->shdn_pending) {
+               snd_soc_component_update_bits(component, M98090_REG_DEVICE_SHUTDOWN,
+                               M98090_SHDNN_MASK, 0);
+               msleep(40);
+               snd_soc_component_update_bits(component, M98090_REG_DEVICE_SHUTDOWN,
+                               M98090_SHDNN_MASK, M98090_SHDNN_MASK);
+               max98090->shdn_pending = false;
+       }
+}
+
  static const struct snd_soc_component_driver soc_component_dev_max98090 = {
         .probe                  = max98090_probe,
         .remove                 = max98090_remove,
+       .seq_notifier           = max98090_seq_notifier,
         .set_bias_level         = max98090_set_bias_level,
         .idle_bias_on           = 1,
         .use_pmdown_time        = 1,
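
The max98090.c hunks above replace the per-write SHDN save/restore with a deferred scheme: a widget event only sets shdn_pending, and the sequence notifier performs one shutdown off/on cycle when the flag is set. A minimal userspace sketch of that pattern follows. The struct, the function names, the bit position and the toggle counter are all hypothetical stand-ins, not the driver's definitions, and the 40 ms sleep is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SHDNN_MASK 0x80u /* hypothetical bit position for the SHDN bit */

struct codec {
	uint8_t shdn_reg;     /* stand-in for M98090_REG_DEVICE_SHUTDOWN */
	bool shdn_pending;    /* set by widget events, consumed by the notifier */
	unsigned int toggles; /* counts off/on cycles, for illustration only */
};

/* Modeled on max98090_shdn_event(): only record that a toggle is needed. */
static void widget_powered_up(struct codec *c)
{
	c->shdn_pending = true;
}

/* Modeled on max98090_seq_notifier(): perform at most one off/on cycle. */
static void seq_notifier(struct codec *c)
{
	if (!c->shdn_pending)
		return;

	c->shdn_reg &= (uint8_t)~SHDNN_MASK; /* enter shutdown */
	/* the driver sleeps 40 ms at this point; omitted in this sketch */
	c->shdn_reg |= SHDNN_MASK;           /* leave shutdown */
	c->toggles++;
	c->shdn_pending = false;
}
```

In this sketch, several widget power-ups followed by one notifier call produce a single toggle, and a further notifier call with no pending flag leaves the register untouched.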
diff --git a/sound/soc/codecs/max98090.h b/sound/soc/codecs/max98090.h
index 0a31708b7df7f7ad19083d360e4b23fb347c610c..a197114b0dad3fc648952b8f272fb8b87c1a77c1 100644
--- a/sound/soc/codecs/max98090.h
+++ b/sound/soc/codecs/max98090.h
@@ -1539,8 +1539,7 @@ struct max98090_priv {
         unsigned int pa2en;
         unsigned int sidetone;
         bool master;
-       int saved_count;
-       int saved_shdn;
+       bool shdn_pending;
  };
  
  int max98090_mic_detect(struct snd_soc_component *component,
diff --git a/sound/soc/fsl/fsl_sai.c b/sound/soc/fsl/fsl_sai.c
index 8c3ea73009721f8fb89c8e4d54ca47b8b0fb50a8..9d436b0c5718a2531049e7e8ba7d025b6f1fe4c6 100644
--- a/sound/soc/fsl/fsl_sai.c
+++ b/sound/soc/fsl/fsl_sai.c
@@ -1020,12 +1020,24 @@ static int fsl_sai_probe(struct platform_device *pdev)
         ret = devm_snd_soc_register_component(&pdev->dev, &fsl_component,
                         &fsl_sai_dai, 1);
         if (ret)
-               return ret;
+               goto err_pm_disable;
  
-       if (sai->soc_data->use_imx_pcm)
-               return imx_pcm_dma_init(pdev, IMX_SAI_DMABUF_SIZE);
-       else
-               return devm_snd_dmaengine_pcm_register(&pdev->dev, NULL, 0);
+       if (sai->soc_data->use_imx_pcm) {
+               ret = imx_pcm_dma_init(pdev, IMX_SAI_DMABUF_SIZE);
+               if (ret)
+                       goto err_pm_disable;
+       } else {
+               ret = devm_snd_dmaengine_pcm_register(&pdev->dev, NULL, 0);
+               if (ret)
+                       goto err_pm_disable;
+       }
+
+       return ret;
+
+err_pm_disable:
+       pm_runtime_disable(&pdev->dev);
+
+       return ret;
  }
  
  static int fsl_sai_remove(struct platform_device *pdev)
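
The fsl_sai.c hunk above routes every failure after component registration through a single err_pm_disable label instead of returning directly. A minimal sketch of that goto-cleanup shape follows. It assumes runtime PM was enabled earlier in probe, which this excerpt does not show, and register_step(), struct dev_state and the flag arguments are hypothetical stand-ins rather than kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>

struct dev_state {
	bool pm_enabled; /* stand-in for runtime PM being enabled */
};

/* Hypothetical registration step that fails on request. */
static int register_step(bool fail)
{
	return fail ? -1 : 0;
}

static int probe(struct dev_state *d, bool fail_component, bool fail_pcm)
{
	int ret;

	d->pm_enabled = true; /* assumed to happen earlier in the real probe */

	ret = register_step(fail_component);
	if (ret)
		goto err_pm_disable;

	ret = register_step(fail_pcm);
	if (ret)
		goto err_pm_disable;

	return 0;

err_pm_disable:
	d->pm_enabled = false; /* undo the enable on every failure path */
	return ret;
}
```

Whichever step fails, the caller gets the error code back and the enable has been undone.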
diff --git a/sound/soc/soc-dapm.c b/sound/soc/soc-dapm.c
index bc20ad9abf8bddc119381f1ddbf010b766f7fc8d..9b130561d562f8bec717b9511ec08618a987aa4e 100644
--- a/sound/soc/soc-dapm.c
+++ b/sound/soc/soc-dapm.c
@@ -3441,8 +3441,17 @@ int snd_soc_dapm_get_enum_double(struct snd_kcontrol *kcontrol,
  }
  EXPORT_SYMBOL_GPL(snd_soc_dapm_get_enum_double);
  
-static int __snd_soc_dapm_put_enum_double(struct snd_kcontrol *kcontrol,
-       struct snd_ctl_elem_value *ucontrol, int locked)
+/**
+ * snd_soc_dapm_put_enum_double - dapm enumerated double mixer set callback
+ * @kcontrol: mixer control
+ * @ucontrol: control element information
+ *
+ * Callback to set the value of a dapm enumerated double mixer control.
+ *
+ * Returns 0 for success.
+ */
+int snd_soc_dapm_put_enum_double(struct snd_kcontrol *kcontrol,
+       struct snd_ctl_elem_value *ucontrol)
  {
         struct snd_soc_dapm_context *dapm = snd_soc_dapm_kcontrol_dapm(kcontrol);
         struct snd_soc_card *card = dapm->card;
@@ -3465,9 +3474,7 @@ static int __snd_soc_dapm_put_enum_double(struct snd_kcontrol *kcontrol,
                 mask |= e->mask << e->shift_r;
         }
  
-       if (!locked)
-               mutex_lock_nested(&card->dapm_mutex,
-                                 SND_SOC_DAPM_CLASS_RUNTIME);
+       mutex_lock_nested(&card->dapm_mutex, SND_SOC_DAPM_CLASS_RUNTIME);
  
         change = dapm_kcontrol_set_value(kcontrol, val);
  
@@ -3489,50 +3496,15 @@ static int __snd_soc_dapm_put_enum_double(struct snd_kcontrol *kcontrol,
                 card->update = NULL;
         }
  
-       if (!locked)
-               mutex_unlock(&card->dapm_mutex);
+       mutex_unlock(&card->dapm_mutex);
  
         if (ret > 0)
                 soc_dpcm_runtime_update(card);
  
         return change;
  }
-
-/**
- * snd_soc_dapm_put_enum_double - dapm enumerated double mixer set callback
- * @kcontrol: mixer control
- * @ucontrol: control element information
- *
- * Callback to set the value of a dapm enumerated double mixer control.
- *
- * Returns 0 for success.
- */
-int snd_soc_dapm_put_enum_double(struct snd_kcontrol *kcontrol,
-       struct snd_ctl_elem_value *ucontrol)
-{
-       return __snd_soc_dapm_put_enum_double(kcontrol, ucontrol, 0);
-}
  EXPORT_SYMBOL_GPL(snd_soc_dapm_put_enum_double);
  
-/**
- * snd_soc_dapm_put_enum_double_locked - dapm enumerated double mixer set
- * callback
- * @kcontrol: mixer control
- * @ucontrol: control element information
- *
- * Callback to set the value of a dapm enumerated double mixer control.
- * Must acquire dapm_mutex before calling the function.
- *
- * Returns 0 for success.
- */
-int snd_soc_dapm_put_enum_double_locked(struct snd_kcontrol *kcontrol,
-       struct snd_ctl_elem_value *ucontrol)
-{
-       dapm_assert_locked(snd_soc_dapm_kcontrol_dapm(kcontrol));
-       return __snd_soc_dapm_put_enum_double(kcontrol, ucontrol, 1);
-}
-EXPORT_SYMBOL_GPL(snd_soc_dapm_put_enum_double_locked);
-
  /**
   * snd_soc_dapm_info_pin_switch - Info for a pin switch
   *
@@ -3916,9 +3888,6 @@ snd_soc_dai_link_event_pre_pmu(struct snd_soc_dapm_widget *w,
         runtime->rate = params_rate(params);
  
  out:
-       if (ret < 0)
-               kfree(runtime);
-
         kfree(params);
         return ret;
  }
diff --git a/sound/soc/sof/intel/hda-codec.c b/sound/soc/sof/intel/hda-codec.c
index 9106ab8dac6f62d6efc0251c5b26770ff1e856fb..ff45075ef7203809acb355a6f6c6b028e9daf94f 100644
--- a/sound/soc/sof/intel/hda-codec.c
+++ b/sound/soc/sof/intel/hda-codec.c
@@ -174,8 +174,10 @@ void hda_codec_i915_display_power(struct snd_sof_dev *sdev, bool enable)
  {
         struct hdac_bus *bus = sof_to_bus(sdev);
  
-       dev_dbg(bus->dev, "Turning i915 HDAC power %d\n", enable);
-       snd_hdac_display_power(bus, HDA_CODEC_IDX_CONTROLLER, enable);
+       if (HDA_IDISP_CODEC(bus->codec_mask)) {
+               dev_dbg(bus->dev, "Turning i915 HDAC power %d\n", enable);
+               snd_hdac_display_power(bus, HDA_CODEC_IDX_CONTROLLER, enable);
+       }
  }
  EXPORT_SYMBOL_NS(hda_codec_i915_display_power, SND_SOC_SOF_HDA_AUDIO_CODEC_I915);
  
@@ -189,7 +191,8 @@ int hda_codec_i915_init(struct snd_sof_dev *sdev)
         if (ret < 0)
                 return ret;
  
-       hda_codec_i915_display_power(sdev, true);
+       /* codec_mask not yet known, power up for probe */
+       snd_hdac_display_power(bus, HDA_CODEC_IDX_CONTROLLER, true);
  
         return 0;
  }
@@ -200,7 +203,8 @@ int hda_codec_i915_exit(struct snd_sof_dev *sdev)
         struct hdac_bus *bus = sof_to_bus(sdev);
         int ret;
  
-       hda_codec_i915_display_power(sdev, false);
+       /* power down unconditionally */
+       snd_hdac_display_power(bus, HDA_CODEC_IDX_CONTROLLER, false);
  
         ret = snd_hdac_i915_exit(bus);
  
diff --git a/sound/soc/sof/intel/hda-dsp.c b/sound/soc/sof/intel/hda-dsp.c
index 4a4d318f97ffa5698217c0282bb24321f94d4b4b..0848b79967a9f426e4c7d6124906eb51fe25bed9 100644
--- a/sound/soc/sof/intel/hda-dsp.c
+++ b/sound/soc/sof/intel/hda-dsp.c
@@ -428,6 +428,9 @@ static int hda_suspend(struct snd_sof_dev *sdev, bool runtime_suspend)
                 return ret;
         }
  
+       /* display codec can be powered off after link reset */
+       hda_codec_i915_display_power(sdev, false);
+
         return 0;
  }
  
@@ -439,6 +442,9 @@ static int hda_resume(struct snd_sof_dev *sdev, bool runtime_resume)
  #endif
         int ret;
  
+       /* display codec must be powered before link reset */
+       hda_codec_i915_display_power(sdev, true);
+
         /*
          * clear TCSEL to clear playback on some HD Audio
          * codecs. PCI TCSEL is defined in the Intel manuals.
@@ -482,6 +488,8 @@ int hda_dsp_resume(struct snd_sof_dev *sdev)
         struct pci_dev *pci = to_pci_dev(sdev->dev);
  
         if (sdev->s0_suspend) {
+               hda_codec_i915_display_power(sdev, true);
+
                 /* restore L1SEN bit */
                 if (hda->l1_support_changed)
                         snd_sof_dsp_update_bits(sdev, HDA_DSP_HDA_BAR,
@@ -531,6 +539,9 @@ int hda_dsp_suspend(struct snd_sof_dev *sdev)
         int ret;
  
         if (sdev->s0_suspend) {
+               /* we can't keep a wakeref to display driver at suspend */
+               hda_codec_i915_display_power(sdev, false);
+
                 /* enable L1SEN to make sure the system can enter S0Ix */
                 hda->l1_support_changed =
                         snd_sof_dsp_update_bits(sdev, HDA_DSP_HDA_BAR,
diff --git a/sound/soc/sof/intel/hda.c b/sound/soc/sof/intel/hda.c
index 65b86dd044f101b0255c9be51d01961469e37407..25946a1c28224c69e2925a4b477a854769e47d62 100644
--- a/sound/soc/sof/intel/hda.c
+++ b/sound/soc/sof/intel/hda.c
@@ -286,6 +286,13 @@ static int hda_init(struct snd_sof_dev *sdev)
         /* HDA base */
         sdev->bar[HDA_DSP_HDA_BAR] = bus->remap_addr;
  
+       /* init i915 and HDMI codecs */
+       ret = hda_codec_i915_init(sdev);
+       if (ret < 0) {
+               dev_err(sdev->dev, "error: init i915 and HDMI codec failed\n");
+               return ret;
+       }
+
         /* get controller capabilities */
         ret = hda_dsp_ctrl_get_caps(sdev);
         if (ret < 0)
@@ -353,15 +360,6 @@ static int hda_init_caps(struct snd_sof_dev *sdev)
         if (bus->ppcap)
                 dev_dbg(sdev->dev, "PP capability, will probe DSP later.\n");
  
-#if IS_ENABLED(CONFIG_SND_SOC_SOF_HDA)
-       /* init i915 and HDMI codecs */
-       ret = hda_codec_i915_init(sdev);
-       if (ret < 0) {
-               dev_err(sdev->dev, "error: init i915 and HDMI codec failed\n");
-               return ret;
-       }
-#endif
-
         /* Init HDA controller after i915 init */
         ret = hda_dsp_ctrl_init_chip(sdev, true);
         if (ret < 0) {
@@ -381,7 +379,7 @@ static int hda_init_caps(struct snd_sof_dev *sdev)
         hda_codec_probe_bus(sdev, hda_codec_use_common_hdmi);
  
         if (!HDA_IDISP_CODEC(bus->codec_mask))
-               hda_codec_i915_display_power(sdev, false);
+               hda_codec_i915_exit(sdev);
  
         /*
          * we are done probing so decrement link counts
@@ -611,6 +609,7 @@ free_streams:
         iounmap(sdev->bar[HDA_DSP_BAR]);
  hdac_bus_unmap:
         iounmap(bus->remap_addr);
+       hda_codec_i915_exit(sdev);
  err:
         return ret;
  }
diff --git a/sound/soc/sunxi/sun8i-codec.c b/sound/soc/sunxi/sun8i-codec.c
index 55798bc8eae29d27c9b85d548b4fb8d8ef355ecc..686561df8e13b30ef1672d7b7aa3c80926e7e479 100644
--- a/sound/soc/sunxi/sun8i-codec.c
+++ b/sound/soc/sunxi/sun8i-codec.c
@@ -80,6 +80,7 @@
  
  #define SUN8I_SYS_SR_CTRL_AIF1_FS_MASK         GENMASK(15, 12)
  #define SUN8I_SYS_SR_CTRL_AIF2_FS_MASK         GENMASK(11, 8)
+#define SUN8I_AIF1CLK_CTRL_AIF1_DATA_FMT_MASK  GENMASK(3, 2)
  #define SUN8I_AIF1CLK_CTRL_AIF1_WORD_SIZ_MASK  GENMASK(5, 4)
  #define SUN8I_AIF1CLK_CTRL_AIF1_LRCK_DIV_MASK  GENMASK(8, 6)
  #define SUN8I_AIF1CLK_CTRL_AIF1_BCLK_DIV_MASK  GENMASK(12, 9)
@@ -241,7 +242,7 @@ static int sun8i_set_fmt(struct snd_soc_dai *dai, unsigned int fmt)
                 return -EINVAL;
         }
         regmap_update_bits(scodec->regmap, SUN8I_AIF1CLK_CTRL,
-                          BIT(SUN8I_AIF1CLK_CTRL_AIF1_DATA_FMT),
+                          SUN8I_AIF1CLK_CTRL_AIF1_DATA_FMT_MASK,
                            value << SUN8I_AIF1CLK_CTRL_AIF1_DATA_FMT);
  
         return 0;
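
The sun8i-codec.c hunk above swaps a one-bit mask, BIT(SUN8I_AIF1CLK_CTRL_AIF1_DATA_FMT), for the two-bit GENMASK(3, 2). The sketch below shows why the mask width matters for a read-modify-write update. It assumes the field shift is 2, inferred from GENMASK(3, 2). BIT(), GENMASK() and update_bits() are local userspace stand-ins written for this example, not the kernel macros or regmap_update_bits() itself.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n)        (1u << (n))
#define GENMASK(h, l) (((~0u) << (l)) & (~0u >> (31 - (h))))

#define DATA_FMT_SHIFT 2u /* assumed from GENMASK(3, 2) */
#define DATA_FMT_MASK  GENMASK(3, 2)

/* Read-modify-write: only bits set in mask may change. */
static uint32_t update_bits(uint32_t reg, uint32_t mask, uint32_t val)
{
	return (reg & ~mask) | (val & mask);
}

/* Extract the two-bit data format field. */
static uint32_t data_fmt(uint32_t reg)
{
	return (reg & DATA_FMT_MASK) >> DATA_FMT_SHIFT;
}
```

With the one-bit mask, writing format value 2 into a cleared register leaves the field at 0, and clearing a field that holds 3 leaves bit 3 set, so the field reads 2. With the two-bit mask both updates land as intended.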
diff --git a/sound/usb/clock.c b/sound/usb/clock.c
index 018b1ecb5404655e327b35ca10c420f965383d3d..a48313dfa967a15836785b9961ffb7df346e5f7a 100644
--- a/sound/usb/clock.c
+++ b/sound/usb/clock.c
@@ -151,8 +151,34 @@ static int uac_clock_selector_set_val(struct snd_usb_audio *chip, int selector_i
         return ret;
  }
  
+/*
+ * Assume the clock is valid if clock source supports only one single sample
+ * rate, the terminal is connected directly to it (there is no clock selector)
+ * and clock type is internal. This is to deal with some Denon DJ controllers
+ * that always report that the clock is invalid.
+ */
+static bool uac_clock_source_is_valid_quirk(struct snd_usb_audio *chip,
+                                           struct audioformat *fmt,
+                                           int source_id)
+{
+       if (fmt->protocol == UAC_VERSION_2) {
+               struct uac_clock_source_descriptor *cs_desc =
+                       snd_usb_find_clock_source(chip->ctrl_intf, source_id);
+
+               if (!cs_desc)
+                       return false;
+
+               return (fmt->nr_rates == 1 &&
+                       (fmt->clock & 0xff) == cs_desc->bClockID &&
+                       (cs_desc->bmAttributes & 0x3) !=
+                               UAC_CLOCK_SOURCE_TYPE_EXT);
+       }
+
+       return false;
+}
+
  static bool uac_clock_source_is_valid(struct snd_usb_audio *chip,
-                                     int protocol,
+                                     struct audioformat *fmt,
                                       int source_id)
  {
         int err;
@@ -160,7 +186,7 @@ static bool uac_clock_source_is_valid(struct snd_usb_audio *chip,
         struct usb_device *dev = chip->dev;
         u32 bmControls;
  
-       if (protocol == UAC_VERSION_3) {
+       if (fmt->protocol == UAC_VERSION_3) {
                 struct uac3_clock_source_descriptor *cs_desc =
                         snd_usb_find_clock_source_v3(chip->ctrl_intf, source_id);
  
@@ -194,10 +220,14 @@ static bool uac_clock_source_is_valid(struct snd_usb_audio *chip,
                 return false;
         }
  
-       return data ? true :  false;
+       if (data)
+               return true;
+       else
+               return uac_clock_source_is_valid_quirk(chip, fmt, source_id);
  }
  
-static int __uac_clock_find_source(struct snd_usb_audio *chip, int entity_id,
+static int __uac_clock_find_source(struct snd_usb_audio *chip,
+                                  struct audioformat *fmt, int entity_id,
                                    unsigned long *visited, bool validate)
  {
         struct uac_clock_source_descriptor *source;
@@ -217,7 +247,7 @@ static int __uac_clock_find_source(struct snd_usb_audio *chip, int entity_id,
         source = snd_usb_find_clock_source(chip->ctrl_intf, entity_id);
         if (source) {
                 entity_id = source->bClockID;
-               if (validate && !uac_clock_source_is_valid(chip, UAC_VERSION_2,
+               if (validate && !uac_clock_source_is_valid(chip, fmt,
                                                                 entity_id)) {
                         usb_audio_err(chip,
                                 "clock source %d is not valid, cannot use\n",
@@ -248,8 +278,9 @@ static int __uac_clock_find_source(struct snd_usb_audio *chip, int entity_id,
                 }
  
                 cur = ret;
-               ret = __uac_clock_find_source(chip, selector->baCSourceID[ret - 1],
-                                              visited, validate);
+               ret = __uac_clock_find_source(chip, fmt,
+                                             selector->baCSourceID[ret - 1],
+                                             visited, validate);
                 if (!validate || ret > 0 || !chip->autoclock)
                         return ret;
  
@@ -260,8 +291,9 @@ static int __uac_clock_find_source(struct snd_usb_audio *chip, int entity_id,
                         if (i == cur)
                                 continue;
  
-                       ret = __uac_clock_find_source(chip, selector->baCSourceID[i - 1],
-                               visited, true);
+                       ret = __uac_clock_find_source(chip, fmt,
+                                                     selector->baCSourceID[i - 1],
+                                                     visited, true);
                         if (ret < 0)
                                 continue;
  
@@ -281,14 +313,16 @@ static int __uac_clock_find_source(struct snd_usb_audio *chip, int entity_id,
         /* FIXME: multipliers only act as pass-thru element for now */
         multiplier = snd_usb_find_clock_multiplier(chip->ctrl_intf, entity_id);
         if (multiplier)
-               return __uac_clock_find_source(chip, multiplier->bCSourceID,
-                                               visited, validate);
+               return __uac_clock_find_source(chip, fmt,
+                                              multiplier->bCSourceID,
+                                              visited, validate);
  
         return -EINVAL;
  }
  
-static int __uac3_clock_find_source(struct snd_usb_audio *chip, int entity_id,
-                                  unsigned long *visited, bool validate)
+static int __uac3_clock_find_source(struct snd_usb_audio *chip,
+                                   struct audioformat *fmt, int entity_id,
+                                   unsigned long *visited, bool validate)
  {
         struct uac3_clock_source_descriptor *source;
         struct uac3_clock_selector_descriptor *selector;
@@ -307,7 +341,7 @@ static int __uac3_clock_find_source(struct snd_usb_audio *chip, int entity_id,
         source = snd_usb_find_clock_source_v3(chip->ctrl_intf, entity_id);
         if (source) {
                 entity_id = source->bClockID;
-               if (validate && !uac_clock_source_is_valid(chip, UAC_VERSION_3,
+               if (validate && !uac_clock_source_is_valid(chip, fmt,
                                                                 entity_id)) {
                         usb_audio_err(chip,
                                 "clock source %d is not valid, cannot use\n",
@@ -338,7 +372,8 @@ static int __uac3_clock_find_source(struct snd_usb_audio *chip, int entity_id,
                 }
  
                 cur = ret;
-               ret = __uac3_clock_find_source(chip, selector->baCSourceID[ret - 1],
+               ret = __uac3_clock_find_source(chip, fmt,
+                                              selector->baCSourceID[ret - 1],
                                                visited, validate);
                 if (!validate || ret > 0 || !chip->autoclock)
                         return ret;
@@ -350,8 +385,9 @@ static int __uac3_clock_find_source(struct snd_usb_audio *chip, int entity_id,
                         if (i == cur)
                                 continue;
  
-                       ret = __uac3_clock_find_source(chip, selector->baCSourceID[i - 1],
-                               visited, true);
+                       ret = __uac3_clock_find_source(chip, fmt,
+                                                      selector->baCSourceID[i - 1],
+                                                      visited, true);
                         if (ret < 0)
                                 continue;
  
@@ -372,7 +408,8 @@ static int __uac3_clock_find_source(struct snd_usb_audio *chip, int entity_id,
         multiplier = snd_usb_find_clock_multiplier_v3(chip->ctrl_intf,
                                                       entity_id);
         if (multiplier)
-               return __uac3_clock_find_source(chip, multiplier->bCSourceID,
+               return __uac3_clock_find_source(chip, fmt,
+                                               multiplier->bCSourceID,
                                                 visited, validate);
  
         return -EINVAL;
@@ -389,18 +426,18 @@ static int __uac3_clock_find_source(struct snd_usb_audio *chip, int entity_id,
   *
   * Returns the clock source UnitID (>=0) on success, or an error.
   */
-int snd_usb_clock_find_source(struct snd_usb_audio *chip, int protocol,
-                             int entity_id, bool validate)
+int snd_usb_clock_find_source(struct snd_usb_audio *chip,
+                             struct audioformat *fmt, bool validate)
  {
         DECLARE_BITMAP(visited, 256);
         memset(visited, 0, sizeof(visited));
  
-       switch (protocol) {
+       switch (fmt->protocol) {
         case UAC_VERSION_2:
-               return __uac_clock_find_source(chip, entity_id, visited,
+               return __uac_clock_find_source(chip, fmt, fmt->clock, visited,
                                                validate);
         case UAC_VERSION_3:
-               return __uac3_clock_find_source(chip, entity_id, visited,
+               return __uac3_clock_find_source(chip, fmt, fmt->clock, visited,
                                                validate);
         default:
                 return -EINVAL;
@@ -501,8 +538,7 @@ static int set_sample_rate_v2v3(struct snd_usb_audio *chip, int iface,
          * automatic clock selection if the current clock is not
          * valid.
          */
-       clock = snd_usb_clock_find_source(chip, fmt->protocol,
-                                         fmt->clock, true);
+       clock = snd_usb_clock_find_source(chip, fmt, true);
         if (clock < 0) {
                 /* We did not find a valid clock, but that might be
                  * because the current sample rate does not match an
@@ -510,8 +546,7 @@ static int set_sample_rate_v2v3(struct snd_usb_audio *chip, int iface,
                  * and we will do another validation after setting the
                  * rate.
                  */
-               clock = snd_usb_clock_find_source(chip, fmt->protocol,
-                                                 fmt->clock, false);
+               clock = snd_usb_clock_find_source(chip, fmt, false);
                 if (clock < 0)
                         return clock;
         }
@@ -577,7 +612,7 @@ static int set_sample_rate_v2v3(struct snd_usb_audio *chip, int iface,
  
  validation:
         /* validate clock after rate change */
-       if (!uac_clock_source_is_valid(chip, fmt->protocol, clock))
+       if (!uac_clock_source_is_valid(chip, fmt, clock))
                 return -ENXIO;
         return 0;
  }
diff --git a/sound/usb/clock.h b/sound/usb/clock.h
index 076e31b79ee0ff69a137d8367af1041d12686e66..68df0fbe09d0073c681dd521a8164ab15e518d7d 100644
--- a/sound/usb/clock.h
+++ b/sound/usb/clock.h
@@ -6,7 +6,7 @@ int snd_usb_init_sample_rate(struct snd_usb_audio *chip, int iface,
                              struct usb_host_interface *alts,
                              struct audioformat *fmt, int rate);
  
-int snd_usb_clock_find_source(struct snd_usb_audio *chip, int protocol,
-                            int entity_id, bool validate);
+int snd_usb_clock_find_source(struct snd_usb_audio *chip,
+                             struct audioformat *fmt, bool validate);
  
  #endif /* __USBAUDIO_CLOCK_H */
diff --git a/sound/usb/format.c b/sound/usb/format.c
index 9260136e4c9bb77dd3b79562c842e921ad469c4c..9f5cb4ed3a0c4466b24f59b3d59beff216c62947 100644
--- a/sound/usb/format.c
+++ b/sound/usb/format.c
@@ -151,6 +151,19 @@ static u64 parse_audio_format_i_type(struct snd_usb_audio *chip,
         return pcm_formats;
  }
  
+static int set_fixed_rate(struct audioformat *fp, int rate, int rate_bits)
+{
+       kfree(fp->rate_table);
+       fp->rate_table = kmalloc(sizeof(int), GFP_KERNEL);
+       if (!fp->rate_table)
+               return -ENOMEM;
+       fp->nr_rates = 1;
+       fp->rate_min = rate;
+       fp->rate_max = rate;
+       fp->rates = rate_bits;
+       fp->rate_table[0] = rate;
+       return 0;
+}
  
  /*
   * parse the format descriptor and stores the possible sample rates
@@ -223,6 +236,14 @@ static int parse_audio_format_rates_v1(struct snd_usb_audio *chip, struct audiof
                 fp->rate_min = combine_triple(&fmt[offset + 1]);
                 fp->rate_max = combine_triple(&fmt[offset + 4]);
         }
+
+       /* Jabra Evolve 65 headset */
+       if (chip->usb_id == USB_ID(0x0b0e, 0x030b)) {
+               /* only 48kHz for playback while keeping 16kHz for capture */
+               if (fp->nr_rates != 1)
+                       return set_fixed_rate(fp, 48000, SNDRV_PCM_RATE_48000);
+       }
+
         return 0;
  }
  
@@ -299,17 +320,7 @@ static int line6_parse_audio_format_rates_quirk(struct snd_usb_audio *chip,
         case USB_ID(0x0e41, 0x4248): /* Line6 Helix >= fw 2.82 */
         case USB_ID(0x0e41, 0x4249): /* Line6 Helix Rack >= fw 2.82 */
         case USB_ID(0x0e41, 0x424a): /* Line6 Helix LT >= fw 2.82 */
-               /* supported rates: 48Khz */
-               kfree(fp->rate_table);
-               fp->rate_table = kmalloc(sizeof(int), GFP_KERNEL);
-               if (!fp->rate_table)
-                       return -ENOMEM;
-               fp->nr_rates = 1;
-               fp->rate_min = 48000;
-               fp->rate_max = 48000;
-               fp->rates = SNDRV_PCM_RATE_48000;
-               fp->rate_table[0] = 48000;
-               return 0;
+               return set_fixed_rate(fp, 48000, SNDRV_PCM_RATE_48000);
         }
  
         return -ENODEV;
@@ -325,8 +336,7 @@ static int parse_audio_format_rates_v2v3(struct snd_usb_audio *chip,
         struct usb_device *dev = chip->dev;
         unsigned char tmp[2], *data;
         int nr_triplets, data_size, ret = 0, ret_l6;
-       int clock = snd_usb_clock_find_source(chip, fp->protocol,
-                                             fp->clock, false);
+       int clock = snd_usb_clock_find_source(chip, fp, false);
  
         if (clock < 0) {
                 dev_err(&dev->dev,
diff --git a/sound/usb/mixer.c b/sound/usb/mixer.c
index d659fdb475e2bc373941df884b0404364cfb534b..81b2db0edd5f43bcbd8406f533c6732da7e84326 100644
--- a/sound/usb/mixer.c
+++ b/sound/usb/mixer.c
@@ -897,6 +897,15 @@ static int parse_term_proc_unit(struct mixer_build *state,
         return 0;
  }
  
+static int parse_term_effect_unit(struct mixer_build *state,
+                                 struct usb_audio_term *term,
+                                 void *p1, int id)
+{
+       term->type = UAC3_EFFECT_UNIT << 16; /* virtual type */
+       term->id = id;
+       return 0;
+}
+
  static int parse_term_uac2_clock_source(struct mixer_build *state,
                                         struct usb_audio_term *term,
                                         void *p1, int id)
@@ -981,8 +990,7 @@ static int __check_input_term(struct mixer_build *state, int id,
                                                     UAC3_PROCESSING_UNIT);
                 case PTYPE(UAC_VERSION_2, UAC2_EFFECT_UNIT):
                 case PTYPE(UAC_VERSION_3, UAC3_EFFECT_UNIT):
-                       return parse_term_proc_unit(state, term, p1, id,
-                                                   UAC3_EFFECT_UNIT);
+                       return parse_term_effect_unit(state, term, p1, id);
                 case PTYPE(UAC_VERSION_1, UAC1_EXTENSION_UNIT):
                 case PTYPE(UAC_VERSION_2, UAC2_EXTENSION_UNIT_V2):
                 case PTYPE(UAC_VERSION_3, UAC3_EXTENSION_UNIT):
diff --git a/sound/usb/quirks.c b/sound/usb/quirks.c
index 3a5242e383b24ad21652f98aa3396567417043ca..7f558f4b452045f07700f0dc95fea50746eb600b 100644
--- a/sound/usb/quirks.c
+++ b/sound/usb/quirks.c
@@ -1440,6 +1440,7 @@ bool snd_usb_get_sample_rate_quirk(struct snd_usb_audio *chip)
         case USB_ID(0x1395, 0x740a): /* Sennheiser DECT */
         case USB_ID(0x1901, 0x0191): /* GE B850V3 CP2114 audio interface */
         case USB_ID(0x21b4, 0x0081): /* AudioQuest DragonFly */
+       case USB_ID(0x2912, 0x30c8): /* Audioengine D1 */
                 return true;
         }
  
diff --git a/tools/arch/arm64/include/uapi/asm/kvm.h b/tools/arch/arm64/include/uapi/asm/kvm.h
index 820e5751ada71ab1ce0e08b90e026d4a9231dd36..ba85bb23f06017b4ed401da32167c057d526b1d0 100644
--- a/tools/arch/arm64/include/uapi/asm/kvm.h
+++ b/tools/arch/arm64/include/uapi/asm/kvm.h
@@ -220,10 +220,18 @@ struct kvm_vcpu_events {
  #define KVM_REG_ARM_PTIMER_CVAL                ARM64_SYS_REG(3, 3, 14, 2, 2)
  #define KVM_REG_ARM_PTIMER_CNT         ARM64_SYS_REG(3, 3, 14, 0, 1)
  
-/* EL0 Virtual Timer Registers */
+/*
+ * EL0 Virtual Timer Registers
+ *
+ * WARNING:
+ *      KVM_REG_ARM_TIMER_CVAL and KVM_REG_ARM_TIMER_CNT are not defined
+ *      with the appropriate register encodings.  Their values have been
+ *      accidentally swapped.  As this is set API, the definitions here
+ *      must be used, rather than ones derived from the encodings.
+ */
  #define KVM_REG_ARM_TIMER_CTL          ARM64_SYS_REG(3, 3, 14, 3, 1)
-#define KVM_REG_ARM_TIMER_CNT          ARM64_SYS_REG(3, 3, 14, 3, 2)
  #define KVM_REG_ARM_TIMER_CVAL         ARM64_SYS_REG(3, 3, 14, 0, 2)
+#define KVM_REG_ARM_TIMER_CNT          ARM64_SYS_REG(3, 3, 14, 3, 2)
  
  /* KVM-as-firmware specific pseudo-registers */
  #define KVM_REG_ARM_FW                 (0x0014 << KVM_REG_ARM_COPROC_SHIFT)
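The WARNING comment in the hunk above can be checked with a little arithmetic. The following is a minimal sketch, not kernel code: sysreg_fields() is a hypothetical helper that packs only the Op0/Op1/CRn/CRm/Op2 fields, with shift values assumed from the arm64 KVM uapi header, and the architectural encodings of CNTVCT_EL0 and CNTV_CVAL_EL0 are assumed from the Arm architecture manual rather than taken from this diff.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: packs only the Op0/Op1/CRn/CRm/Op2 fields, using
 * shift values assumed from the arm64 KVM uapi header. The real
 * ARM64_SYS_REG() macro additionally ORs in architecture and size bits.
 */
static uint16_t sysreg_fields(unsigned op0, unsigned op1, unsigned crn,
			      unsigned crm, unsigned op2)
{
	return (uint16_t)((op0 << 14) | (op1 << 11) | (crn << 7) |
			  (crm << 3) | op2);
}

/* Field values exactly as they appear in the header above. */
static uint16_t kvm_timer_cnt(void)  { return sysreg_fields(3, 3, 14, 3, 2); }
static uint16_t kvm_timer_cval(void) { return sysreg_fields(3, 3, 14, 0, 2); }

/* Architectural encodings (assumption, not stated in this diff). */
static uint16_t arch_cntvct(void)    { return sysreg_fields(3, 3, 14, 0, 2); }
static uint16_t arch_cntv_cval(void) { return sysreg_fields(3, 3, 14, 3, 2); }
```

Under these assumptions, kvm_timer_cnt() equals arch_cntv_cval() and kvm_timer_cval() equals arch_cntvct(), which is the swap the comment warns about. Since the values are established API, userspace has to use the header definitions as they stand.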
diff --git a/tools/arch/arm64/include/uapi/asm/unistd.h b/tools/arch/arm64/include/uapi/asm/unistd.h
index 4703d218663a2ad81e7c8d4fd0749bed8199ef4f..f83a70e07df85ca5029a1e91cde93b8e0dd9fb7e 100644
--- a/tools/arch/arm64/include/uapi/asm/unistd.h
+++ b/tools/arch/arm64/include/uapi/asm/unistd.h
@@ -19,5 +19,6 @@
  #define __ARCH_WANT_NEW_STAT
  #define __ARCH_WANT_SET_GET_RLIMIT
  #define __ARCH_WANT_TIME32_SYSCALLS
+#define __ARCH_WANT_SYS_CLONE3
  
  #include <asm-generic/unistd.h>
diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h
index e9b62498fe75a3f3fce3692678e5ff28bbb28880..f3327cb56edfe163d1a8fc0a45b89fb324573243 100644
--- a/tools/arch/x86/include/asm/cpufeatures.h
+++ b/tools/arch/x86/include/asm/cpufeatures.h
@@ -220,6 +220,7 @@
  #define X86_FEATURE_ZEN                        ( 7*32+28) /* "" CPU is AMD family 0x17 (Zen) */
  #define X86_FEATURE_L1TF_PTEINV                ( 7*32+29) /* "" L1TF workaround PTE inversion */
  #define X86_FEATURE_IBRS_ENHANCED      ( 7*32+30) /* Enhanced IBRS */
+#define X86_FEATURE_MSR_IA32_FEAT_CTL  ( 7*32+31) /* "" MSR IA32_FEAT_CTL configured */
  
  /* Virtualization flags: Linux defined, word 8 */
  #define X86_FEATURE_TPR_SHADOW         ( 8*32+ 0) /* Intel TPR Shadow */
@@ -357,6 +358,7 @@
  /* Intel-defined CPU features, CPUID level 0x00000007:0 (EDX), word 18 */
  #define X86_FEATURE_AVX512_4VNNIW      (18*32+ 2) /* AVX-512 Neural Network Instructions */
  #define X86_FEATURE_AVX512_4FMAPS      (18*32+ 3) /* AVX-512 Multiply Accumulation Single precision */
+#define X86_FEATURE_FSRM               (18*32+ 4) /* Fast Short Rep Mov */
  #define X86_FEATURE_AVX512_VP2INTERSECT (18*32+ 8) /* AVX-512 Intersect for D/Q */
  #define X86_FEATURE_MD_CLEAR           (18*32+10) /* VERW clears CPU buffers */
  #define X86_FEATURE_TSX_FORCE_ABORT    (18*32+13) /* "" TSX_FORCE_ABORT */
diff --git a/tools/arch/x86/include/asm/disabled-features.h b/tools/arch/x86/include/asm/disabled-features.h
index 8e1d0bb463611026f80589660c7ea16c850fecf1..4ea8584682f9982662e34db1f9df77377f4a3662 100644
--- a/tools/arch/x86/include/asm/disabled-features.h
+++ b/tools/arch/x86/include/asm/disabled-features.h
@@ -10,12 +10,6 @@
   * cpu_feature_enabled().
   */
  
-#ifdef CONFIG_X86_INTEL_MPX
-# define DISABLE_MPX   0
-#else
-# define DISABLE_MPX   (1<<(X86_FEATURE_MPX & 31))
-#endif
-
  #ifdef CONFIG_X86_SMAP
  # define DISABLE_SMAP  0
  #else
@@ -74,7 +68,7 @@
  #define DISABLED_MASK6 0
  #define DISABLED_MASK7 (DISABLE_PTI)
  #define DISABLED_MASK8 0
-#define DISABLED_MASK9 (DISABLE_MPX|DISABLE_SMAP)
+#define DISABLED_MASK9 (DISABLE_SMAP)
  #define DISABLED_MASK10        0
  #define DISABLED_MASK11        0
  #define DISABLED_MASK12        0
diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h
index ebe1685e92dda2bfd6795b45a92924de8a8f9451..d5e517d1c3ddc5c9ac6e594500d87524168b143d 100644
--- a/tools/arch/x86/include/asm/msr-index.h
+++ b/tools/arch/x86/include/asm/msr-index.h
@@ -512,6 +512,8 @@
  #define MSR_K7_HWCR                    0xc0010015
  #define MSR_K7_HWCR_SMMLOCK_BIT                0
  #define MSR_K7_HWCR_SMMLOCK            BIT_ULL(MSR_K7_HWCR_SMMLOCK_BIT)
+#define MSR_K7_HWCR_IRPERF_EN_BIT      30
+#define MSR_K7_HWCR_IRPERF_EN          BIT_ULL(MSR_K7_HWCR_IRPERF_EN_BIT)
  #define MSR_K7_FID_VID_CTL             0xc0010041
  #define MSR_K7_FID_VID_STATUS          0xc0010042
  
diff --git a/tools/arch/x86/include/uapi/asm/kvm.h b/tools/arch/x86/include/uapi/asm/kvm.h
index 503d3f42da1676791d2c4f4a70bfad35743daf4c..3f3f780c8c6500e1a1ea52bc0585af93699572fe 100644
--- a/tools/arch/x86/include/uapi/asm/kvm.h
+++ b/tools/arch/x86/include/uapi/asm/kvm.h
@@ -390,6 +390,7 @@ struct kvm_sync_regs {
  #define KVM_STATE_NESTED_GUEST_MODE    0x00000001
  #define KVM_STATE_NESTED_RUN_PENDING   0x00000002
  #define KVM_STATE_NESTED_EVMCS         0x00000004
+#define KVM_STATE_NESTED_MTF_PENDING   0x00000008
  
  #define KVM_STATE_NESTED_SMM_GUEST_MODE        0x00000001
  #define KVM_STATE_NESTED_SMM_VMXON     0x00000002
diff --git a/tools/bootconfig/include/linux/memblock.h b/tools/bootconfig/include/linux/memblock.h
new file mode 100644
index 0000000..7862f21
--- /dev/null
+++ b/tools/bootconfig/include/linux/memblock.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _XBC_LINUX_MEMBLOCK_H
+#define _XBC_LINUX_MEMBLOCK_H
+
+#include <stdlib.h>
+
+#define __pa(addr)     (addr)
+#define SMP_CACHE_BYTES        0
+#define memblock_alloc(size, align)    malloc(size)
+#define memblock_free(paddr, size)     free(paddr)
+
+#endif
diff --git a/tools/bootconfig/include/linux/printk.h b/tools/bootconfig/include/linux/printk.h
index 017bcd6912a5cbc0c830e001abed2f9e17f2863e..036e667596eb17c383e46cdc331961b9ed3c2ab2 100644
--- a/tools/bootconfig/include/linux/printk.h
+++ b/tools/bootconfig/include/linux/printk.h
@@ -4,10 +4,7 @@
  
  #include <stdio.h>
  
-/* controllable printf */
-extern int pr_output;
-#define printk(fmt, ...)       \
-       (pr_output ? printf(fmt, __VA_ARGS__) : 0)
+#define printk(fmt, ...) printf(fmt, ##__VA_ARGS__)
  
  #define pr_err printk
  #define pr_warn        printk
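The new printk() definition in the hunk above uses `##__VA_ARGS__`, whereas the removed one used plain `__VA_ARGS__`. A minimal sketch of why that matters, assuming a GCC- or Clang-compatible compiler (the `##` comma-swallowing form is a GNU extension); report() is a hypothetical function added only for illustration:

```c
#include <assert.h>
#include <stdio.h>

/*
 * GNU extension: "##" removes the preceding comma when the macro is called
 * with no variadic arguments, so printk("text\n") expands to printf("text\n")
 * rather than the ill-formed printf("text\n", ).
 */
#define printk(fmt, ...) printf(fmt, ##__VA_ARGS__)
#define pr_err printk

/* Hypothetical caller; returns the total number of characters printed. */
static int report(int err)
{
	int n = 0;

	n += pr_err("Error: No initrd is specified.\n"); /* no variadic args */
	n += pr_err("Failed to lseek: %d\n", err);       /* one variadic arg */
	return n;
}
```

With plain `__VA_ARGS__`, the first pr_err() call would leave a trailing comma in the expansion and would not compile under standards before C23. That matters for this merge because main.c converts several argument-free printf() calls to pr_err().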
diff --git a/tools/bootconfig/main.c b/tools/bootconfig/main.c
index 47f4884583289debf08d2033276a45e4f7e62d71..a9b97814d1a937b139893076fbdcb47b9bd05d50 100644
--- a/tools/bootconfig/main.c
+++ b/tools/bootconfig/main.c
@@ -14,8 +14,6 @@
  #include <linux/kernel.h>
  #include <linux/bootconfig.h>
  
-int pr_output = 1;
-
  static int xbc_show_array(struct xbc_node *node)
  {
         const char *val;
@@ -131,16 +129,27 @@ int load_xbc_from_initrd(int fd, char **buf)
         struct stat stat;
         int ret;
         u32 size = 0, csum = 0, rcsum;
+       char magic[BOOTCONFIG_MAGIC_LEN];
  
         ret = fstat(fd, &stat);
         if (ret < 0)
                 return -errno;
  
-       if (stat.st_size < 8)
+       if (stat.st_size < 8 + BOOTCONFIG_MAGIC_LEN)
                 return 0;
  
-       if (lseek(fd, -8, SEEK_END) < 0) {
-               printf("Failed to lseek: %d\n", -errno);
+       if (lseek(fd, -BOOTCONFIG_MAGIC_LEN, SEEK_END) < 0) {
+               pr_err("Failed to lseek: %d\n", -errno);
+               return -errno;
+       }
+       if (read(fd, magic, BOOTCONFIG_MAGIC_LEN) < 0)
+               return -errno;
+       /* Check the bootconfig magic bytes */
+       if (memcmp(magic, BOOTCONFIG_MAGIC, BOOTCONFIG_MAGIC_LEN) != 0)
+               return 0;
+
+       if (lseek(fd, -(8 + BOOTCONFIG_MAGIC_LEN), SEEK_END) < 0) {
+               pr_err("Failed to lseek: %d\n", -errno);
                 return -errno;
         }
  
@@ -150,12 +159,15 @@ int load_xbc_from_initrd(int fd, char **buf)
         if (read(fd, &csum, sizeof(u32)) < 0)
                 return -errno;
  
-       /* Wrong size, maybe no boot config here */
-       if (stat.st_size < size + 8)
-               return 0;
+       /* Wrong size error  */
+       if (stat.st_size < size + 8 + BOOTCONFIG_MAGIC_LEN) {
+               pr_err("bootconfig size is too big\n");
+               return -E2BIG;
+       }
  
-       if (lseek(fd, stat.st_size - 8 - size, SEEK_SET) < 0) {
-               printf("Failed to lseek: %d\n", -errno);
+       if (lseek(fd, stat.st_size - (size + 8 + BOOTCONFIG_MAGIC_LEN),
+                 SEEK_SET) < 0) {
+               pr_err("Failed to lseek: %d\n", -errno);
                 return -errno;
         }
  
@@ -163,17 +175,17 @@ int load_xbc_from_initrd(int fd, char **buf)
         if (ret < 0)
                 return ret;
  
-       /* Wrong Checksum, maybe no boot config here */
+       /* Wrong Checksum */
         rcsum = checksum((unsigned char *)*buf, size);
         if (csum != rcsum) {
-               printf("checksum error: %d != %d\n", csum, rcsum);
-               return 0;
+               pr_err("checksum error: %d != %d\n", csum, rcsum);
+               return -EINVAL;
         }
  
         ret = xbc_init(*buf);
-       /* Wrong data, maybe no boot config here */
+       /* Wrong data */
         if (ret < 0)
-               return 0;
+               return ret;
  
         return size;
  }
@@ -185,13 +197,13 @@ int show_xbc(const char *path)
  
         fd = open(path, O_RDONLY);
         if (fd < 0) {
-               printf("Failed to open initrd %s: %d\n", path, fd);
+               pr_err("Failed to open initrd %s: %d\n", path, fd);
                 return -errno;
         }
  
         ret = load_xbc_from_initrd(fd, &buf);
         if (ret < 0)
-               printf("Failed to load a boot config from initrd: %d\n", ret);
+               pr_err("Failed to load a boot config from initrd: %d\n", ret);
         else
                 xbc_show_compact_tree();
  
@@ -209,24 +221,19 @@ int delete_xbc(const char *path)
  
         fd = open(path, O_RDWR);
         if (fd < 0) {
-               printf("Failed to open initrd %s: %d\n", path, fd);
+               pr_err("Failed to open initrd %s: %d\n", path, fd);
                 return -errno;
         }
  
-       /*
-        * Suppress error messages in xbc_init() because it can be just a
-        * data which concidentally matches the size and checksum footer.
-        */
-       pr_output = 0;
         size = load_xbc_from_initrd(fd, &buf);
-       pr_output = 1;
         if (size < 0) {
                 ret = size;
-               printf("Failed to load a boot config from initrd: %d\n", ret);
+               pr_err("Failed to load a boot config from initrd: %d\n", ret);
         } else if (size > 0) {
                 ret = fstat(fd, &stat);
                 if (!ret)
-                       ret = ftruncate(fd, stat.st_size - size - 8);
+                       ret = ftruncate(fd, stat.st_size
+                                       - size - 8 - BOOTCONFIG_MAGIC_LEN);
                 if (ret)
                         ret = -errno;
         } /* Ignore if there is no boot config in initrd */
@@ -245,7 +252,7 @@ int apply_xbc(const char *path, const char *xbc_path)
  
         ret = load_xbc_file(xbc_path, &buf);
         if (ret < 0) {
-               printf("Failed to load %s : %d\n", xbc_path, ret);
+               pr_err("Failed to load %s : %d\n", xbc_path, ret);
                 return ret;
         }
         size = strlen(buf) + 1;
@@ -262,7 +269,7 @@ int apply_xbc(const char *path, const char *xbc_path)
         /* Check the data format */
         ret = xbc_init(buf);
         if (ret < 0) {
-               printf("Failed to parse %s: %d\n", xbc_path, ret);
+               pr_err("Failed to parse %s: %d\n", xbc_path, ret);
                 free(data);
                 free(buf);
                 return ret;
@@ -279,20 +286,26 @@ int apply_xbc(const char *path, const char *xbc_path)
         /* Remove old boot config if exists */
         ret = delete_xbc(path);
         if (ret < 0) {
-               printf("Failed to delete previous boot config: %d\n", ret);
+               pr_err("Failed to delete previous boot config: %d\n", ret);
                 return ret;
         }
  
         /* Apply new one */
         fd = open(path, O_RDWR | O_APPEND);
         if (fd < 0) {
-               printf("Failed to open %s: %d\n", path, fd);
+               pr_err("Failed to open %s: %d\n", path, fd);
                 return fd;
         }
         /* TODO: Ensure the @path is initramfs/initrd image */
         ret = write(fd, data, size + 8);
         if (ret < 0) {
-               printf("Failed to apply a boot config: %d\n", ret);
+               pr_err("Failed to apply a boot config: %d\n", ret);
+               return ret;
+       }
+       /* Write a magic word of the bootconfig */
+       ret = write(fd, BOOTCONFIG_MAGIC, BOOTCONFIG_MAGIC_LEN);
+       if (ret < 0) {
+               pr_err("Failed to apply a boot config magic: %d\n", ret);
                 return ret;
         }
         close(fd);
@@ -334,12 +347,12 @@ int main(int argc, char **argv)
         }
  
         if (apply && delete) {
-               printf("Error: You can not specify both -a and -d at once.\n");
+               pr_err("Error: You can not specify both -a and -d at once.\n");
                 return usage();
         }
  
         if (optind >= argc) {
-               printf("Error: No initrd is specified.\n");
+               pr_err("Error: No initrd is specified.\n");
                 return usage();
         }
  
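The main.c changes above imply an on-disk layout of [initrd][bootconfig data][size][checksum][magic], with the magic checked first so that an initrd without a bootconfig is no longer mistaken for a corrupt one. The sketch below mirrors that logic on an in-memory buffer. It is an illustration under assumptions, not the tool's code: the magic string and its 12-byte length, and the byte-sum checksum, are not shown in this diff, and append_bootconfig() and find_bootconfig() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed values; the diff uses these names without showing definitions. */
#define BOOTCONFIG_MAGIC     "#BOOTCONFIG\n"
#define BOOTCONFIG_MAGIC_LEN 12

/* Assumed checksum: a plain sum of the data bytes. */
static uint32_t checksum(const unsigned char *buf, uint32_t len)
{
	uint32_t sum = 0;

	for (uint32_t i = 0; i < len; i++)
		sum += buf[i];
	return sum;
}

/* Append data (with its NUL), size, checksum and magic; return new length. */
static size_t append_bootconfig(unsigned char *img, size_t img_len,
				const char *conf)
{
	uint32_t size = (uint32_t)strlen(conf) + 1;
	uint32_t csum = checksum((const unsigned char *)conf, size);
	unsigned char *p = img + img_len;

	memcpy(p, conf, size);
	p += size;
	memcpy(p, &size, sizeof(size));
	p += sizeof(size);
	memcpy(p, &csum, sizeof(csum));
	p += sizeof(csum);
	memcpy(p, BOOTCONFIG_MAGIC, BOOTCONFIG_MAGIC_LEN);
	p += BOOTCONFIG_MAGIC_LEN;
	return (size_t)(p - img);
}

/* Return the data size, 0 if no bootconfig is present, -1 if it is corrupt. */
static long find_bootconfig(const unsigned char *img, size_t len,
			    const unsigned char **data)
{
	uint32_t size, csum;

	if (len < 8 + BOOTCONFIG_MAGIC_LEN)
		return 0;
	/* No magic at the end: not an error, just no bootconfig. */
	if (memcmp(img + len - BOOTCONFIG_MAGIC_LEN, BOOTCONFIG_MAGIC,
		   BOOTCONFIG_MAGIC_LEN) != 0)
		return 0;
	memcpy(&size, img + len - BOOTCONFIG_MAGIC_LEN - 8, sizeof(size));
	memcpy(&csum, img + len - BOOTCONFIG_MAGIC_LEN - 4, sizeof(csum));
	/* Magic present but footer inconsistent: now a real error. */
	if (len < (size_t)size + 8 + BOOTCONFIG_MAGIC_LEN)
		return -1;
	*data = img + len - BOOTCONFIG_MAGIC_LEN - 8 - size;
	if (checksum(*data, size) != csum)
		return -1;
	return (long)size;
}
```

Appending a 12-character config to a 16-byte image gives 16 + 12 + 9 + 12 bytes, which matches the "+ 9 + 12" size check in test-bootconfig.sh: 9 is the NUL terminator plus the 8-byte size and checksum footer, and 12 is the assumed magic length.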
diff --git a/tools/bootconfig/samples/bad-mixed-kv1.bconf b/tools/bootconfig/samples/bad-mixed-kv1.bconf
new file mode 100644
index 0000000..1761547
--- /dev/null
+++ b/tools/bootconfig/samples/bad-mixed-kv1.bconf
@@ -0,0 +1,3 @@
+# value -> subkey pattern
+key = value
+key.subkey = another-value
diff --git a/tools/bootconfig/samples/bad-mixed-kv2.bconf b/tools/bootconfig/samples/bad-mixed-kv2.bconf
new file mode 100644
index 0000000..6b32e0c
--- /dev/null
+++ b/tools/bootconfig/samples/bad-mixed-kv2.bconf
@@ -0,0 +1,3 @@
+# subkey -> value pattern
+key.subkey = value
+key = another-value
diff --git a/tools/bootconfig/samples/bad-samekey.bconf b/tools/bootconfig/samples/bad-samekey.bconf
new file mode 100644
index 0000000..e8d983a
--- /dev/null
+++ b/tools/bootconfig/samples/bad-samekey.bconf
@@ -0,0 +1,6 @@
+# Same key value is not allowed
+key {
+       foo = value
+       bar = value2
+}
+key.foo = value
diff --git a/tools/bootconfig/test-bootconfig.sh b/tools/bootconfig/test-bootconfig.sh
index 87725e8723f87c74f5f1b5340f2e08ce47b3e160..1411f4c3454fbdb55cb1e290f71fc9849fa931ce 100755
--- a/tools/bootconfig/test-bootconfig.sh
+++ b/tools/bootconfig/test-bootconfig.sh
@@ -9,7 +9,7 @@ TEMPCONF=`mktemp temp-XXXX.bconf`
  NG=0
  
  cleanup() {
-  rm -f $INITRD $TEMPCONF
+  rm -f $INITRD $TEMPCONF $OUTFILE
    exit $NG
  }
  
@@ -49,7 +49,7 @@ xpass $BOOTCONF -a $TEMPCONF $INITRD
  new_size=$(stat -c %s $INITRD)
  
  echo "File size check"
-xpass test $new_size -eq $(expr $bconf_size + $initrd_size + 9)
+xpass test $new_size -eq $(expr $bconf_size + $initrd_size + 9 + 12)
  
  echo "Apply command repeat test"
  xpass $BOOTCONF -a $TEMPCONF $INITRD
@@ -64,6 +64,14 @@ echo "File size check"
  new_size=$(stat -c %s $INITRD)
  xpass test $new_size -eq $initrd_size
  
+echo "No error message while applying"
+OUTFILE=`mktemp tempout-XXXX`
+dd if=/dev/zero of=$INITRD bs=4096 count=1
+printf " \0\0\0 \0\0\0" >> $INITRD
+$BOOTCONF -a $TEMPCONF $INITRD > $OUTFILE 2>&1
+xfail grep -i "failed" $OUTFILE
+xfail grep -i "error" $OUTFILE
+
  echo "Max node number check"
  
  echo -n > $TEMPCONF
@@ -87,6 +95,19 @@ truncate -s 32764 $TEMPCONF
  echo "\"" >> $TEMPCONF # add 2 bytes + terminal ('\"\n\0')
  xpass $BOOTCONF -a $TEMPCONF $INITRD
  
+echo "Adding same-key values"
+cat > $TEMPCONF << EOF
+key = bar, baz
+key += qux
+EOF
+echo > $INITRD
+
+xpass $BOOTCONF -a $TEMPCONF $INITRD
+$BOOTCONF $INITRD > $OUTFILE
+xpass grep -q "bar" $OUTFILE
+xpass grep -q "baz" $OUTFILE
+xpass grep -q "qux" $OUTFILE
+
  echo "=== expected failure cases ==="
  for i in samples/bad-* ; do
    xfail $BOOTCONF -a $i $INITRD
diff --git a/tools/include/uapi/asm-generic/mman-common.h b/tools/include/uapi/asm-generic/mman-common.h
index c160a5354eb62b3b17de564be439451c812470ae..f94f65d429bea3c26bdcdc3197376916399089e9 100644
--- a/tools/include/uapi/asm-generic/mman-common.h
+++ b/tools/include/uapi/asm-generic/mman-common.h
@@ -11,6 +11,8 @@
  #define PROT_WRITE     0x2             /* page can be written */
  #define PROT_EXEC      0x4             /* page can be executed */
  #define PROT_SEM       0x8             /* page may be used for atomic ops */
+/*                     0x10               reserved for arch-specific use */
+/*                     0x20               reserved for arch-specific use */
  #define PROT_NONE      0x0             /* page can not be accessed */
  #define PROT_GROWSDOWN 0x01000000      /* mprotect flag: extend change to start of growsdown vma */
  #define PROT_GROWSUP   0x02000000      /* mprotect flag: extend change to end of growsup vma */
diff --git a/tools/include/uapi/asm-generic/unistd.h b/tools/include/uapi/asm-generic/unistd.h
index 1fc8faa6e97306dfa95335ecba91b3777a843aa9..3a3201e4618ef8c7445895b26f6eebbaea1574f9 100644
--- a/tools/include/uapi/asm-generic/unistd.h
+++ b/tools/include/uapi/asm-generic/unistd.h
@@ -851,8 +851,13 @@ __SYSCALL(__NR_pidfd_open, sys_pidfd_open)
  __SYSCALL(__NR_clone3, sys_clone3)
  #endif
  
+#define __NR_openat2 437
+__SYSCALL(__NR_openat2, sys_openat2)
+#define __NR_pidfd_getfd 438
+__SYSCALL(__NR_pidfd_getfd, sys_pidfd_getfd)
+
  #undef __NR_syscalls
-#define __NR_syscalls 436
+#define __NR_syscalls 439
  
  /*
   * 32 bit systems traditionally used different
diff --git a/tools/include/uapi/drm/i915_drm.h b/tools/include/uapi/drm/i915_drm.h
index 5400d7e057f143abdeb124c229ccfe7cdeb58196..829c0a48577f8b942b7efbf4d7a6438c7d72a670 100644
--- a/tools/include/uapi/drm/i915_drm.h
+++ b/tools/include/uapi/drm/i915_drm.h
@@ -395,6 +395,7 @@ typedef struct _drm_i915_sarea {
  #define DRM_IOCTL_I915_GEM_PWRITE      DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_PWRITE, struct drm_i915_gem_pwrite)
  #define DRM_IOCTL_I915_GEM_MMAP                DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_MMAP, struct drm_i915_gem_mmap)
  #define DRM_IOCTL_I915_GEM_MMAP_GTT    DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_MMAP_GTT, struct drm_i915_gem_mmap_gtt)
+#define DRM_IOCTL_I915_GEM_MMAP_OFFSET DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_MMAP_GTT, struct drm_i915_gem_mmap_offset)
  #define DRM_IOCTL_I915_GEM_SET_DOMAIN  DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_SET_DOMAIN, struct drm_i915_gem_set_domain)
  #define DRM_IOCTL_I915_GEM_SW_FINISH   DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_SW_FINISH, struct drm_i915_gem_sw_finish)
  #define DRM_IOCTL_I915_GEM_SET_TILING  DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_SET_TILING, struct drm_i915_gem_set_tiling)
@@ -793,6 +794,37 @@ struct drm_i915_gem_mmap_gtt {
         __u64 offset;
  };
  
+struct drm_i915_gem_mmap_offset {
+       /** Handle for the object being mapped. */
+       __u32 handle;
+       __u32 pad;
+       /**
+        * Fake offset to use for subsequent mmap call
+        *
+        * This is a fixed-size type for 32/64 compatibility.
+        */
+       __u64 offset;
+
+       /**
+        * Flags for extended behaviour.
+        *
+        * It is mandatory that one of the MMAP_OFFSET types
+        * (GTT, WC, WB, UC, etc) should be included.
+        */
+       __u64 flags;
+#define I915_MMAP_OFFSET_GTT 0
+#define I915_MMAP_OFFSET_WC  1
+#define I915_MMAP_OFFSET_WB  2
+#define I915_MMAP_OFFSET_UC  3
+
+       /*
+        * Zero-terminated chain of extensions.
+        *
+        * No current extensions defined; mbz.
+        */
+       __u64 extensions;
+};
+
  struct drm_i915_gem_set_domain {
         /** Handle for the object */
         __u32 handle;
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index f1d74a2bd23493635afcbd6a7336c2fd2806f471..22f235260a3a352cf4223249fc1d26f2fce62f34 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1045,9 +1045,9 @@ union bpf_attr {
   *             supports redirection to the egress interface, and accepts no
   *             flag at all.
   *
- *             The same effect can be attained with the more generic
- *             **bpf_redirect_map**\ (), which requires specific maps to be
- *             used but offers better performance.
+ *             The same effect can also be attained with the more generic
+ *             **bpf_redirect_map**\ (), which uses a BPF map to store the
+ *             redirect target instead of providing it directly to the helper.
   *     Return
   *             For XDP, the helper returns **XDP_REDIRECT** on success or
   *             **XDP_ABORTED** on error. For other program types, the values
@@ -1611,13 +1611,11 @@ union bpf_attr {
   *             the caller. Any higher bits in the *flags* argument must be
   *             unset.
   *
- *             When used to redirect packets to net devices, this helper
- *             provides a high performance increase over **bpf_redirect**\ ().
- *             This is due to various implementation details of the underlying
- *             mechanisms, one of which is the fact that **bpf_redirect_map**\
- *             () tries to send packet as a "bulk" to the device.
+ *             See also bpf_redirect(), which only supports redirecting to an
+ *             ifindex, but doesn't require a map to do so.
   *     Return
- *             **XDP_REDIRECT** on success, or **XDP_ABORTED** on error.
+ *             **XDP_REDIRECT** on success, or the value of the two lower bits
+ *             of the *flags* argument on error.
   *
   * int bpf_sk_redirect_map(struct sk_buff *skb, struct bpf_map *map, u32 key, u64 flags)
   *     Description
diff --git a/tools/include/uapi/linux/fcntl.h b/tools/include/uapi/linux/fcntl.h
index 1f97b33c840e09936fb92b6a13d82e333d0e8151..ca88b7bce55385b41203284196923d09250f2ea3 100644
--- a/tools/include/uapi/linux/fcntl.h
+++ b/tools/include/uapi/linux/fcntl.h
@@ -3,6 +3,7 @@
  #define _UAPI_LINUX_FCNTL_H
  
  #include <asm/fcntl.h>
+#include <linux/openat2.h>
  
  #define F_SETLEASE     (F_LINUX_SPECIFIC_BASE + 0)
  #define F_GETLEASE     (F_LINUX_SPECIFIC_BASE + 1)
@@ -100,5 +101,4 @@
  
  #define AT_RECURSIVE           0x8000  /* Apply to the entire subtree */
  
-
  #endif /* _UAPI_LINUX_FCNTL_H */
diff --git a/tools/include/uapi/linux/fscrypt.h b/tools/include/uapi/linux/fscrypt.h
index 1beb174ad9505634151c5ac2896ae63bcced028e..0d8a6f47711c32eef4701ad9d9d6936f3a01d9c9 100644
--- a/tools/include/uapi/linux/fscrypt.h
+++ b/tools/include/uapi/linux/fscrypt.h
@@ -8,6 +8,7 @@
  #ifndef _UAPI_LINUX_FSCRYPT_H
  #define _UAPI_LINUX_FSCRYPT_H
  
+#include <linux/ioctl.h>
  #include <linux/types.h>
  
  /* Encryption policy flags */
@@ -109,11 +110,22 @@ struct fscrypt_key_specifier {
         } u;
  };
  
+/*
+ * Payload of Linux keyring key of type "fscrypt-provisioning", referenced by
+ * fscrypt_add_key_arg::key_id as an alternative to fscrypt_add_key_arg::raw.
+ */
+struct fscrypt_provisioning_key_payload {
+       __u32 type;
+       __u32 __reserved;
+       __u8 raw[];
+};
+
  /* Struct passed to FS_IOC_ADD_ENCRYPTION_KEY */
  struct fscrypt_add_key_arg {
         struct fscrypt_key_specifier key_spec;
         __u32 raw_size;
-       __u32 __reserved[9];
+       __u32 key_id;
+       __u32 __reserved[8];
         __u8 raw[];
  };
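
Reviewer note: `key_id` takes over the first of the nine reserved words, so the size of `struct fscrypt_add_key_arg` and the position of `raw[]` should not change. A minimal sketch that checks this at compile time follows. `struct key_spec_stub` is a hypothetical stand-in for `struct fscrypt_key_specifier`, and its 40-byte size is an assumption made only for this example; the member names without leading underscores are also local to the sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct fscrypt_key_specifier; the 40-byte
 * size is an assumption and does not affect the comparison below. */
struct key_spec_stub {
	uint8_t bytes[40];
};

/* Layout before the change: nine reserved words. */
struct add_key_arg_old {
	struct key_spec_stub key_spec;
	uint32_t raw_size;
	uint32_t reserved[9];
	uint8_t raw[];
};

/* Layout after the change: key_id plus eight reserved words. */
struct add_key_arg_new {
	struct key_spec_stub key_spec;
	uint32_t raw_size;
	uint32_t key_id;
	uint32_t reserved[8];
	uint8_t raw[];
};

_Static_assert(sizeof(struct add_key_arg_old) ==
	       sizeof(struct add_key_arg_new),
	       "struct size must be unchanged");
_Static_assert(offsetof(struct add_key_arg_old, raw) ==
	       offsetof(struct add_key_arg_new, raw),
	       "raw[] must stay at the same offset");
```

Since the old reserved words were expected to be zero, an existing caller that never sets `key_id` would pass zero in that slot.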
  
diff --git a/tools/include/uapi/linux/kvm.h b/tools/include/uapi/linux/kvm.h
index f0a16b4adbbd63c421006f6ca9b0fd9a892f7a5d..4b95f9a31a2f5e227f57f4cbba907c0508c5e3a9 100644
--- a/tools/include/uapi/linux/kvm.h
+++ b/tools/include/uapi/linux/kvm.h
@@ -1009,6 +1009,7 @@ struct kvm_ppc_resize_hpt {
  #define KVM_CAP_PPC_GUEST_DEBUG_SSTEP 176
  #define KVM_CAP_ARM_NISV_TO_USER 177
  #define KVM_CAP_ARM_INJECT_EXT_DABT 178
+#define KVM_CAP_S390_VCPU_RESETS 179
  
  #ifdef KVM_CAP_IRQ_ROUTING
  
@@ -1473,6 +1474,10 @@ struct kvm_enc_region {
  /* Available with KVM_CAP_ARM_SVE */
  #define KVM_ARM_VCPU_FINALIZE    _IOW(KVMIO,  0xc2, int)
  
+/* Available with  KVM_CAP_S390_VCPU_RESETS */
+#define KVM_S390_NORMAL_RESET  _IO(KVMIO,   0xc3)
+#define KVM_S390_CLEAR_RESET   _IO(KVMIO,   0xc4)
+
  /* Secure Encrypted Virtualization command */
  enum sev_cmd_id {
         /* Guest initialization commands */
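
Reviewer note: both new s390 reset ioctls are declared with `_IO()`, which means they carry no argument payload. The sketch below decodes the request codes with the standard `_IOC_*` macros to show this. `KVMIO` is `0xAE` in `linux/kvm.h` (not shown in this hunk); the `DEMO_*` names and the three small helpers are hypothetical.

```c
#include <linux/ioctl.h> /* _IO() and the _IOC_* decoding macros */

/* Command numbers copied from the hunk; the type byte is KVMIO. */
#define DEMO_KVMIO             0xAE
#define DEMO_S390_NORMAL_RESET _IO(DEMO_KVMIO, 0xc3)
#define DEMO_S390_CLEAR_RESET  _IO(DEMO_KVMIO, 0xc4)

/* Extract the command number field of an ioctl request code. */
static unsigned int ioctl_nr(unsigned long req)
{
	return _IOC_NR(req);
}

/* Extract the type ("magic") byte of an ioctl request code. */
static unsigned int ioctl_type(unsigned long req)
{
	return _IOC_TYPE(req);
}

/* Extract the argument size field; zero for _IO() requests. */
static unsigned int ioctl_size(unsigned long req)
{
	return _IOC_SIZE(req);
}
```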
diff --git a/tools/include/uapi/linux/openat2.h b/tools/include/uapi/linux/openat2.h
new file mode 100644
index 0000000..58b1eb7
--- /dev/null
+++ b/tools/include/uapi/linux/openat2.h
@@ -0,0 +1,39 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _UAPI_LINUX_OPENAT2_H
+#define _UAPI_LINUX_OPENAT2_H
+
+#include <linux/types.h>
+
+/*
+ * Arguments for how openat2(2) should open the target path. If only @flags and
+ * @mode are non-zero, then openat2(2) operates very similarly to openat(2).
+ *
+ * However, unlike openat(2), unknown or invalid bits in @flags result in
+ * -EINVAL rather than being silently ignored. @mode must be zero unless one of
+ * {O_CREAT, O_TMPFILE} are set.
+ *
+ * @flags: O_* flags.
+ * @mode: O_CREAT/O_TMPFILE file mode.
+ * @resolve: RESOLVE_* flags.
+ */
+struct open_how {
+       __u64 flags;
+       __u64 mode;
+       __u64 resolve;
+};
+
+/* how->resolve flags for openat2(2). */
+#define RESOLVE_NO_XDEV                0x01 /* Block mount-point crossings
+                                       (includes bind-mounts). */
+#define RESOLVE_NO_MAGICLINKS  0x02 /* Block traversal through procfs-style
+                                       "magic-links". */
+#define RESOLVE_NO_SYMLINKS    0x04 /* Block traversal through all symlinks
+                                       (implies RESOLVE_NO_MAGICLINKS) */
+#define RESOLVE_BENEATH                0x08 /* Block "lexical" trickery like
+                                       "..", symlinks, and absolute
+                                       paths which escape the dirfd. */
+#define RESOLVE_IN_ROOT                0x10 /* Make all jumps to "/" and ".."
+                                       be scoped inside the dirfd
+                                       (similar to chroot(2)). */
+
+#endif /* _UAPI_LINUX_OPENAT2_H */
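
Reviewer note: a minimal sketch of how a caller might use `struct open_how` with `RESOLVE_BENEATH`, going through `syscall(2)` directly because a libc wrapper may not be available. The struct is mirrored locally so the example builds against older headers; `demo_open_how`, `DEMO_RESOLVE_BENEATH` and `open_beneath` are hypothetical names, and the fallback syscall number 437 is an assumption for headers that predate `openat2`.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef SYS_openat2
#define SYS_openat2 437 /* assumption: fallback for older headers */
#endif

/* Local mirror of struct open_how from the new header. */
struct demo_open_how {
	uint64_t flags;
	uint64_t mode;
	uint64_t resolve;
};

#define DEMO_RESOLVE_BENEATH 0x08 /* same value as RESOLVE_BENEATH */

/* Hypothetical helper: open path read-only, refusing any resolution that
 * escapes dirfd. Returns a file descriptor, or -errno on failure. */
static int open_beneath(int dirfd, const char *path)
{
	struct demo_open_how how;
	long fd;

	memset(&how, 0, sizeof(how)); /* mode must stay zero without O_CREAT */
	how.flags = O_RDONLY | O_CLOEXEC;
	how.resolve = DEMO_RESOLVE_BENEATH;

	fd = syscall(SYS_openat2, dirfd, path, &how, sizeof(how));
	return fd < 0 ? -errno : (int)fd;
}
```

On a kernel without `openat2` the helper returns `-ENOSYS`, so callers that need a fallback would have to handle that case themselves.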
diff --git a/tools/include/uapi/linux/prctl.h b/tools/include/uapi/linux/prctl.h
index 7da1b37b27aa5b75fb89b79f0d9f193e5021a911..07b4f8131e362bdc815f37cea0c9067a9464f256 100644
--- a/tools/include/uapi/linux/prctl.h
+++ b/tools/include/uapi/linux/prctl.h
@@ -234,4 +234,8 @@ struct prctl_mm_map {
  #define PR_GET_TAGGED_ADDR_CTRL                56
  # define PR_TAGGED_ADDR_ENABLE         (1UL << 0)
  
+/* Control reclaim behavior when allocating memory */
+#define PR_SET_IO_FLUSHER              57
+#define PR_GET_IO_FLUSHER              58
+
  #endif /* _LINUX_PRCTL_H */
diff --git a/tools/include/uapi/linux/sched.h b/tools/include/uapi/linux/sched.h
index 4a02178324641f555336164316ba2e30805c3719..2e3bc22c6f202f6280ef7279de60fa966b84d3ed 100644
--- a/tools/include/uapi/linux/sched.h
+++ b/tools/include/uapi/linux/sched.h
@@ -36,6 +36,12 @@
  /* Flags for the clone3() syscall. */
  #define CLONE_CLEAR_SIGHAND 0x100000000ULL /* Clear any signal handler and reset to SIG_DFL. */
  
+/*
+ * cloning flags intersect with CSIGNAL so can be used with unshare and clone3
+ * syscalls only:
+ */
+#define CLONE_NEWTIME  0x00000080      /* New time namespace */
+
  #ifndef __ASSEMBLY__
  /**
   * struct clone_args - arguments for the clone3 syscall
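
Reviewer note: the comment on `CLONE_NEWTIME` says the flag intersects with `CSIGNAL`. The sketch below makes the overlap concrete: legacy `clone()` reads the low byte of its flags word as the exit signal, so bit `0x80` cannot also be a namespace flag there. The `DEMO_*` names and `legacy_exit_signal` are hypothetical; `0xff` is the value of `CSIGNAL` in the same header.

```c
#include <stdint.h>

#define DEMO_CSIGNAL       0x000000ffULL /* signal mask sent at exit */
#define DEMO_CLONE_NEWTIME 0x00000080ULL /* value from the hunk above */

/* What legacy clone() would take as the exit signal from a flags word. */
static unsigned int legacy_exit_signal(uint64_t flags)
{
	return (unsigned int)(flags & DEMO_CSIGNAL);
}
```

With `clone3()` the exit signal travels in its own field of `struct clone_args`, which is why the bit is free to be reused there and in `unshare()`.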
diff --git a/tools/include/uapi/sound/asound.h b/tools/include/uapi/sound/asound.h
index df1153cea0b7ee2a27e19682837f81922fef353e..535a7229e1d94a706dd8f91d9fdfdf0e0f66caa4 100644
--- a/tools/include/uapi/sound/asound.h
+++ b/tools/include/uapi/sound/asound.h
@@ -26,7 +26,9 @@
  
  #if defined(__KERNEL__) || defined(__linux__)
  #include <linux/types.h>
+#include <asm/byteorder.h>
  #else
+#include <endian.h>
  #include <sys/ioctl.h>
  #endif
  
@@ -154,7 +156,7 @@ struct snd_hwdep_dsp_image {
   *                                                                           *
   *****************************************************************************/
  
-#define SNDRV_PCM_VERSION              SNDRV_PROTOCOL_VERSION(2, 0, 14)
+#define SNDRV_PCM_VERSION              SNDRV_PROTOCOL_VERSION(2, 0, 15)
  
  typedef unsigned long snd_pcm_uframes_t;
  typedef signed long snd_pcm_sframes_t;
@@ -301,7 +303,9 @@ typedef int __bitwise snd_pcm_subformat_t;
  #define SNDRV_PCM_INFO_DRAIN_TRIGGER   0x40000000              /* internal kernel flag - trigger in drain */
  #define SNDRV_PCM_INFO_FIFO_IN_FRAMES  0x80000000      /* internal kernel flag - FIFO size is in frames */
  
-
+#if (__BITS_PER_LONG == 32 && defined(__USE_TIME_BITS64)) || defined __KERNEL__
+#define __SND_STRUCT_TIME64
+#endif
  
  typedef int __bitwise snd_pcm_state_t;
  #define        SNDRV_PCM_STATE_OPEN            ((__force snd_pcm_state_t) 0) /* stream is open */
@@ -317,8 +321,17 @@ typedef int __bitwise snd_pcm_state_t;
  
  enum {
         SNDRV_PCM_MMAP_OFFSET_DATA = 0x00000000,
-       SNDRV_PCM_MMAP_OFFSET_STATUS = 0x80000000,
-       SNDRV_PCM_MMAP_OFFSET_CONTROL = 0x81000000,
+       SNDRV_PCM_MMAP_OFFSET_STATUS_OLD = 0x80000000,
+       SNDRV_PCM_MMAP_OFFSET_CONTROL_OLD = 0x81000000,
+       SNDRV_PCM_MMAP_OFFSET_STATUS_NEW = 0x82000000,
+       SNDRV_PCM_MMAP_OFFSET_CONTROL_NEW = 0x83000000,
+#ifdef __SND_STRUCT_TIME64
+       SNDRV_PCM_MMAP_OFFSET_STATUS = SNDRV_PCM_MMAP_OFFSET_STATUS_NEW,
+       SNDRV_PCM_MMAP_OFFSET_CONTROL = SNDRV_PCM_MMAP_OFFSET_CONTROL_NEW,
+#else
+       SNDRV_PCM_MMAP_OFFSET_STATUS = SNDRV_PCM_MMAP_OFFSET_STATUS_OLD,
+       SNDRV_PCM_MMAP_OFFSET_CONTROL = SNDRV_PCM_MMAP_OFFSET_CONTROL_OLD,
+#endif
  };
  
  union snd_pcm_sync_id {
@@ -456,8 +469,13 @@ enum {
         SNDRV_PCM_AUDIO_TSTAMP_TYPE_LAST = SNDRV_PCM_AUDIO_TSTAMP_TYPE_LINK_SYNCHRONIZED
  };
  
+#ifndef __KERNEL__
+/* explicit padding avoids incompatibility between i386 and x86-64 */
+typedef struct { unsigned char pad[sizeof(time_t) - sizeof(int)]; } __time_pad;
+
  struct snd_pcm_status {
         snd_pcm_state_t state;          /* stream state */
+       __time_pad pad1;                /* align to timespec */
         struct timespec trigger_tstamp; /* time when stream was started/stopped/paused */
         struct timespec tstamp;         /* reference timestamp */
         snd_pcm_uframes_t appl_ptr;     /* appl ptr */
@@ -473,17 +491,48 @@ struct snd_pcm_status {
         __u32 audio_tstamp_accuracy;    /* in ns units, only valid if indicated in audio_tstamp_data */
         unsigned char reserved[52-2*sizeof(struct timespec)]; /* must be filled with zero */
  };
+#endif
+
+/*
+ * For mmap operations, we need the 64-bit layout, both for compat mode,
+ * and for y2038 compatibility. For 64-bit applications, the two definitions
+ * are identical, so we keep the traditional version.
+ */
+#ifdef __SND_STRUCT_TIME64
+#define __snd_pcm_mmap_status64                snd_pcm_mmap_status
+#define __snd_pcm_mmap_control64       snd_pcm_mmap_control
+#define __snd_pcm_sync_ptr64           snd_pcm_sync_ptr
+#ifdef __KERNEL__
+#define __snd_timespec64               __kernel_timespec
+#else
+#define __snd_timespec64               timespec
+#endif
+struct __snd_timespec {
+       __s32 tv_sec;
+       __s32 tv_nsec;
+};
+#else
+#define __snd_pcm_mmap_status          snd_pcm_mmap_status
+#define __snd_pcm_mmap_control         snd_pcm_mmap_control
+#define __snd_pcm_sync_ptr             snd_pcm_sync_ptr
+#define __snd_timespec                 timespec
+struct __snd_timespec64 {
+       __s64 tv_sec;
+       __s64 tv_nsec;
+};
  
-struct snd_pcm_mmap_status {
+#endif
+
+struct __snd_pcm_mmap_status {
         snd_pcm_state_t state;          /* RO: state - SNDRV_PCM_STATE_XXXX */
         int pad1;                       /* Needed for 64 bit alignment */
         snd_pcm_uframes_t hw_ptr;       /* RO: hw ptr (0...boundary-1) */
-       struct timespec tstamp;         /* Timestamp */
+       struct __snd_timespec tstamp;   /* Timestamp */
         snd_pcm_state_t suspended_state; /* RO: suspended stream state */
-       struct timespec audio_tstamp;   /* from sample counter or wall clock */
+       struct __snd_timespec audio_tstamp; /* from sample counter or wall clock */
  };
  
-struct snd_pcm_mmap_control {
+struct __snd_pcm_mmap_control {
         snd_pcm_uframes_t appl_ptr;     /* RW: appl ptr (0...boundary-1) */
         snd_pcm_uframes_t avail_min;    /* RW: min available frames for wakeup */
  };
@@ -492,14 +541,59 @@ struct snd_pcm_mmap_control {
  #define SNDRV_PCM_SYNC_PTR_APPL                (1<<1)  /* get appl_ptr from driver (r/w op) */
  #define SNDRV_PCM_SYNC_PTR_AVAIL_MIN   (1<<2)  /* get avail_min from driver */
  
-struct snd_pcm_sync_ptr {
+struct __snd_pcm_sync_ptr {
         unsigned int flags;
         union {
-               struct snd_pcm_mmap_status status;
+               struct __snd_pcm_mmap_status status;
+               unsigned char reserved[64];
+       } s;
+       union {
+               struct __snd_pcm_mmap_control control;
+               unsigned char reserved[64];
+       } c;
+};
+
+#if defined(__BYTE_ORDER) ? __BYTE_ORDER == __BIG_ENDIAN : defined(__BIG_ENDIAN)
+typedef char __pad_before_uframe[sizeof(__u64) - sizeof(snd_pcm_uframes_t)];
+typedef char __pad_after_uframe[0];
+#endif
+
+#if defined(__BYTE_ORDER) ? __BYTE_ORDER == __LITTLE_ENDIAN : defined(__LITTLE_ENDIAN)
+typedef char __pad_before_uframe[0];
+typedef char __pad_after_uframe[sizeof(__u64) - sizeof(snd_pcm_uframes_t)];
+#endif
+
+struct __snd_pcm_mmap_status64 {
+       snd_pcm_state_t state;          /* RO: state - SNDRV_PCM_STATE_XXXX */
+       __u32 pad1;                     /* Needed for 64 bit alignment */
+       __pad_before_uframe __pad1;
+       snd_pcm_uframes_t hw_ptr;       /* RO: hw ptr (0...boundary-1) */
+       __pad_after_uframe __pad2;
+       struct __snd_timespec64 tstamp; /* Timestamp */
+       snd_pcm_state_t suspended_state;/* RO: suspended stream state */
+       __u32 pad3;                     /* Needed for 64 bit alignment */
+       struct __snd_timespec64 audio_tstamp; /* sample counter or wall clock */
+};
+
+struct __snd_pcm_mmap_control64 {
+       __pad_before_uframe __pad1;
+       snd_pcm_uframes_t appl_ptr;      /* RW: appl ptr (0...boundary-1) */
+       __pad_before_uframe __pad2;
+
+       __pad_before_uframe __pad3;
+       snd_pcm_uframes_t  avail_min;    /* RW: min available frames for wakeup */
+       __pad_after_uframe __pad4;
+};
+
+struct __snd_pcm_sync_ptr64 {
+       __u32 flags;
+       __u32 pad1;
+       union {
+               struct __snd_pcm_mmap_status64 status;
                 unsigned char reserved[64];
         } s;
         union {
-               struct snd_pcm_mmap_control control;
+               struct __snd_pcm_mmap_control64 control;
                 unsigned char reserved[64];
         } c;
  };
@@ -584,6 +678,8 @@ enum {
  #define SNDRV_PCM_IOCTL_STATUS         _IOR('A', 0x20, struct snd_pcm_status)
  #define SNDRV_PCM_IOCTL_DELAY          _IOR('A', 0x21, snd_pcm_sframes_t)
  #define SNDRV_PCM_IOCTL_HWSYNC         _IO('A', 0x22)
+#define __SNDRV_PCM_IOCTL_SYNC_PTR     _IOWR('A', 0x23, struct __snd_pcm_sync_ptr)
+#define __SNDRV_PCM_IOCTL_SYNC_PTR64   _IOWR('A', 0x23, struct __snd_pcm_sync_ptr64)
  #define SNDRV_PCM_IOCTL_SYNC_PTR       _IOWR('A', 0x23, struct snd_pcm_sync_ptr)
  #define SNDRV_PCM_IOCTL_STATUS_EXT     _IOWR('A', 0x24, struct snd_pcm_status)
  #define SNDRV_PCM_IOCTL_CHANNEL_INFO   _IOR('A', 0x32, struct snd_pcm_channel_info)
@@ -614,7 +710,7 @@ enum {
   *  Raw MIDI section - /dev/snd/midi??
   */
  
-#define SNDRV_RAWMIDI_VERSION          SNDRV_PROTOCOL_VERSION(2, 0, 0)
+#define SNDRV_RAWMIDI_VERSION          SNDRV_PROTOCOL_VERSION(2, 0, 1)
  
  enum {
         SNDRV_RAWMIDI_STREAM_OUTPUT = 0,
@@ -648,13 +744,16 @@ struct snd_rawmidi_params {
         unsigned char reserved[16];     /* reserved for future use */
  };
  
+#ifndef __KERNEL__
  struct snd_rawmidi_status {
         int stream;
+       __time_pad pad1;
         struct timespec tstamp;         /* Timestamp */
         size_t avail;                   /* available bytes */
         size_t xruns;                   /* count of overruns since last status (in bytes) */
         unsigned char reserved[16];     /* reserved for future use */
  };
+#endif
  
  #define SNDRV_RAWMIDI_IOCTL_PVERSION   _IOR('W', 0x00, int)
  #define SNDRV_RAWMIDI_IOCTL_INFO       _IOR('W', 0x01, struct snd_rawmidi_info)
@@ -667,7 +766,7 @@ struct snd_rawmidi_status {
   *  Timer section - /dev/snd/timer
   */
  
-#define SNDRV_TIMER_VERSION            SNDRV_PROTOCOL_VERSION(2, 0, 6)
+#define SNDRV_TIMER_VERSION            SNDRV_PROTOCOL_VERSION(2, 0, 7)
  
  enum {
         SNDRV_TIMER_CLASS_NONE = -1,
@@ -761,6 +860,7 @@ struct snd_timer_params {
         unsigned char reserved[60];     /* reserved */
  };
  
+#ifndef __KERNEL__
  struct snd_timer_status {
         struct timespec tstamp;         /* Timestamp - last update */
         unsigned int resolution;        /* current period resolution in ns */
@@ -769,10 +869,11 @@ struct snd_timer_status {
         unsigned int queue;             /* used queue size */
         unsigned char reserved[64];     /* reserved */
  };
+#endif
  
  #define SNDRV_TIMER_IOCTL_PVERSION     _IOR('T', 0x00, int)
  #define SNDRV_TIMER_IOCTL_NEXT_DEVICE  _IOWR('T', 0x01, struct snd_timer_id)
-#define SNDRV_TIMER_IOCTL_TREAD                _IOW('T', 0x02, int)
+#define SNDRV_TIMER_IOCTL_TREAD_OLD    _IOW('T', 0x02, int)
  #define SNDRV_TIMER_IOCTL_GINFO                _IOWR('T', 0x03, struct snd_timer_ginfo)
  #define SNDRV_TIMER_IOCTL_GPARAMS      _IOW('T', 0x04, struct snd_timer_gparams)
  #define SNDRV_TIMER_IOCTL_GSTATUS      _IOWR('T', 0x05, struct snd_timer_gstatus)
@@ -785,6 +886,15 @@ struct snd_timer_status {
  #define SNDRV_TIMER_IOCTL_STOP         _IO('T', 0xa1)
  #define SNDRV_TIMER_IOCTL_CONTINUE     _IO('T', 0xa2)
  #define SNDRV_TIMER_IOCTL_PAUSE                _IO('T', 0xa3)
+#define SNDRV_TIMER_IOCTL_TREAD64      _IOW('T', 0xa4, int)
+
+#if __BITS_PER_LONG == 64
+#define SNDRV_TIMER_IOCTL_TREAD SNDRV_TIMER_IOCTL_TREAD_OLD
+#else
+#define SNDRV_TIMER_IOCTL_TREAD ((sizeof(__kernel_long_t) >= sizeof(time_t)) ? \
+                                SNDRV_TIMER_IOCTL_TREAD_OLD : \
+                                SNDRV_TIMER_IOCTL_TREAD64)
+#endif
  
  struct snd_timer_read {
         unsigned int resolution;
@@ -810,11 +920,15 @@ enum {
         SNDRV_TIMER_EVENT_MRESUME = SNDRV_TIMER_EVENT_RESUME + 10,
  };
  
+#ifndef __KERNEL__
  struct snd_timer_tread {
         int event;
+       __time_pad pad1;
         struct timespec tstamp;
         unsigned int val;
+       __time_pad pad2;
  };
+#endif
  
  /****************************************************************************
   *                                                                          *
@@ -822,7 +936,7 @@ struct snd_timer_tread {
   *                                                                          *
   ****************************************************************************/
  
-#define SNDRV_CTL_VERSION              SNDRV_PROTOCOL_VERSION(2, 0, 7)
+#define SNDRV_CTL_VERSION              SNDRV_PROTOCOL_VERSION(2, 0, 8)
  
  struct snd_ctl_card_info {
         int card;                       /* card number */
@@ -860,7 +974,7 @@ typedef int __bitwise snd_ctl_elem_iface_t;
  #define SNDRV_CTL_ELEM_ACCESS_WRITE            (1<<1)
  #define SNDRV_CTL_ELEM_ACCESS_READWRITE                (SNDRV_CTL_ELEM_ACCESS_READ|SNDRV_CTL_ELEM_ACCESS_WRITE)
  #define SNDRV_CTL_ELEM_ACCESS_VOLATILE         (1<<2)  /* control value may be changed without a notification */
-#define SNDRV_CTL_ELEM_ACCESS_TIMESTAMP                (1<<3)  /* when was control changed */
+// (1 << 3) is unused.
  #define SNDRV_CTL_ELEM_ACCESS_TLV_READ         (1<<4)  /* TLV read is possible */
  #define SNDRV_CTL_ELEM_ACCESS_TLV_WRITE                (1<<5)  /* TLV write is possible */
  #define SNDRV_CTL_ELEM_ACCESS_TLV_READWRITE    (SNDRV_CTL_ELEM_ACCESS_TLV_READ|SNDRV_CTL_ELEM_ACCESS_TLV_WRITE)
@@ -926,11 +1040,7 @@ struct snd_ctl_elem_info {
                 } enumerated;
                 unsigned char reserved[128];
         } value;
-       union {
-               unsigned short d[4];            /* dimensions */
-               unsigned short *d_ptr;          /* indirect - obsoleted */
-       } dimen;
-       unsigned char reserved[64-4*sizeof(unsigned short)];
+       unsigned char reserved[64];
  };
  
  struct snd_ctl_elem_value {
@@ -955,8 +1065,7 @@ struct snd_ctl_elem_value {
                 } bytes;
                 struct snd_aes_iec958 iec958;
         } value;                /* RO */
-       struct timespec tstamp;
-       unsigned char reserved[128-sizeof(struct timespec)];
+       unsigned char reserved[128];
  };
  
  struct snd_ctl_tlv {
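
Reviewer note: the `__time_pad` typedef sizes its padding from `sizeof(time_t) - sizeof(int)`, so the `timespec` that follows an `int` member starts with no compiler-inserted gap. A reduced sketch of the same construction follows; `demo_time_pad` and `struct demo_status` are hypothetical stand-ins that keep only the leading members of `struct snd_pcm_status`. When `time_t` is no larger than `int`, the pad becomes a zero-length array, which is a GNU C extension.

```c
#include <stddef.h>
#include <time.h>

/* Same construction as __time_pad in the hunk above. */
typedef struct {
	unsigned char pad[sizeof(time_t) - sizeof(int)];
} demo_time_pad;

/* Reduced stand-in for struct snd_pcm_status: leading members only. */
struct demo_status {
	int state;              /* stream state */
	demo_time_pad pad1;     /* explicit padding up to the timespec */
	struct timespec tstamp; /* first timestamp member */
};
```

Because the padding is spelled out, the offset of `tstamp` equals the size of the members before it on common 32-bit and 64-bit ABIs.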
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 514b1a524abbc0ff49bfe25fa2244eb50b860e1c..7469c7dcc15e71fb82e0c8d0a1426cce7b7e2127 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -24,6 +24,7 @@
  #include <endian.h>
  #include <fcntl.h>
  #include <errno.h>
+#include <ctype.h>
  #include <asm/unistd.h>
  #include <linux/err.h>
  #include <linux/kernel.h>
@@ -1283,7 +1284,7 @@ static size_t bpf_map_mmap_sz(const struct bpf_map *map)
  static char *internal_map_name(struct bpf_object *obj,
                                enum libbpf_map_type type)
  {
-       char map_name[BPF_OBJ_NAME_LEN];
+       char map_name[BPF_OBJ_NAME_LEN], *p;
         const char *sfx = libbpf_type_to_btf_name[type];
         int sfx_len = max((size_t)7, strlen(sfx));
         int pfx_len = min((size_t)BPF_OBJ_NAME_LEN - sfx_len - 1,
@@ -1292,6 +1293,11 @@ static char *internal_map_name(struct bpf_object *obj,
         snprintf(map_name, sizeof(map_name), "%.*s%.*s", pfx_len, obj->name,
                  sfx_len, libbpf_type_to_btf_name[type]);
  
+       /* sanitise map name to characters allowed by kernel */
+       for (p = map_name; *p && p < map_name + sizeof(map_name); p++)
+               if (!isalnum(*p) && *p != '_' && *p != '.')
+                       *p = '_';
+
         return strdup(map_name);
  }
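
Reviewer note: the sanitising step can be tried in isolation. The sketch below is a standalone variant, not the libbpf code itself: `sanitize_map_name` is a hypothetical helper, it tests the bound before reading each byte, and it casts to `unsigned char` before calling `isalnum()`, which the C library requires for characters with negative values.

```c
#include <ctype.h>
#include <stddef.h>

/* Replace every character other than [A-Za-z0-9_.] with '_', in place.
 * sz is the size of the buffer holding name. Returns name. */
static char *sanitize_map_name(char *name, size_t sz)
{
	size_t i;

	for (i = 0; i < sz && name[i] != '\0'; i++) {
		/* isalnum() expects an unsigned char value or EOF */
		if (!isalnum((unsigned char)name[i]) &&
		    name[i] != '_' && name[i] != '.')
			name[i] = '_';
	}
	return name;
}
```

For example, a buffer holding `my-obj.bss` becomes `my_obj.bss`: the dash is replaced and the dot is kept.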
  
diff --git a/tools/perf/Documentation/perf-config.txt b/tools/perf/Documentation/perf-config.txt
index c4dd23c4b47811e7c896fe90532bdc11ac2580b9..8ead55593984fd7022938442459addf565dd1d91 100644
--- a/tools/perf/Documentation/perf-config.txt
+++ b/tools/perf/Documentation/perf-config.txt
@@ -239,7 +239,6 @@ buildid.*::
                 set buildid.dir to /dev/null. The default is $HOME/.debug
  
  annotate.*::
-       These options work only for TUI.
         These are in control of addresses, jump function, source code
         in lines of assembly code from a specific program.
  
@@ -269,6 +268,8 @@ annotate.*::
                 │        mov    (%rdi),%rdx
                 │              return n;
  
+               This option works with tui, stdio2 browsers.
+
          annotate.use_offset::
                 Basing on a first address of a loaded function, offset can be used.
                 Instead of using original addresses of assembly code,
@@ -287,6 +288,8 @@ annotate.*::
  
                              368:│  mov    0x8(%r14),%rdi
  
+               This option works with tui, stdio2 browsers.
+
         annotate.jump_arrows::
                 There can be jump instruction among assembly code.
                 Depending on a boolean value of jump_arrows,
@@ -306,6 +309,8 @@ annotate.*::
                 │1330:   mov    %r15,%r10
                 │1333:   cmp    %r15,%r14
  
+               This option works with tui browser.
+
          annotate.show_linenr::
                 When showing source code if this option is 'true',
                 line numbers are printed as below.
@@ -325,6 +330,8 @@ annotate.*::
                 │                     array++;
                 │             }
  
+               This option works with tui, stdio2 browsers.
+
          annotate.show_nr_jumps::
                 Let's see a part of assembly code.
  
@@ -335,6 +342,8 @@ annotate.*::
  
                 │1 1382:   movb   $0x1,-0x270(%rbp)
  
+               This option works with tui, stdio2 browsers.
+
          annotate.show_total_period::
                 To compare two records on an instruction base, with this option
                 provided, display total number of samples that belong to a line
@@ -348,11 +357,30 @@ annotate.*::
  
                 99.93 │      mov    %eax,%eax
  
+               This option works with tui, stdio2, stdio browsers.
+
+       annotate.show_nr_samples::
+               By default perf annotate shows percentage of samples. This option
+               can be used to print absolute number of samples. Ex, when set as
+               false:
+
+               Percent│
+                74.03 │      mov    %fs:0x28,%rax
+
+               When set as true:
+
+               Samples│
+                    6 │      mov    %fs:0x28,%rax
+
+               This option works with tui, stdio2, stdio browsers.
+
         annotate.offset_level::
                 Default is '1', meaning just jump targets will have offsets show right beside
                 the instruction. When set to '2' 'call' instructions will also have its offsets
                 shown, 3 or higher will show offsets for all instructions.
  
+               This option works with tui, stdio2 browsers.
+
  hist.*::
         hist.percentage::
                 This option control the way to calculate overhead of filtered entries -
@@ -490,6 +518,12 @@ top.*::
                 column by default.
                 The default is 'true'.
  
+       top.call-graph::
+               This is identical to 'call-graph.record-mode', except it is
+               applicable only for 'top' subcommand. This option ONLY sets up
+               the unwind method. To enable 'perf top' to actually use it,
+               the command line option -g must be specified.
+
  man.*::
         man.viewer::
                 This option can assign a tool to view manual pages when 'help'
@@ -517,6 +551,16 @@ record.*::
                 But if this option is 'no-cache', it will not update the build-id cache.
                 'skip' skips post-processing and does not update the cache.
  
+       record.call-graph::
+               This is identical to 'call-graph.record-mode', except it is
+               applicable only for 'record' subcommand. This option ONLY sets up
+               the unwind method. To enable 'perf record' to actually use it,
+               the command line option -g must be specified.
+
+       record.aio::
+               Use 'n' control blocks in asynchronous (Posix AIO) trace writing
+               mode ('n' default: 1, max: 4).
+
  diff.*::
         diff.order::
                 This option sets the number of columns to sort the result.
@@ -566,6 +610,11 @@ trace.*::
                 "libbeauty", the default, to use the same argument beautifiers used in the
                 strace-like sys_enter+sys_exit lines.
  
+ftrace.*::
+       ftrace.tracer::
+               Can be used to select the default tracer. Possible values are
+               'function' and 'function_graph'.
+
  llvm.*::
         llvm.clang-path::
                 Path to clang. If omit, search it from $PATH.
@@ -610,6 +659,29 @@ scripts.*::
         The script gets the same options passed as a full perf script,
         in particular -i perfdata file, --cpu, --tid
  
+convert.*::
+
+       convert.queue-size::
+               Limit the size of ordered_events queue, so we could control
+               allocation size of perf data files without proper finished
+               round events.
+
+intel-pt.*::
+
+       intel-pt.cache-divisor::
+
+       intel-pt.mispred-all::
+               If set, Intel PT decoder will set the mispred flag on all
+               branches.
+
+auxtrace.*::
+
+       auxtrace.dumpdir::
+               s390 only. The directory to save the auxiliary trace buffer
+               can be changed using this option. Ex, auxtrace.dumpdir=/tmp.
+               If the directory does not exist or has the wrong file type,
+               the current directory is used.
+
  SEE ALSO
  --------
  linkperf:perf[1]
diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c
index 2898cfdf8fe18f3640351329a920a2f04df35aca..941f814820b8c653930e03d690ac6647227ef552 100644
--- a/tools/perf/arch/arm/util/cs-etm.c
+++ b/tools/perf/arch/arm/util/cs-etm.c
@@ -858,21 +858,6 @@ static void cs_etm_recording_free(struct auxtrace_record *itr)
         free(ptr);
  }
  
-static int cs_etm_read_finish(struct auxtrace_record *itr, int idx)
-{
-       struct cs_etm_recording *ptr =
-                       container_of(itr, struct cs_etm_recording, itr);
-       struct evsel *evsel;
-
-       evlist__for_each_entry(ptr->evlist, evsel) {
-               if (evsel->core.attr.type == ptr->cs_etm_pmu->type)
-                       return perf_evlist__enable_event_idx(ptr->evlist,
-                                                            evsel, idx);
-       }
-
-       return -EINVAL;
-}
-
  struct auxtrace_record *cs_etm_record_init(int *err)
  {
         struct perf_pmu *cs_etm_pmu;
@@ -892,6 +877,7 @@ struct auxtrace_record *cs_etm_record_init(int *err)
         }
  
         ptr->cs_etm_pmu                 = cs_etm_pmu;
+       ptr->itr.pmu                    = cs_etm_pmu;
         ptr->itr.parse_snapshot_options = cs_etm_parse_snapshot_options;
         ptr->itr.recording_options      = cs_etm_recording_options;
         ptr->itr.info_priv_size         = cs_etm_info_priv_size;
@@ -901,7 +887,7 @@ struct auxtrace_record *cs_etm_record_init(int *err)
         ptr->itr.snapshot_finish        = cs_etm_snapshot_finish;
         ptr->itr.reference              = cs_etm_reference;
         ptr->itr.free                   = cs_etm_recording_free;
-       ptr->itr.read_finish            = cs_etm_read_finish;
+       ptr->itr.read_finish            = auxtrace_record__read_finish;
  
         *err = 0;
         return &ptr->itr;
diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c
index eba6541ec0f12ca8250c3c7d6e33aab1ecf89c1b..8d6821d9c3f6cea8b26c8f5019d5d4f3793ef9b9 100644
--- a/tools/perf/arch/arm64/util/arm-spe.c
+++ b/tools/perf/arch/arm64/util/arm-spe.c
@@ -158,20 +158,6 @@ static void arm_spe_recording_free(struct auxtrace_record *itr)
         free(sper);
  }
  
-static int arm_spe_read_finish(struct auxtrace_record *itr, int idx)
-{
-       struct arm_spe_recording *sper =
-                       container_of(itr, struct arm_spe_recording, itr);
-       struct evsel *evsel;
-
-       evlist__for_each_entry(sper->evlist, evsel) {
-               if (evsel->core.attr.type == sper->arm_spe_pmu->type)
-                       return perf_evlist__enable_event_idx(sper->evlist,
-                                                            evsel, idx);
-       }
-       return -EINVAL;
-}
-
  struct auxtrace_record *arm_spe_recording_init(int *err,
                                                struct perf_pmu *arm_spe_pmu)
  {
@@ -189,12 +175,13 @@ struct auxtrace_record *arm_spe_recording_init(int *err,
         }
  
         sper->arm_spe_pmu = arm_spe_pmu;
+       sper->itr.pmu = arm_spe_pmu;
         sper->itr.recording_options = arm_spe_recording_options;
         sper->itr.info_priv_size = arm_spe_info_priv_size;
         sper->itr.info_fill = arm_spe_info_fill;
         sper->itr.free = arm_spe_recording_free;
         sper->itr.reference = arm_spe_reference;
-       sper->itr.read_finish = arm_spe_read_finish;
+       sper->itr.read_finish = auxtrace_record__read_finish;
         sper->itr.alignment = 0;
  
         *err = 0;
diff --git a/tools/perf/arch/arm64/util/header.c b/tools/perf/arch/arm64/util/header.c

index a32e4b72a98f0f615fa6cc237e681e2ee0776441..d730666ab95d21c4d2b834776d552980e9c6fa27 100644 (file)
--- a/tools/perf/arch/arm64/util/header.c
+++ b/tools/perf/arch/arm64/util/header.c
@@ -1,8 +1,10 @@
  #include <stdio.h>
  #include <stdlib.h>
  #include <perf/cpumap.h>
+#include <util/cpumap.h>
  #include <internal/cpumap.h>
  #include <api/fs/fs.h>
+#include <errno.h>
  #include "debug.h"
  #include "header.h"
  
@@ -12,26 +14,21 @@
  #define MIDR_VARIANT_SHIFT      20
  #define MIDR_VARIANT_MASK       (0xf << MIDR_VARIANT_SHIFT)
  
-char *get_cpuid_str(struct perf_pmu *pmu)
+static int _get_cpuid(char *buf, size_t sz, struct perf_cpu_map *cpus)
  {
-       char *buf = NULL;
-       char path[PATH_MAX];
         const char *sysfs = sysfs__mountpoint();
-       int cpu;
         u64 midr = 0;
-       struct perf_cpu_map *cpus;
-       FILE *file;
+       int cpu;
  
-       if (!sysfs || !pmu || !pmu->cpus)
-               return NULL;
+       if (!sysfs || sz < MIDR_SIZE)
+               return EINVAL;
  
-       buf = malloc(MIDR_SIZE);
-       if (!buf)
-               return NULL;
+       cpus = perf_cpu_map__get(cpus);
  
-       /* read midr from list of cpus mapped to this pmu */
-       cpus = perf_cpu_map__get(pmu->cpus);
         for (cpu = 0; cpu < perf_cpu_map__nr(cpus); cpu++) {
+               char path[PATH_MAX];
+               FILE *file;
+
                 scnprintf(path, PATH_MAX, "%s/devices/system/cpu/cpu%d"MIDR,
                                 sysfs, cpus->map[cpu]);
  
@@ -57,12 +54,48 @@ char *get_cpuid_str(struct perf_pmu *pmu)
                 break;
         }
  
-       if (!midr) {
+       perf_cpu_map__put(cpus);
+
+       if (!midr)
+               return EINVAL;
+
+       return 0;
+}
+
+int get_cpuid(char *buf, size_t sz)
+{
+       struct perf_cpu_map *cpus = perf_cpu_map__new(NULL);
+       int ret;
+
+       if (!cpus)
+               return EINVAL;
+
+       ret = _get_cpuid(buf, sz, cpus);
+
+       perf_cpu_map__put(cpus);
+
+       return ret;
+}
+
+char *get_cpuid_str(struct perf_pmu *pmu)
+{
+       char *buf = NULL;
+       int res;
+
+       if (!pmu || !pmu->cpus)
+               return NULL;
+
+       buf = malloc(MIDR_SIZE);
+       if (!buf)
+               return NULL;
+
+       /* read midr from list of cpus mapped to this pmu */
+       res = _get_cpuid(buf, MIDR_SIZE, pmu->cpus);
+       if (res) {
                 pr_err("failed to get cpuid string for PMU %s\n", pmu->name);
                 free(buf);
                 buf = NULL;
         }
  
-       perf_cpu_map__put(cpus);
         return buf;
  }
diff --git a/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl b/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl

index 43f736ed47f28a1dff42ffb7ae7b3e19613e0211..35b61bfc1b1ae928158dee422a150c19c8d30e4e 100644 (file)
--- a/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
+++ b/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
@@ -517,3 +517,5 @@
  433    common  fspick                          sys_fspick
  434    common  pidfd_open                      sys_pidfd_open
  435    nospu   clone3                          ppc_clone3
+437    common  openat2                         sys_openat2
+438    common  pidfd_getfd                     sys_pidfd_getfd
diff --git a/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl b/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl

index c29976eca4a8a86bdd2fc062b3e521afc0338301..44d510bc9b7877a18c082ceb168f01e94db0417b 100644 (file)
--- a/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
@@ -357,6 +357,8 @@
  433    common  fspick                  __x64_sys_fspick
  434    common  pidfd_open              __x64_sys_pidfd_open
  435    common  clone3                  __x64_sys_clone3/ptregs
+437    common  openat2                 __x64_sys_openat2
+438    common  pidfd_getfd             __x64_sys_pidfd_getfd
  
  #
  # x32-specific system call numbers start at 512 to avoid cache impact
diff --git a/tools/perf/arch/x86/util/intel-bts.c b/tools/perf/arch/x86/util/intel-bts.c

index 27d9e214d068074d37ff0e5287bfba047819ea56..26cee10521794be3a91741860c1a93f178c07ab6 100644 (file)
--- a/tools/perf/arch/x86/util/intel-bts.c
+++ b/tools/perf/arch/x86/util/intel-bts.c
@@ -413,20 +413,6 @@ out_err:
         return err;
  }
  
-static int intel_bts_read_finish(struct auxtrace_record *itr, int idx)
-{
-       struct intel_bts_recording *btsr =
-                       container_of(itr, struct intel_bts_recording, itr);
-       struct evsel *evsel;
-
-       evlist__for_each_entry(btsr->evlist, evsel) {
-               if (evsel->core.attr.type == btsr->intel_bts_pmu->type)
-                       return perf_evlist__enable_event_idx(btsr->evlist,
-                                                            evsel, idx);
-       }
-       return -EINVAL;
-}
-
  struct auxtrace_record *intel_bts_recording_init(int *err)
  {
         struct perf_pmu *intel_bts_pmu = perf_pmu__find(INTEL_BTS_PMU_NAME);
@@ -447,6 +433,7 @@ struct auxtrace_record *intel_bts_recording_init(int *err)
         }
  
         btsr->intel_bts_pmu = intel_bts_pmu;
+       btsr->itr.pmu = intel_bts_pmu;
         btsr->itr.recording_options = intel_bts_recording_options;
         btsr->itr.info_priv_size = intel_bts_info_priv_size;
         btsr->itr.info_fill = intel_bts_info_fill;
@@ -456,7 +443,7 @@ struct auxtrace_record *intel_bts_recording_init(int *err)
         btsr->itr.find_snapshot = intel_bts_find_snapshot;
         btsr->itr.parse_snapshot_options = intel_bts_parse_snapshot_options;
         btsr->itr.reference = intel_bts_reference;
-       btsr->itr.read_finish = intel_bts_read_finish;
+       btsr->itr.read_finish = auxtrace_record__read_finish;
         btsr->itr.alignment = sizeof(struct branch);
         return &btsr->itr;
  }
diff --git a/tools/perf/arch/x86/util/intel-pt.c b/tools/perf/arch/x86/util/intel-pt.c

index 20df442fdf36d930e603122bb19acbf578a93765..7eea4fd7ce58555256618410eccc462ee980dc93 100644 (file)
--- a/tools/perf/arch/x86/util/intel-pt.c
+++ b/tools/perf/arch/x86/util/intel-pt.c
@@ -1166,20 +1166,6 @@ static u64 intel_pt_reference(struct auxtrace_record *itr __maybe_unused)
         return rdtsc();
  }
  
-static int intel_pt_read_finish(struct auxtrace_record *itr, int idx)
-{
-       struct intel_pt_recording *ptr =
-                       container_of(itr, struct intel_pt_recording, itr);
-       struct evsel *evsel;
-
-       evlist__for_each_entry(ptr->evlist, evsel) {
-               if (evsel->core.attr.type == ptr->intel_pt_pmu->type)
-                       return perf_evlist__enable_event_idx(ptr->evlist, evsel,
-                                                            idx);
-       }
-       return -EINVAL;
-}
-
  struct auxtrace_record *intel_pt_recording_init(int *err)
  {
         struct perf_pmu *intel_pt_pmu = perf_pmu__find(INTEL_PT_PMU_NAME);
@@ -1200,6 +1186,7 @@ struct auxtrace_record *intel_pt_recording_init(int *err)
         }
  
         ptr->intel_pt_pmu = intel_pt_pmu;
+       ptr->itr.pmu = intel_pt_pmu;
         ptr->itr.recording_options = intel_pt_recording_options;
         ptr->itr.info_priv_size = intel_pt_info_priv_size;
         ptr->itr.info_fill = intel_pt_info_fill;
@@ -1209,7 +1196,7 @@ struct auxtrace_record *intel_pt_recording_init(int *err)
         ptr->itr.find_snapshot = intel_pt_find_snapshot;
         ptr->itr.parse_snapshot_options = intel_pt_parse_snapshot_options;
         ptr->itr.reference = intel_pt_reference;
-       ptr->itr.read_finish = intel_pt_read_finish;
+       ptr->itr.read_finish = auxtrace_record__read_finish;
         /*
          * Decoding starts at a PSB packet. Minimum PSB period is 2K so 4K
          * should give at least 1 PSB per sample.
diff --git a/tools/perf/bench/bench.h b/tools/perf/bench/bench.h

index fddb3ced9db620f8700faa82494d80c5b2ee1c24..4aa6de1aa67dc6a7f095d135e95f8b406cc7f5c1 100644 (file)
--- a/tools/perf/bench/bench.h
+++ b/tools/perf/bench/bench.h
@@ -2,6 +2,10 @@
  #ifndef BENCH_H
  #define BENCH_H
  
+#include <sys/time.h>
+
+extern struct timeval bench__start, bench__end, bench__runtime;
+
  /*
   * The madvise transparent hugepage constants were added in glibc
   * 2.13. For compatibility with older versions of glibc, define these
diff --git a/tools/perf/bench/epoll-ctl.c b/tools/perf/bench/epoll-ctl.c

index bb617e56884129ce83bf4f7db4f79fb0c7f1a95a..a7526c05df3827c1ddd58249966f1b7be238ec5f 100644 (file)
--- a/tools/perf/bench/epoll-ctl.c
+++ b/tools/perf/bench/epoll-ctl.c
@@ -35,7 +35,6 @@
  
  static unsigned int nthreads = 0;
  static unsigned int nsecs    = 8;
-struct timeval start, end, runtime;
  static bool done, __verbose, randomize;
  
  /*
@@ -94,8 +93,8 @@ static void toggle_done(int sig __maybe_unused,
  {
         /* inform all threads that we're done for the day */
         done = true;
-       gettimeofday(&end, NULL);
-       timersub(&end, &start, &runtime);
+       gettimeofday(&bench__end, NULL);
+       timersub(&bench__end, &bench__start, &bench__runtime);
  }
  
  static void nest_epollfd(void)
@@ -361,7 +360,7 @@ int bench_epoll_ctl(int argc, const char **argv)
  
         threads_starting = nthreads;
  
-       gettimeofday(&start, NULL);
+       gettimeofday(&bench__start, NULL);
  
         do_threads(worker, cpu);
  
diff --git a/tools/perf/bench/epoll-wait.c b/tools/perf/bench/epoll-wait.c

index 7af694437f4ead2adce1cb9079291bd2d134dfcc..d1c5cb526b9ff0ea40a542b918d68194e54e35d7 100644 (file)
--- a/tools/perf/bench/epoll-wait.c
+++ b/tools/perf/bench/epoll-wait.c
@@ -90,7 +90,6 @@
  
  static unsigned int nthreads = 0;
  static unsigned int nsecs    = 8;
-struct timeval start, end, runtime;
  static bool wdone, done, __verbose, randomize, nonblocking;
  
  /*
@@ -276,8 +275,8 @@ static void toggle_done(int sig __maybe_unused,
  {
         /* inform all threads that we're done for the day */
         done = true;
-       gettimeofday(&end, NULL);
-       timersub(&end, &start, &runtime);
+       gettimeofday(&bench__end, NULL);
+       timersub(&bench__end, &bench__start, &bench__runtime);
  }
  
  static void print_summary(void)
@@ -287,7 +286,7 @@ static void print_summary(void)
  
         printf("\nAveraged %ld operations/sec (+- %.2f%%), total secs = %d\n",
                avg, rel_stddev_stats(stddev, avg),
-              (int) runtime.tv_sec);
+              (int)bench__runtime.tv_sec);
  }
  
  static int do_threads(struct worker *worker, struct perf_cpu_map *cpu)
@@ -479,7 +478,7 @@ int bench_epoll_wait(int argc, const char **argv)
  
         threads_starting = nthreads;
  
-       gettimeofday(&start, NULL);
+       gettimeofday(&bench__start, NULL);
  
         do_threads(worker, cpu);
  
@@ -519,7 +518,7 @@ int bench_epoll_wait(int argc, const char **argv)
                 qsort(worker, nthreads, sizeof(struct worker), cmpworker);
  
         for (i = 0; i < nthreads; i++) {
-               unsigned long t = worker[i].ops/runtime.tv_sec;
+               unsigned long t = worker[i].ops / bench__runtime.tv_sec;
  
                 update_stats(&throughput_stats, t);
  
diff --git a/tools/perf/bench/futex-hash.c b/tools/perf/bench/futex-hash.c

index 8ba0c3330a9a2af7a3483d2ad13e85d71350e046..21776862e940feafc0bc826ba1979d3ffc25e20c 100644 (file)
--- a/tools/perf/bench/futex-hash.c
+++ b/tools/perf/bench/futex-hash.c
@@ -37,7 +37,7 @@ static unsigned int nfutexes = 1024;
  static bool fshared = false, done = false, silent = false;
  static int futex_flag = 0;
  
-struct timeval start, end, runtime;
+struct timeval bench__start, bench__end, bench__runtime;
  static pthread_mutex_t thread_lock;
  static unsigned int threads_starting;
  static struct stats throughput_stats;
@@ -103,8 +103,8 @@ static void toggle_done(int sig __maybe_unused,
  {
         /* inform all threads that we're done for the day */
         done = true;
-       gettimeofday(&end, NULL);
-       timersub(&end, &start, &runtime);
+       gettimeofday(&bench__end, NULL);
+       timersub(&bench__end, &bench__start, &bench__runtime);
  }
  
  static void print_summary(void)
@@ -114,7 +114,7 @@ static void print_summary(void)
  
         printf("%sAveraged %ld operations/sec (+- %.2f%%), total secs = %d\n",
                !silent ? "\n" : "", avg, rel_stddev_stats(stddev, avg),
-              (int) runtime.tv_sec);
+              (int)bench__runtime.tv_sec);
  }
  
  int bench_futex_hash(int argc, const char **argv)
@@ -161,7 +161,7 @@ int bench_futex_hash(int argc, const char **argv)
  
         threads_starting = nthreads;
         pthread_attr_init(&thread_attr);
-       gettimeofday(&start, NULL);
+       gettimeofday(&bench__start, NULL);
         for (i = 0; i < nthreads; i++) {
                 worker[i].tid = i;
                 worker[i].futex = calloc(nfutexes, sizeof(*worker[i].futex));
@@ -204,7 +204,7 @@ int bench_futex_hash(int argc, const char **argv)
         pthread_mutex_destroy(&thread_lock);
  
         for (i = 0; i < nthreads; i++) {
-               unsigned long t = worker[i].ops/runtime.tv_sec;
+               unsigned long t = worker[i].ops / bench__runtime.tv_sec;
                 update_stats(&throughput_stats, t);
                 if (!silent) {
                         if (nfutexes == 1)
diff --git a/tools/perf/bench/futex-lock-pi.c b/tools/perf/bench/futex-lock-pi.c

index d0cae8125423f69a76f2435c6c8ee01e926f69ea..30d97121dc4fb9352ef114cb7e5a0080a1721b10 100644 (file)
--- a/tools/perf/bench/futex-lock-pi.c
+++ b/tools/perf/bench/futex-lock-pi.c
@@ -37,7 +37,6 @@ static bool silent = false, multi = false;
  static bool done = false, fshared = false;
  static unsigned int nthreads = 0;
  static int futex_flag = 0;
-struct timeval start, end, runtime;
  static pthread_mutex_t thread_lock;
  static unsigned int threads_starting;
  static struct stats throughput_stats;
@@ -64,7 +63,7 @@ static void print_summary(void)
  
         printf("%sAveraged %ld operations/sec (+- %.2f%%), total secs = %d\n",
                !silent ? "\n" : "", avg, rel_stddev_stats(stddev, avg),
-              (int) runtime.tv_sec);
+              (int)bench__runtime.tv_sec);
  }
  
  static void toggle_done(int sig __maybe_unused,
@@ -73,8 +72,8 @@ static void toggle_done(int sig __maybe_unused,
  {
         /* inform all threads that we're done for the day */
         done = true;
-       gettimeofday(&end, NULL);
-       timersub(&end, &start, &runtime);
+       gettimeofday(&bench__end, NULL);
+       timersub(&bench__end, &bench__start, &bench__runtime);
  }
  
  static void *workerfn(void *arg)
@@ -185,7 +184,7 @@ int bench_futex_lock_pi(int argc, const char **argv)
  
         threads_starting = nthreads;
         pthread_attr_init(&thread_attr);
-       gettimeofday(&start, NULL);
+       gettimeofday(&bench__start, NULL);
  
         create_threads(worker, thread_attr, cpu);
         pthread_attr_destroy(&thread_attr);
@@ -211,7 +210,7 @@ int bench_futex_lock_pi(int argc, const char **argv)
         pthread_mutex_destroy(&thread_lock);
  
         for (i = 0; i < nthreads; i++) {
-               unsigned long t = worker[i].ops/runtime.tv_sec;
+               unsigned long t = worker[i].ops / bench__runtime.tv_sec;
  
                 update_stats(&throughput_stats, t);
                 if (!silent)
diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c

index ff61795a4d13783011cd25682e4894b61da21643..6c0a0412502ebb5e731156f47fcf87ccb2b9b55b 100644 (file)
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -566,6 +566,8 @@ int cmd_annotate(int argc, const char **argv)
         if (ret < 0)
                 return ret;
  
+       annotation_config__init(&annotate.opts);
+
         argc = parse_options(argc, argv, options, annotate_usage, 0);
         if (argc) {
                 /*
@@ -605,8 +607,6 @@ int cmd_annotate(int argc, const char **argv)
         if (ret < 0)
                 goto out_delete;
  
-       annotation_config__init();
-
         symbol_conf.try_vmlinux_path = true;
  
         ret = symbol__init(&annotate.session->header.env);
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c

index f8b6ae557d8bd7b7750a8fccf53ae338059ef91e..c03c36fde7e2f3a0d31146c948a38268560227f9 100644 (file)
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -1312,7 +1312,8 @@ static int cycles_printf(struct hist_entry *he, struct hist_entry *pair,
         end_line = map__srcline(he->ms.map, bi->sym->start + bi->end,
                                 he->ms.sym);
  
-       if ((start_line != SRCLINE_UNKNOWN) && (end_line != SRCLINE_UNKNOWN)) {
+       if ((strncmp(start_line, SRCLINE_UNKNOWN, strlen(SRCLINE_UNKNOWN)) != 0) &&
+           (strncmp(end_line, SRCLINE_UNKNOWN, strlen(SRCLINE_UNKNOWN)) != 0)) {
                 scnprintf(buf, sizeof(buf), "[%s -> %s] %4ld",
                           start_line, end_line, block_he->diff.cycles);
         } else {
diff --git a/tools/perf/builtin-probe.c b/tools/perf/builtin-probe.c

index 26bc5923e6b56c0a3959212de58de8b36a8832e8..70548df2abb92f797721684e9a687a5b0fdacbf3 100644 (file)
--- a/tools/perf/builtin-probe.c
+++ b/tools/perf/builtin-probe.c
@@ -449,7 +449,8 @@ static int perf_del_probe_events(struct strfilter *filter)
                 ret = probe_file__del_strlist(kfd, klist);
                 if (ret < 0)
                         goto error;
-       }
+       } else if (ret == -ENOMEM)
+               goto error;
  
         ret2 = probe_file__get_events(ufd, filter, ulist);
         if (ret2 == 0) {
@@ -459,7 +460,8 @@ static int perf_del_probe_events(struct strfilter *filter)
                 ret2 = probe_file__del_strlist(ufd, ulist);
                 if (ret2 < 0)
                         goto error;
-       }
+       } else if (ret2 == -ENOMEM)
+               goto error;
  
         if (ret == -ENOENT && ret2 == -ENOENT)
                 pr_warning("\"%s\" does not hit any event.\n", str);
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c

index 9483b3f0cae3f50004d0a6ea9e4ede717d185627..72a12b69f120b959b0d20e6441eb7197e8852f32 100644 (file)
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -1507,7 +1507,7 @@ repeat:
                         symbol_conf.priv_size += sizeof(u32);
                         symbol_conf.sort_by_name = true;
                 }
-               annotation_config__init();
+               annotation_config__init(&report.annotation_opts);
         }
  
         if (symbol__init(&session->header.env) < 0)
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c

index 8affcab756043dc4c31c43ccc43443446afaf37a..f6dd1a63f159e970041d8614b0d0223cdf4a8048 100644 (file)
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -143,7 +143,7 @@ static int perf_top__parse_source(struct perf_top *top, struct hist_entry *he)
                 return err;
         }
  
-       err = symbol__annotate(&he->ms, evsel, 0, &top->annotation_opts, NULL);
+       err = symbol__annotate(&he->ms, evsel, &top->annotation_opts, NULL);
         if (err == 0) {
                 top->sym_filter_entry = he;
         } else {
@@ -1683,7 +1683,7 @@ int cmd_top(int argc, const char **argv)
         if (status < 0)
                 goto out_delete_evlist;
  
-       annotation_config__init();
+       annotation_config__init(&top.annotation_opts);
  
         symbol_conf.try_vmlinux_path = (symbol_conf.vmlinux_name == NULL);
         status = symbol__init(NULL);
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c

index 46a72ecac427f33a16d0d7d0ff7d713a306b8464..01d542007c8b1210b9fbf348bfb8ccafb849c645 100644 (file)
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -1065,7 +1065,9 @@ static struct syscall_fmt syscall_fmts[] = {
         { .name     = "poll", .timeout = true, },
         { .name     = "ppoll", .timeout = true, },
         { .name     = "prctl",
-         .arg = { [0] = { .scnprintf = SCA_PRCTL_OPTION, /* option */ },
+         .arg = { [0] = { .scnprintf = SCA_PRCTL_OPTION, /* option */
+                          .strtoul   = STUL_STRARRAY,
+                          .parm      = &strarray__prctl_options, },
                    [1] = { .scnprintf = SCA_PRCTL_ARG2, /* arg2 */ },
                    [2] = { .scnprintf = SCA_PRCTL_ARG3, /* arg3 */ }, }, },
         { .name     = "pread", .alias = "pread64", },
diff --git a/tools/perf/check-headers.sh b/tools/perf/check-headers.sh

index 68039a96c1dcaa7a2b9c5f54882ce39b98c71b0e..bfb21d049e6ce1569c301ac8813c7b3488cd2244 100755 (executable)
--- a/tools/perf/check-headers.sh
+++ b/tools/perf/check-headers.sh
@@ -13,6 +13,7 @@ include/uapi/linux/kcmp.h
  include/uapi/linux/kvm.h
  include/uapi/linux/in.h
  include/uapi/linux/mount.h
+include/uapi/linux/openat2.h
  include/uapi/linux/perf_event.h
  include/uapi/linux/prctl.h
  include/uapi/linux/sched.h
diff --git a/tools/perf/include/bpf/pid_filter.h b/tools/perf/include/bpf/pid_filter.h

index 607189a315b2cbd32f36878d0fcdfb97b1001685..6e61c4bdf54826ce6538c98984f2751d6ddff755 100644 (file)
--- a/tools/perf/include/bpf/pid_filter.h
+++ b/tools/perf/include/bpf/pid_filter.h
@@ -3,7 +3,7 @@
  #ifndef _PERF_BPF_PID_FILTER_
  #define _PERF_BPF_PID_FILTER_
  
-#include <bpf/bpf.h>
+#include <bpf.h>
  
  #define pid_filter(name) pid_map(name, bool)
  
diff --git a/tools/perf/include/bpf/stdio.h b/tools/perf/include/bpf/stdio.h

index 7ca6fa5463eea9faebb2dda6e561e98654da9692..316af5b2ff3516b3aba5365423c69e24f3f5dee5 100644 (file)
--- a/tools/perf/include/bpf/stdio.h
+++ b/tools/perf/include/bpf/stdio.h
@@ -1,6 +1,6 @@
  // SPDX-License-Identifier: GPL-2.0
  
-#include <bpf/bpf.h>
+#include <bpf.h>
  
  struct bpf_map SEC("maps") __bpf_stdout__ = {
         .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
diff --git a/tools/perf/include/bpf/unistd.h b/tools/perf/include/bpf/unistd.h

index d1a35b6c649dc7b860ccb661f3e71ba0b595ce69..ca7877f9a976fbcaf55beb5196eedcc8eed9351b 100644 (file)
--- a/tools/perf/include/bpf/unistd.h
+++ b/tools/perf/include/bpf/unistd.h
@@ -1,6 +1,6 @@
  // SPDX-License-Identifier: LGPL-2.1
  
-#include <bpf/bpf.h>
+#include <bpf.h>
  
  static int (*bpf_get_current_pid_tgid)(void) = (void *)BPF_FUNC_get_current_pid_tgid;
  
diff --git a/tools/perf/tests/bp_account.c b/tools/perf/tests/bp_account.c

index d0b935356274b2297970f28ca319dd9205602284..489b50604cf274046b879c54eb0e2a9037236ff8 100644 (file)
--- a/tools/perf/tests/bp_account.c
+++ b/tools/perf/tests/bp_account.c
@@ -19,7 +19,7 @@
  #include "../perf-sys.h"
  #include "cloexec.h"
  
-volatile long the_var;
+static volatile long the_var;
  
  static noinline int test_function(void)
  {
diff --git a/tools/perf/tests/shell/lib/probe_vfs_getname.sh b/tools/perf/tests/shell/lib/probe_vfs_getname.sh

index 7cb99b433888b80d1b56b6331748b3db8414c0ca..c2cc42daf924235762a7528b89c858c7ebb9d062 100644 (file)
--- a/tools/perf/tests/shell/lib/probe_vfs_getname.sh
+++ b/tools/perf/tests/shell/lib/probe_vfs_getname.sh
@@ -14,7 +14,7 @@ add_probe_vfs_getname() {
         if [ $had_vfs_getname -eq 1 ] ; then
                 line=$(perf probe -L getname_flags 2>&1 | egrep 'result.*=.*filename;' | sed -r 's/[[:space:]]+([[:digit:]]+)[[:space:]]+result->uptr.*/\1/')
                 perf probe -q       "vfs_getname=getname_flags:${line} pathname=result->name:string" || \
-               perf probe $verbose "vfs_getname=getname_flags:${line} pathname=filename:string"
+               perf probe $verbose "vfs_getname=getname_flags:${line} pathname=filename:ustring"
         fi
  }
  
diff --git a/tools/perf/trace/beauty/beauty.h b/tools/perf/trace/beauty/beauty.h

index 5a61043c2ff732e483ae57ad5ca40c5a86811b40..d6dfe68a7612552ab0eebb2022064c660737566a 100644 (file)
--- a/tools/perf/trace/beauty/beauty.h
+++ b/tools/perf/trace/beauty/beauty.h
@@ -213,6 +213,8 @@ size_t syscall_arg__scnprintf_x86_arch_prctl_code(char *bf, size_t size, struct
  size_t syscall_arg__scnprintf_prctl_option(char *bf, size_t size, struct syscall_arg *arg);
  #define SCA_PRCTL_OPTION syscall_arg__scnprintf_prctl_option
  
+extern struct strarray strarray__prctl_options;
+
  size_t syscall_arg__scnprintf_prctl_arg2(char *bf, size_t size, struct syscall_arg *arg);
  #define SCA_PRCTL_ARG2 syscall_arg__scnprintf_prctl_arg2
  
diff --git a/tools/perf/trace/beauty/prctl.c b/tools/perf/trace/beauty/prctl.c

index ba2179abed00982e0f3fce9a7c66bc9780a083b3..6fe5ad5f5d3a4e4b8c043518215075874fe09892 100644 (file)
--- a/tools/perf/trace/beauty/prctl.c
+++ b/tools/perf/trace/beauty/prctl.c
@@ -11,9 +11,10 @@
  
  #include "trace/beauty/generated/prctl_option_array.c"
  
+DEFINE_STRARRAY(prctl_options, "PR_");
+
  static size_t prctl__scnprintf_option(int option, char *bf, size_t size, bool show_prefix)
  {
-       static DEFINE_STRARRAY(prctl_options, "PR_");
         return strarray__scnprintf(&strarray__prctl_options, bf, size, "%d", show_prefix, option);
  }
  
diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c

index badbddbb30f813b997bc43d8ce5744a55674dce9..9023267e564335ce9005288e35e60a9a86703b71 100644 (file)
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -754,10 +754,9 @@ static int annotate_browser__run(struct annotate_browser *browser,
                 "?             Search string backwards\n");
                         continue;
                 case 'r':
-                       {
-                               script_browse(NULL, NULL);
-                               continue;
-                       }
+                       script_browse(NULL, NULL);
+                       annotate_browser__show(&browser->b, title, help);
+                       continue;
                 case 'k':
                         notes->options->show_linenr = !notes->options->show_linenr;
                         break;
@@ -834,13 +833,13 @@ show_sup_ins:
                         map_symbol__annotation_dump(ms, evsel, browser->opts);
                         continue;
                 case 't':
-                       if (notes->options->show_total_period) {
-                               notes->options->show_total_period = false;
-                               notes->options->show_nr_samples = true;
-                       } else if (notes->options->show_nr_samples)
-                               notes->options->show_nr_samples = false;
+                       if (symbol_conf.show_total_period) {
+                               symbol_conf.show_total_period = false;
+                               symbol_conf.show_nr_samples = true;
+                       } else if (symbol_conf.show_nr_samples)
+                               symbol_conf.show_nr_samples = false;
                         else
-                               notes->options->show_total_period = true;
+                               symbol_conf.show_total_period = true;
                         annotation__update_column_widths(notes);
                         continue;
                 case 'c':
diff --git a/tools/perf/ui/gtk/annotate.c b/tools/perf/ui/gtk/annotate.c

index 22cc240f73713852342d75ea96249eeea2a339db..35f9641bf670cb5bc9103efffa8f060e9d8d4627 100644 (file)
--- a/tools/perf/ui/gtk/annotate.c
+++ b/tools/perf/ui/gtk/annotate.c
@@ -174,7 +174,7 @@ static int symbol__gtk_annotate(struct map_symbol *ms, struct evsel *evsel,
         if (ms->map->dso->annotate_warned)
                 return -1;
  
-       err = symbol__annotate(ms, evsel, 0, &annotation__default_options, NULL);
+       err = symbol__annotate(ms, evsel, &annotation__default_options, NULL);
         if (err) {
                 char msg[BUFSIZ];
                 symbol__strerror_disassemble(ms, err, msg, sizeof(msg));
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c

index ca73fb74ad03273464abe6bb86455140495542ca..0ea95be84b3bd32d488491dd5f2272623d928085 100644 (file)
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1143,93 +1143,70 @@ out:
  }
  
  struct annotate_args {
-       size_t                   privsize;
-       struct arch             *arch;
-       struct map_symbol        ms;
-       struct evsel    *evsel;
+       struct arch               *arch;
+       struct map_symbol         ms;
+       struct evsel              *evsel;
         struct annotation_options *options;
-       s64                      offset;
-       char                    *line;
-       int                      line_nr;
+       s64                       offset;
+       char                      *line;
+       int                       line_nr;
  };
  
-static void annotation_line__delete(struct annotation_line *al)
+static void annotation_line__init(struct annotation_line *al,
+                                 struct annotate_args *args,
+                                 int nr)
  {
-       void *ptr = (void *) al - al->privsize;
+       al->offset = args->offset;
+       al->line = strdup(args->line);
+       al->line_nr = args->line_nr;
+       al->data_nr = nr;
+}
  
+static void annotation_line__exit(struct annotation_line *al)
+{
         free_srcline(al->path);
         zfree(&al->line);
-       free(ptr);
  }
  
-/*
- * Allocating the annotation line data with following
- * structure:
- *
- *    --------------------------------------
- *    private space | struct annotation_line
- *    --------------------------------------
- *
- * Size of the private space is stored in 'struct annotation_line'.
- *
- */
-static struct annotation_line *
-annotation_line__new(struct annotate_args *args, size_t privsize)
+static size_t disasm_line_size(int nr)
  {
         struct annotation_line *al;
-       struct evsel *evsel = args->evsel;
-       size_t size = privsize + sizeof(*al);
-       int nr = 1;
-
-       if (perf_evsel__is_group_event(evsel))
-               nr = evsel->core.nr_members;
  
-       size += sizeof(al->data[0]) * nr;
-
-       al = zalloc(size);
-       if (al) {
-               al = (void *) al + privsize;
-               al->privsize   = privsize;
-               al->offset     = args->offset;
-               al->line       = strdup(args->line);
-               al->line_nr    = args->line_nr;
-               al->data_nr    = nr;
-       }
-
-       return al;
+       return (sizeof(struct disasm_line) + (sizeof(al->data[0]) * nr));
  }
  
  /*
   * Allocating the disasm annotation line data with
   * following structure:
   *
- *    ------------------------------------------------------------
- *    privsize space | struct disasm_line | struct annotation_line
- *    ------------------------------------------------------------
+ *    -------------------------------------------
+ *    struct disasm_line | struct annotation_line
+ *    -------------------------------------------
   *
   * We have 'struct annotation_line' member as last member
   * of 'struct disasm_line' to have an easy access.
- *
   */
  static struct disasm_line *disasm_line__new(struct annotate_args *args)
  {
         struct disasm_line *dl = NULL;
-       struct annotation_line *al;
-       size_t privsize = args->privsize + offsetof(struct disasm_line, al);
+       int nr = 1;
  
-       al = annotation_line__new(args, privsize);
-       if (al != NULL) {
-               dl = disasm_line(al);
+       if (perf_evsel__is_group_event(args->evsel))
+               nr = args->evsel->core.nr_members;
  
-               if (dl->al.line == NULL)
-                       goto out_delete;
+       dl = zalloc(disasm_line_size(nr));
+       if (!dl)
+               return NULL;
  
-               if (args->offset != -1) {
-                       if (disasm_line__parse(dl->al.line, &dl->ins.name, &dl->ops.raw) < 0)
-                               goto out_free_line;
+       annotation_line__init(&dl->al, args, nr);
+       if (dl->al.line == NULL)
+               goto out_delete;
  
-                       disasm_line__init_ins(dl, args->arch, &args->ms);
-               }
+       if (args->offset != -1) {
+               if (disasm_line__parse(dl->al.line, &dl->ins.name, &dl->ops.raw) < 0)
+                       goto out_free_line;
+
+               disasm_line__init_ins(dl, args->arch, &args->ms);
         }
  
         return dl;
@@ -1248,7 +1225,8 @@ void disasm_line__free(struct disasm_line *dl)
         else
                 ins__delete(&dl->ops);
         zfree(&dl->ins.name);
-       annotation_line__delete(&dl->al);
+       annotation_line__exit(&dl->al);
+       free(dl);
  }
  
  int disasm_line__scnprintf(struct disasm_line *dl, char *bf, size_t size, bool raw, int max_ins_name)
@@ -2149,13 +2127,12 @@ void symbol__calc_percent(struct symbol *sym, struct evsel *evsel)
         annotation__calc_percent(notes, evsel, symbol__size(sym));
  }
  
-int symbol__annotate(struct map_symbol *ms, struct evsel *evsel, size_t privsize,
+int symbol__annotate(struct map_symbol *ms, struct evsel *evsel,
                      struct annotation_options *options, struct arch **parch)
  {
         struct symbol *sym = ms->sym;
         struct annotation *notes = symbol__annotation(sym);
         struct annotate_args args = {
-               .privsize       = privsize,
                 .evsel          = evsel,
                 .options        = options,
         };
@@ -2644,6 +2621,8 @@ void annotation__set_offsets(struct annotation *notes, s64 size)
         struct annotation_line *al;
  
         notes->max_line_len = 0;
+       notes->nr_entries = 0;
+       notes->nr_asm_entries = 0;
  
         list_for_each_entry(al, &notes->src->source, node) {
                 size_t line_len = strlen(al->line);
@@ -2790,7 +2769,7 @@ int symbol__tty_annotate(struct map_symbol *ms, struct evsel *evsel,
         struct symbol *sym = ms->sym;
         struct rb_root source_line = RB_ROOT;
  
-       if (symbol__annotate(ms, evsel, 0, opts, NULL) < 0)
+       if (symbol__annotate(ms, evsel, opts, NULL) < 0)
                 return -1;
  
         symbol__calc_percent(sym, evsel);
@@ -2915,9 +2894,9 @@ static void __annotation_line__write(struct annotation_line *al, struct annotati
                         percent = annotation_data__percent(&al->data[i], percent_type);
  
                         obj__set_percent_color(obj, percent, current_entry);
-                       if (notes->options->show_total_period) {
+                       if (symbol_conf.show_total_period) {
                                 obj__printf(obj, "%11" PRIu64 " ", al->data[i].he.period);
-                       } else if (notes->options->show_nr_samples) {
+                       } else if (symbol_conf.show_nr_samples) {
                                 obj__printf(obj, "%6" PRIu64 " ",
                                                    al->data[i].he.nr_samples);
                         } else {
@@ -2931,8 +2910,8 @@ static void __annotation_line__write(struct annotation_line *al, struct annotati
                         obj__printf(obj, "%-*s", pcnt_width, " ");
                 else {
                         obj__printf(obj, "%-*s", pcnt_width,
-                                          notes->options->show_total_period ? "Period" :
-                                          notes->options->show_nr_samples ? "Samples" : "Percent");
+                                          symbol_conf.show_total_period ? "Period" :
+                                          symbol_conf.show_nr_samples ? "Samples" : "Percent");
                 }
         }
  
@@ -3070,7 +3049,7 @@ int symbol__annotate2(struct map_symbol *ms, struct evsel *evsel,
         if (perf_evsel__is_group_event(evsel))
                 nr_pcnt = evsel->core.nr_members;
  
-       err = symbol__annotate(ms, evsel, 0, options, parch);
+       err = symbol__annotate(ms, evsel, options, parch);
         if (err)
                 goto out_free_offsets;
  
@@ -3094,69 +3073,46 @@ out_free_offsets:
         return err;
  }
  
-#define ANNOTATION__CFG(n) \
-       { .name = #n, .value = &annotation__default_options.n, }
-
-/*
- * Keep the entries sorted, they are bsearch'ed
- */
-static struct annotation_config {
-       const char *name;
-       void *value;
-} annotation__configs[] = {
-       ANNOTATION__CFG(hide_src_code),
-       ANNOTATION__CFG(jump_arrows),
-       ANNOTATION__CFG(offset_level),
-       ANNOTATION__CFG(show_linenr),
-       ANNOTATION__CFG(show_nr_jumps),
-       ANNOTATION__CFG(show_nr_samples),
-       ANNOTATION__CFG(show_total_period),
-       ANNOTATION__CFG(use_offset),
-};
-
-#undef ANNOTATION__CFG
-
-static int annotation_config__cmp(const void *name, const void *cfgp)
-{
-       const struct annotation_config *cfg = cfgp;
-
-       return strcmp(name, cfg->name);
-}
-
-static int annotation__config(const char *var, const char *value,
-                           void *data __maybe_unused)
+static int annotation__config(const char *var, const char *value, void *data)
  {
-       struct annotation_config *cfg;
-       const char *name;
+       struct annotation_options *opt = data;
  
         if (!strstarts(var, "annotate."))
                 return 0;
  
-       name = var + 9;
-       cfg = bsearch(name, annotation__configs, ARRAY_SIZE(annotation__configs),
-                     sizeof(struct annotation_config), annotation_config__cmp);
-
-       if (cfg == NULL)
-               pr_debug("%s variable unknown, ignoring...", var);
-       else if (strcmp(var, "annotate.offset_level") == 0) {
-               perf_config_int(cfg->value, name, value);
-
-               if (*(int *)cfg->value > ANNOTATION__MAX_OFFSET_LEVEL)
-                       *(int *)cfg->value = ANNOTATION__MAX_OFFSET_LEVEL;
-               else if (*(int *)cfg->value < ANNOTATION__MIN_OFFSET_LEVEL)
-                       *(int *)cfg->value = ANNOTATION__MIN_OFFSET_LEVEL;
+       if (!strcmp(var, "annotate.offset_level")) {
+               perf_config_u8(&opt->offset_level, "offset_level", value);
+
+               if (opt->offset_level > ANNOTATION__MAX_OFFSET_LEVEL)
+                       opt->offset_level = ANNOTATION__MAX_OFFSET_LEVEL;
+               else if (opt->offset_level < ANNOTATION__MIN_OFFSET_LEVEL)
+                       opt->offset_level = ANNOTATION__MIN_OFFSET_LEVEL;
+       } else if (!strcmp(var, "annotate.hide_src_code")) {
+               opt->hide_src_code = perf_config_bool("hide_src_code", value);
+       } else if (!strcmp(var, "annotate.jump_arrows")) {
+               opt->jump_arrows = perf_config_bool("jump_arrows", value);
+       } else if (!strcmp(var, "annotate.show_linenr")) {
+               opt->show_linenr = perf_config_bool("show_linenr", value);
+       } else if (!strcmp(var, "annotate.show_nr_jumps")) {
+               opt->show_nr_jumps = perf_config_bool("show_nr_jumps", value);
+       } else if (!strcmp(var, "annotate.show_nr_samples")) {
+               symbol_conf.show_nr_samples = perf_config_bool("show_nr_samples",
+                                                               value);
+       } else if (!strcmp(var, "annotate.show_total_period")) {
+               symbol_conf.show_total_period = perf_config_bool("show_total_period",
+                                                               value);
+       } else if (!strcmp(var, "annotate.use_offset")) {
+               opt->use_offset = perf_config_bool("use_offset", value);
         } else {
-               *(bool *)cfg->value = perf_config_bool(name, value);
+               pr_debug("%s variable unknown, ignoring...", var);
         }
+
         return 0;
  }
  
-void annotation_config__init(void)
+void annotation_config__init(struct annotation_options *opt)
  {
-       perf_config(annotation__config, NULL);
-
-       annotation__default_options.show_total_period = symbol_conf.show_total_period;
-       annotation__default_options.show_nr_samples   = symbol_conf.show_nr_samples;
+       perf_config(annotation__config, opt);
  }
  
  static unsigned int parse_percent_type(char *str1, char *str2)
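
Note on the disasm_line__new() hunk above: the allocation now sizes one object as the fixed struct plus `nr` trailing per-event slots, and disasm_line__free() releases it with a single free(). Below is a minimal sketch of that sizing pattern with a flexible array member. The names `line_sketch`, `sample_sketch` and the helpers are hypothetical stand-ins, not the perf structures, and plain calloc() stands in for zalloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-event sample slot; not the perf type. */
struct sample_sketch {
	double percent;
};

/* Hypothetical line record ending in a flexible array member. */
struct line_sketch {
	char *text;
	int data_nr;
	struct sample_sketch data[];	/* must stay the last member */
};

/* Bytes needed for the fixed part plus nr trailing slots. */
static size_t line_sketch_size(int nr)
{
	return sizeof(struct line_sketch) + sizeof(struct sample_sketch) * (size_t)nr;
}

/* Allocate a zeroed record with room for nr slots and a private copy of text. */
static struct line_sketch *line_sketch_new(const char *text, int nr)
{
	struct line_sketch *l;
	size_t len;

	assert(text != NULL && nr >= 1);

	l = calloc(1, line_sketch_size(nr));
	if (!l)
		return NULL;

	len = strlen(text) + 1;
	l->text = malloc(len);
	if (!l->text) {
		free(l);
		return NULL;
	}
	memcpy(l->text, text, len);
	l->data_nr = nr;
	return l;
}

/* Release the owned string first, then the single allocation. */
static void line_sketch_free(struct line_sketch *l)
{
	if (!l)
		return;
	free(l->text);
	free(l);
}
```

Because the header and the slots live in one allocation, there is no separate private-size prefix to track and no pointer arithmetic is needed to get back to the start of the block.
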
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 455403e8feded864661094b4c5fbb26fe8626492..001258601a371babdb13a1c47d1702a8e9e3e992 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -83,8 +83,6 @@ struct annotation_options {
              full_path,
              show_linenr,
              show_nr_jumps,
-            show_nr_samples,
-            show_total_period,
              show_minmax_cycle,
              show_asm_raw,
              annotate_src;
@@ -141,7 +139,6 @@ struct annotation_line {
         u64                      cycles;
         u64                      cycles_max;
         u64                      cycles_min;
-       size_t                   privsize;
         char                    *path;
         u32                      idx;
         int                      idx_asm;
@@ -309,7 +306,7 @@ static inline int annotation__cycles_width(struct annotation *notes)
  
  static inline int annotation__pcnt_width(struct annotation *notes)
  {
-       return (notes->options->show_total_period ? 12 : 7) * notes->nr_events;
+       return (symbol_conf.show_total_period ? 12 : 7) * notes->nr_events;
  }
  
  static inline bool annotation_line__filter(struct annotation_line *al, struct annotation *notes)
@@ -352,7 +349,7 @@ struct annotated_source *symbol__hists(struct symbol *sym, int nr_hists);
  void symbol__annotate_zero_histograms(struct symbol *sym);
  
  int symbol__annotate(struct map_symbol *ms,
-                    struct evsel *evsel, size_t privsize,
+                    struct evsel *evsel,
                      struct annotation_options *options,
                      struct arch **parch);
  int symbol__annotate2(struct map_symbol *ms,
@@ -413,7 +410,7 @@ static inline int symbol__tui_annotate(struct map_symbol *ms __maybe_unused,
  }
  #endif
  
-void annotation_config__init(void);
+void annotation_config__init(struct annotation_options *opt);
  
  int annotate_parse_percent_type(const struct option *opt, const char *_str,
                                 int unset);
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index eb087e7df6f4bc1c98398a9e98b12dfd64761f0e..3571ce72ca28e7e6ee68adcab72ef3398d88350d 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -629,8 +629,10 @@ int auxtrace_record__options(struct auxtrace_record *itr,
                              struct evlist *evlist,
                              struct record_opts *opts)
  {
-       if (itr)
+       if (itr) {
+               itr->evlist = evlist;
                 return itr->recording_options(itr, evlist, opts);
+       }
         return 0;
  }
  
@@ -664,6 +666,24 @@ int auxtrace_parse_snapshot_options(struct auxtrace_record *itr,
         return -EINVAL;
  }
  
+int auxtrace_record__read_finish(struct auxtrace_record *itr, int idx)
+{
+       struct evsel *evsel;
+
+       if (!itr->evlist || !itr->pmu)
+               return -EINVAL;
+
+       evlist__for_each_entry(itr->evlist, evsel) {
+               if (evsel->core.attr.type == itr->pmu->type) {
+                       if (evsel->disabled)
+                               return 0;
+                       return perf_evlist__enable_event_idx(itr->evlist, evsel,
+                                                            idx);
+               }
+       }
+       return -EINVAL;
+}
+
  /*
   * Event record size is 16-bit which results in a maximum size of about 64KiB.
   * Allow about 4KiB for the rest of the sample record, to give a maximum
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index 749d72cd9c7b0eaaf6f82ea1fc63ee71d6ca8b50..e58ef160b59992602fd89c6c48260b3ceafd5890 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -29,6 +29,7 @@ struct record_opts;
  struct perf_record_auxtrace_error;
  struct perf_record_auxtrace_info;
  struct events_stats;
+struct perf_pmu;
  
  enum auxtrace_error_type {
         PERF_AUXTRACE_ERROR_ITRACE  = 1,
@@ -322,6 +323,8 @@ struct auxtrace_mmap_params {
   * @read_finish: called after reading from an auxtrace mmap
   * @alignment: alignment (if any) for AUX area data
   * @default_aux_sample_size: default sample size for --aux sample option
+ * @pmu: associated pmu
+ * @evlist: selected events list
   */
  struct auxtrace_record {
         int (*recording_options)(struct auxtrace_record *itr,
@@ -346,6 +349,8 @@ struct auxtrace_record {
         int (*read_finish)(struct auxtrace_record *itr, int idx);
         unsigned int alignment;
         unsigned int default_aux_sample_size;
+       struct perf_pmu *pmu;
+       struct evlist *evlist;
  };
  
  /**
@@ -537,6 +542,7 @@ int auxtrace_record__find_snapshot(struct auxtrace_record *itr, int idx,
                                    struct auxtrace_mmap *mm,
                                    unsigned char *data, u64 *head, u64 *old);
  u64 auxtrace_record__reference(struct auxtrace_record *itr);
+int auxtrace_record__read_finish(struct auxtrace_record *itr, int idx);
  
  int auxtrace_index__auxtrace_event(struct list_head *head, union perf_event *event,
                                    off_t file_offset);
diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c
index c4b030bf6ec2d258da7fecac94d7154b859fb580..fbbb6d640dadcff16ad08083148fdfc9b4598e06 100644
--- a/tools/perf/util/block-info.c
+++ b/tools/perf/util/block-info.c
@@ -295,7 +295,8 @@ static int block_range_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
         end_line = map__srcline(he->ms.map, bi->sym->start + bi->end,
                                 he->ms.sym);
  
-       if ((start_line != SRCLINE_UNKNOWN) && (end_line != SRCLINE_UNKNOWN)) {
+       if ((strncmp(start_line, SRCLINE_UNKNOWN, strlen(SRCLINE_UNKNOWN)) != 0) &&
+           (strncmp(end_line, SRCLINE_UNKNOWN, strlen(SRCLINE_UNKNOWN)) != 0)) {
                 scnprintf(buf, sizeof(buf), "[%s -> %s]",
                           start_line, end_line);
         } else {
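
Note on the hunk above: comparing `start_line != SRCLINE_UNKNOWN` only tests pointer identity, so a heap copy of the same text would not be recognized as unknown. The sketch below shows the content-based check the patch switches to. It assumes the marker is a short string such as "??:0"; `UNKNOWN_SKETCH` and the helper name are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Assumed marker text for an unresolved source line; hypothetical name. */
#define UNKNOWN_SKETCH "??:0"

/*
 * Compare by content rather than by address, so duplicated strings match.
 * Like the patch, this is a prefix match over the marker's length.
 */
static bool srcline_is_unknown_sketch(const char *srcline)
{
	assert(srcline != NULL);
	return strncmp(srcline, UNKNOWN_SKETCH, strlen(UNKNOWN_SKETCH)) == 0;
}
```

Since strncmp() stops after the marker's length, any string that merely begins with the marker is also treated as unknown; a full strcmp() would be the stricter alternative.
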
diff --git a/tools/perf/util/config.c b/tools/perf/util/config.c
index 0bc9c4d7fdc5d2239dc6c68dff690e9460d17a2e..ef38eba56ed0cb7629263c5132ec1bf0f7d8349f 100644
--- a/tools/perf/util/config.c
+++ b/tools/perf/util/config.c
@@ -374,6 +374,18 @@ int perf_config_int(int *dest, const char *name, const char *value)
         return 0;
  }
  
+int perf_config_u8(u8 *dest, const char *name, const char *value)
+{
+       long ret = 0;
+
+       if (!perf_parse_long(value, &ret)) {
+               bad_config(name);
+               return -1;
+       }
+       *dest = ret;
+       return 0;
+}
+
  static int perf_config_bool_or_int(const char *name, const char *value, int *is_bool)
  {
         int ret;
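
Note on perf_config_u8() above: the parsed `long` is assigned straight into a `u8`, so a value outside 0..255 would wrap before the caller clamps it. The following is a sketch of a range-checked variant, not what the patch does. It uses standard strtol() in place of perf_parse_long(), and `parse_u8_sketch` is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Parse value as an integer and store it in *dest only if it fits in
 * an unsigned char. Returns 0 on success, -1 on any parse or range error.
 */
static int parse_u8_sketch(const char *value, unsigned char *dest)
{
	char *end;
	long v;

	assert(dest != NULL);

	if (value == NULL || *value == '\0')
		return -1;

	errno = 0;
	v = strtol(value, &end, 0);
	if (errno != 0 || *end != '\0')
		return -1;	/* overflow or trailing garbage */

	if (v < 0 || v > UCHAR_MAX)
		return -1;	/* would not survive the narrowing store */

	*dest = (unsigned char)v;
	return 0;
}
```

On failure `*dest` is left untouched, so a caller can keep a default value in place.
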
diff --git a/tools/perf/util/config.h b/tools/perf/util/config.h
index bd0a5897c76a5daad5f68f6b561d1a744f029b76..c10b66dde2f35e407ce0b7a5431551c1cc40afdd 100644
--- a/tools/perf/util/config.h
+++ b/tools/perf/util/config.h
@@ -29,6 +29,7 @@ typedef int (*config_fn_t)(const char *, const char *, void *);
  int perf_default_config(const char *, const char *, void *);
  int perf_config(config_fn_t fn, void *);
  int perf_config_int(int *dest, const char *, const char *);
+int perf_config_u8(u8 *dest, const char *name, const char *value);
  int perf_config_u64(u64 *dest, const char *, const char *);
  int perf_config_bool(const char *, const char *);
  int config_error_nonbool(const char *);
diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index 6242a9215df7ee325265646dd9ce71acbc0c73b8..4154f944f474a4152d4469c652a040db4b05a78a 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -343,11 +343,11 @@ static const char *normalize_arch(char *arch)
  
  const char *perf_env__arch(struct perf_env *env)
  {
-       struct utsname uts;
         char *arch_name;
  
         if (!env || !env->arch) { /* Assume local operation */
-               if (uname(&uts) < 0)
+               static struct utsname uts = { .machine[0] = '\0', };
+               if (uts.machine[0] == '\0' && uname(&uts) < 0)
                         return NULL;
                 arch_name = uts.machine;
         } else
diff --git a/tools/perf/util/llvm-utils.c b/tools/perf/util/llvm-utils.c
index eae47c2509eb6fd6023855c816a1200723e97d56..b5af680fc667cc2ebbe5dc6447564cf90f15e7b7 100644
--- a/tools/perf/util/llvm-utils.c
+++ b/tools/perf/util/llvm-utils.c
@@ -288,6 +288,7 @@ static const char *kinc_fetch_script =
  "obj-y := dummy.o\n"
  "\\$(obj)/%.o: \\$(src)/%.c\n"
  "\t@echo -n \"\\$(NOSTDINC_FLAGS) \\$(LINUXINCLUDE) \\$(EXTRA_CFLAGS)\"\n"
+"\t\\$(CC) -c -o \\$@ \\$<\n"
  "EOF\n"
  "touch $TMPDIR/dummy.c\n"
  "make -s -C $KBUILD_DIR M=$TMPDIR $KBUILD_OPTS dummy.o 2>/dev/null\n"
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index c8c5410315e817c2b9533eb8df718dc1bf98915f..fb5c2cd44d3003ac1c0dc1621d8cbb718ab1bdd0 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -686,6 +686,7 @@ static struct dso *machine__findnew_module_dso(struct machine *machine,
  
                 dso__set_module_info(dso, m, machine);
                 dso__set_long_name(dso, strdup(filename), true);
+               dso->kernel = DSO_TYPE_KERNEL;
         }
  
         dso__get(dso);
@@ -726,9 +727,17 @@ static int machine__process_ksymbol_register(struct machine *machine,
         struct map *map = maps__find(&machine->kmaps, event->ksymbol.addr);
  
         if (!map) {
-               map = dso__new_map(event->ksymbol.name);
-               if (!map)
+               struct dso *dso = dso__new(event->ksymbol.name);
+
+               if (dso) {
+                       dso->kernel = DSO_TYPE_KERNEL;
+                       map = map__new2(0, dso);
+               }
+
+               if (!dso || !map) {
+                       dso__put(dso);
                         return -ENOMEM;
+               }
  
                 map->start = event->ksymbol.addr;
                 map->end = map->start + event->ksymbol.len;
@@ -972,7 +981,6 @@ int machine__create_extra_kernel_map(struct machine *machine,
  
         kmap = map__kmap(map);
  
-       kmap->kmaps = &machine->kmaps;
         strlcpy(kmap->name, xm->name, KMAP_NAME_LEN);
  
         maps__insert(&machine->kmaps, map);
@@ -1082,9 +1090,6 @@ int __weak machine__create_extra_kernel_maps(struct machine *machine __maybe_unu
  static int
  __machine__create_kernel_maps(struct machine *machine, struct dso *kernel)
  {
-       struct kmap *kmap;
-       struct map *map;
-
         /* In case of renewal the kernel map, destroy previous one */
         machine__destroy_kernel_maps(machine);
  
@@ -1093,14 +1098,7 @@ __machine__create_kernel_maps(struct machine *machine, struct dso *kernel)
                 return -1;
  
         machine->vmlinux_map->map_ip = machine->vmlinux_map->unmap_ip = identity__map_ip;
-       map = machine__kernel_map(machine);
-       kmap = map__kmap(map);
-       if (!kmap)
-               return -1;
-
-       kmap->kmaps = &machine->kmaps;
-       maps__insert(&machine->kmaps, map);
-
+       maps__insert(&machine->kmaps, machine->vmlinux_map);
         return 0;
  }
  
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index f67960bedebba63663f573f28df9f746aa106910..95428511300d1e371c756846fbdfd3b415e94782 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -375,8 +375,13 @@ struct symbol *map__find_symbol_by_name(struct map *map, const char *name)
  
  struct map *map__clone(struct map *from)
  {
-       struct map *map = memdup(from, sizeof(*map));
+       size_t size = sizeof(struct map);
+       struct map *map;
+
+       if (from->dso && from->dso->kernel)
+               size += sizeof(struct kmap);
  
+       map = memdup(from, size);
         if (map != NULL) {
                 refcount_set(&map->refcnt, 1);
                 RB_CLEAR_NODE(&map->rb_node);
@@ -426,7 +431,7 @@ int map__fprintf_srcline(struct map *map, u64 addr, const char *prefix,
  
         if (map && map->dso) {
                 char *srcline = map__srcline(map, addr, NULL);
-               if (srcline != SRCLINE_UNKNOWN)
+               if (strncmp(srcline, SRCLINE_UNKNOWN, strlen(SRCLINE_UNKNOWN)) != 0)
                         ret = fprintf(fp, "%s%s", prefix, srcline);
                 free_srcline(srcline);
         }
@@ -538,6 +543,16 @@ void maps__insert(struct maps *maps, struct map *map)
         __maps__insert(maps, map);
         ++maps->nr_maps;
  
+       if (map->dso && map->dso->kernel) {
+               struct kmap *kmap = map__kmap(map);
+
+               if (kmap)
+                       kmap->kmaps = maps;
+               else
+                       pr_err("Internal error: kernel dso with non kernel map\n");
+       }
+
+
         /*
          * If we already performed some search by name, then we need to add the just
          * inserted map and resort.
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index c01ba6f8fdad3a3662662267d0941a7bd3b6db04..a14995835d85980f8ac86725ea9f197499d8b6d1 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -257,21 +257,15 @@ struct tracepoint_path *tracepoint_id_to_path(u64 config)
                                 path = zalloc(sizeof(*path));
                                 if (!path)
                                         return NULL;
-                               path->system = malloc(MAX_EVENT_LENGTH);
-                               if (!path->system) {
+                               if (asprintf(&path->system, "%.*s", MAX_EVENT_LENGTH, sys_dirent->d_name) < 0) {
                                         free(path);
                                         return NULL;
                                 }
-                               path->name = malloc(MAX_EVENT_LENGTH);
-                               if (!path->name) {
+                               if (asprintf(&path->name, "%.*s", MAX_EVENT_LENGTH, evt_dirent->d_name) < 0) {
                                         zfree(&path->system);
                                         free(path);
                                         return NULL;
                                 }
-                               strncpy(path->system, sys_dirent->d_name,
-                                       MAX_EVENT_LENGTH);
-                               strncpy(path->name, evt_dirent->d_name,
-                                       MAX_EVENT_LENGTH);
                                 return path;
                         }
                 }
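
Note on the tracepoint_id_to_path() hunk above: strncpy() into a buffer of exactly MAX_EVENT_LENGTH bytes leaves the result unterminated when the source is that long, whereas a "%.*s" conversion caps the copied length and always terminates. The sketch below illustrates the same bounded-duplicate idea. It deliberately uses standard snprintf() plus malloc() in place of asprintf(), which is an extension, and `dup_bounded_sketch` is a hypothetical helper.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Return a freshly allocated, NUL-terminated copy of at most max_len
 * bytes of src, or NULL on failure. The caller frees the result.
 */
static char *dup_bounded_sketch(const char *src, int max_len)
{
	int len;
	char *out;

	assert(src != NULL && max_len >= 0);

	/* First pass: measure how many bytes the bounded copy needs. */
	len = snprintf(NULL, 0, "%.*s", max_len, src);
	if (len < 0)
		return NULL;

	out = malloc((size_t)len + 1);
	if (!out)
		return NULL;

	/* Second pass: write the copy, terminator included. */
	snprintf(out, (size_t)len + 1, "%.*s", max_len, src);
	return out;
}
```

The allocation is sized to the actual string rather than to the maximum, which is also what the asprintf() form in the patch achieves.
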
diff --git a/tools/perf/util/probe-file.c b/tools/perf/util/probe-file.c
index 5003ba4033454f813c420bdeafe44ad8b718751b..0f5fda11675fbd46256e5a46c8c61eb3a121f125 100644
--- a/tools/perf/util/probe-file.c
+++ b/tools/perf/util/probe-file.c
@@ -301,10 +301,15 @@ int probe_file__get_events(int fd, struct strfilter *filter,
                 p = strchr(ent->s, ':');
                 if ((p && strfilter__compare(filter, p + 1)) ||
                     strfilter__compare(filter, ent->s)) {
-                       strlist__add(plist, ent->s);
+                       ret = strlist__add(plist, ent->s);
+                       if (ret == -ENOMEM) {
+                               pr_err("strlist__add failed with -ENOMEM\n");
+                               goto out;
+                       }
                         ret = 0;
                 }
         }
+out:
         strlist__delete(namelist);
  
         return ret;
@@ -511,7 +516,11 @@ static int probe_cache__load(struct probe_cache *pcache)
                                 ret = -EINVAL;
                                 goto out;
                         }
-                       strlist__add(entry->tevlist, buf);
+                       ret = strlist__add(entry->tevlist, buf);
+                       if (ret == -ENOMEM) {
+                               pr_err("strlist__add failed with -ENOMEM\n");
+                               goto out;
+                       }
                 }
         }
  out:
@@ -672,7 +681,12 @@ int probe_cache__add_entry(struct probe_cache *pcache,
                 command = synthesize_probe_trace_command(&tevs[i]);
                 if (!command)
                         goto out_err;
-               strlist__add(entry->tevlist, command);
+               ret = strlist__add(entry->tevlist, command);
+               if (ret == -ENOMEM) {
+                       pr_err("strlist__add failed with -ENOMEM\n");
+                       goto out_err;
+               }
+
                 free(command);
         }
         list_add_tail(&entry->node, &pcache->entries);
@@ -853,9 +867,15 @@ int probe_cache__scan_sdt(struct probe_cache *pcache, const char *pathname)
                         break;
                 }
  
-               strlist__add(entry->tevlist, buf);
+               ret = strlist__add(entry->tevlist, buf);
+
                 free(buf);
                 entry = NULL;
+
+               if (ret == -ENOMEM) {
+                       pr_err("strlist__add failed with -ENOMEM\n");
+                       break;
+               }
         }
         if (entry) {
                 list_del_init(&entry->node);
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 2c41d47f6f83e6b27d417bd690e0edc4b762961f..90d23cc3c8d492dd515d491166438742bd2b5a70 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -18,7 +18,6 @@
   * AGGR_NONE: Use matching CPU
   * AGGR_THREAD: Not supported?
   */
-static bool have_frontend_stalled;
  
  struct runtime_stat rt_stat;
  struct stats walltime_nsecs_stats;
@@ -144,7 +143,6 @@ void runtime_stat__exit(struct runtime_stat *st)
  
  void perf_stat__init_shadow_stats(void)
  {
-       have_frontend_stalled = pmu_have_event("cpu", "stalled-cycles-frontend");
         runtime_stat__init(&rt_stat);
  }
  
@@ -853,10 +851,6 @@ void perf_stat__print_shadow_stats(struct perf_stat_config *config,
                         print_metric(config, ctxp, NULL, "%7.2f ",
                                         "stalled cycles per insn",
                                         ratio);
-               } else if (have_frontend_stalled) {
-                       out->new_line(config, ctxp);
-                       print_metric(config, ctxp, NULL, "%7.2f ",
-                                    "stalled cycles per insn", 0);
                 }
         } else if (perf_evsel__match(evsel, HARDWARE, HW_BRANCH_MISSES)) {
                 if (runtime_stat_n(st, STAT_BRANCHES, ctx, cpu) != 0)
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 3b379b1296f1077beb2e1242823eb15cbe4a6729..26bc6a0096ce568bd4e9e70fa910063f1633787b 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -635,9 +635,12 @@ out:
  static bool symbol__is_idle(const char *name)
  {
         const char * const idle_symbols[] = {
+               "acpi_idle_do_entry",
+               "acpi_processor_ffh_cstate_enter",
                 "arch_cpu_idle",
                 "cpu_idle",
                 "cpu_startup_entry",
+               "idle_cpu",
                 "intel_idle",
                 "default_idle",
                 "native_safe_halt",
@@ -651,13 +654,17 @@ static bool symbol__is_idle(const char *name)
                 NULL
         };
         int i;
+       static struct strlist *idle_symbols_list;
  
-       for (i = 0; idle_symbols[i]; i++) {
-               if (!strcmp(idle_symbols[i], name))
-                       return true;
-       }
+       if (idle_symbols_list)
+               return strlist__has_entry(idle_symbols_list, name);
  
-       return false;
+       idle_symbols_list = strlist__new(NULL, NULL);
+
+       for (i = 0; idle_symbols[i]; i++)
+               strlist__add(idle_symbols_list, idle_symbols[i]);
+
+       return strlist__has_entry(idle_symbols_list, name);
  }
  
  static int map__process_kallsym_symbol(void *arg, const char *name,
@@ -1615,7 +1622,12 @@ int dso__load(struct dso *dso, struct map *map)
                 goto out;
         }
  
-       if (dso->kernel) {
+       kmod = dso->symtab_type == DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE ||
+               dso->symtab_type == DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE_COMP ||
+               dso->symtab_type == DSO_BINARY_TYPE__GUEST_KMODULE ||
+               dso->symtab_type == DSO_BINARY_TYPE__GUEST_KMODULE_COMP;
+
+       if (dso->kernel && !kmod) {
                 if (dso->kernel == DSO_TYPE_KERNEL)
                         ret = dso__load_kernel_sym(dso, map);
                 else if (dso->kernel == DSO_TYPE_GUEST_KERNEL)
@@ -1643,12 +1655,6 @@ int dso__load(struct dso *dso, struct map *map)
         if (!name)
                 goto out;
  
-       kmod = dso->symtab_type == DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE ||
-               dso->symtab_type == DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE_COMP ||
-               dso->symtab_type == DSO_BINARY_TYPE__GUEST_KMODULE ||
-               dso->symtab_type == DSO_BINARY_TYPE__GUEST_KMODULE_COMP;
-
-
         /*
          * Read the build id if possible. This is required for
          * DSO_BINARY_TYPE__BUILDID_DEBUGINFO to work
diff --git a/tools/testing/kunit/kunit.py b/tools/testing/kunit/kunit.py
index e59eb9e7f9236b5dc872b270b21b072b4886b6ef..180ad1e1b04f91a4f5e7a956436d872a54d5384c 100755
--- a/tools/testing/kunit/kunit.py
+++ b/tools/testing/kunit/kunit.py
@@ -24,6 +24,8 @@ KunitResult = namedtuple('KunitResult', ['status','result'])
  
  KunitRequest = namedtuple('KunitRequest', ['raw_output','timeout', 'jobs', 'build_dir', 'defconfig'])
  
+KernelDirectoryPath = sys.argv[0].split('tools/testing/kunit/')[0]
+
  class KunitStatus(Enum):
         SUCCESS = auto()
         CONFIG_FAILURE = auto()
@@ -35,6 +37,13 @@ def create_default_kunitconfig():
                 shutil.copyfile('arch/um/configs/kunit_defconfig',
                                 kunit_kernel.kunitconfig_path)
  
+def get_kernel_root_path():
+       parts = sys.argv[0] if not __file__ else __file__
+       parts = os.path.realpath(parts).split('tools/testing/kunit')
+       if len(parts) != 2:
+               sys.exit(1)
+       return parts[0]
+
  def run_tests(linux: kunit_kernel.LinuxSourceTree,
               request: KunitRequest) -> KunitResult:
         config_start = time.time()
@@ -114,6 +123,9 @@ def main(argv, linux=None):
         cli_args = parser.parse_args(argv)
  
         if cli_args.subcommand == 'run':
+               if get_kernel_root_path():
+                       os.chdir(get_kernel_root_path())
+
                 if cli_args.build_dir:
                         if not os.path.exists(cli_args.build_dir):
                                 os.mkdir(cli_args.build_dir)
diff --git a/tools/testing/kunit/kunit_kernel.py b/tools/testing/kunit/kunit_kernel.py
index cc5d844ecca13bfe57f69f13b9c3180834e90a79..d99ae75ef72fa0b7d8193f890875e33c7fb63dce 100644
--- a/tools/testing/kunit/kunit_kernel.py
+++ b/tools/testing/kunit/kunit_kernel.py
@@ -93,6 +93,20 @@ class LinuxSourceTree(object):
                         return False
                 return True
  
+       def validate_config(self, build_dir):
+               kconfig_path = get_kconfig_path(build_dir)
+               validated_kconfig = kunit_config.Kconfig()
+               validated_kconfig.read_from_file(kconfig_path)
+               if not self._kconfig.is_subset_of(validated_kconfig):
+                       invalid = self._kconfig.entries() - validated_kconfig.entries()
+                       message = 'Provided Kconfig is not contained in validated .config. Following fields found in kunitconfig, ' \
+                                         'but not in .config: %s' % (
+                                       ', '.join([str(e) for e in invalid])
+                       )
+                       logging.error(message)
+                       return False
+               return True
+
         def build_config(self, build_dir):
                 kconfig_path = get_kconfig_path(build_dir)
                 if build_dir and not os.path.exists(build_dir):
@@ -103,12 +117,7 @@ class LinuxSourceTree(object):
                 except ConfigError as e:
                         logging.error(e)
                         return False
-               validated_kconfig = kunit_config.Kconfig()
-               validated_kconfig.read_from_file(kconfig_path)
-               if not self._kconfig.is_subset_of(validated_kconfig):
-                       logging.error('Provided Kconfig is not contained in validated .config!')
-                       return False
-               return True
+               return self.validate_config(build_dir)
  
         def build_reconfig(self, build_dir):
                 """Creates a new .config if it is not a subset of the .kunitconfig."""
@@ -133,12 +142,7 @@ class LinuxSourceTree(object):
                 except (ConfigError, BuildError) as e:
                         logging.error(e)
                         return False
-               used_kconfig = kunit_config.Kconfig()
-               used_kconfig.read_from_file(get_kconfig_path(build_dir))
-               if not self._kconfig.is_subset_of(used_kconfig):
-                       logging.error('Provided Kconfig is not contained in final config!')
-                       return False
-               return True
+               return self.validate_config(build_dir)
  
         def run_kernel(self, args=[], timeout=None, build_dir=''):
                 args.extend(['mem=256M'])
diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile

index 63430e2664c2eaf4942f855bead786b05bf8e785..6ec503912bea1e5547235ecdedd5bde9fc87f85d 100644 (file)
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -77,6 +77,12 @@ ifneq ($(SKIP_TARGETS),)
         override TARGETS := $(TMP)
  endif
  
+# User can set FORCE_TARGETS to 1 to require all targets to be successfully
+# built; make will fail if any of the targets cannot be built. If
+# FORCE_TARGETS is not set (the default), make will succeed if at least one
+# of the targets gets built.
+FORCE_TARGETS ?=
+
  # Clear LDFLAGS and MAKEFLAGS if called from main
  # Makefile to avoid test build failures when test
  # Makefile doesn't have explicit build rules.
@@ -151,7 +157,8 @@ all: khdr
         for TARGET in $(TARGETS); do                            \
                 BUILD_TARGET=$$BUILD/$$TARGET;                  \
                 mkdir $$BUILD_TARGET  -p;                       \
-               $(MAKE) OUTPUT=$$BUILD_TARGET -C $$TARGET;      \
+               $(MAKE) OUTPUT=$$BUILD_TARGET -C $$TARGET       \
+                               $(if $(FORCE_TARGETS),|| exit); \
                 ret=$$((ret * $$?));                            \
         done; exit $$ret;
  
@@ -205,7 +212,8 @@ ifdef INSTALL_PATH
         @ret=1; \
         for TARGET in $(TARGETS); do \
                 BUILD_TARGET=$$BUILD/$$TARGET;  \
-               $(MAKE) OUTPUT=$$BUILD_TARGET -C $$TARGET INSTALL_PATH=$(INSTALL_PATH)/$$TARGET install; \
+               $(MAKE) OUTPUT=$$BUILD_TARGET -C $$TARGET INSTALL_PATH=$(INSTALL_PATH)/$$TARGET install \
+                               $(if $(FORCE_TARGETS),|| exit); \
                 ret=$$((ret * $$?));            \
         done; exit $$ret;
  
diff --git a/tools/testing/selftests/bpf/prog_tests/select_reuseport.c b/tools/testing/selftests/bpf/prog_tests/select_reuseport.c

index 098bcae5f827e949610d702349d386a89a5f6ddd..0800036ed6547d26efc5b89d7fe109f2fcf7c604 100644 (file)
--- a/tools/testing/selftests/bpf/prog_tests/select_reuseport.c
+++ b/tools/testing/selftests/bpf/prog_tests/select_reuseport.c
@@ -506,8 +506,10 @@ static void test_syncookie(int type, sa_family_t family)
                 .pass_on_failure = 0,
         };
  
-       if (type != SOCK_STREAM)
+       if (type != SOCK_STREAM) {
+               test__skip();
                 return;
+       }
  
         /*
          * +1 for TCP-SYN and
@@ -822,8 +824,10 @@ void test_select_reuseport(void)
                 goto out;
  
         saved_tcp_fo = read_int_sysctl(TCP_FO_SYSCTL);
+       if (saved_tcp_fo < 0)
+               goto out;
         saved_tcp_syncookie = read_int_sysctl(TCP_SYNCOOKIE_SYSCTL);
-       if (saved_tcp_syncookie < 0 || saved_tcp_syncookie < 0)
+       if (saved_tcp_syncookie < 0)
                 goto out;
  
         if (enable_fastopen())
diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c

index 07f5b462c2ef5bc11aee156e442b3ac565272d89..aa43e0bd210c3182fe7ebce2f62c5f2def0548de 100644 (file)
--- a/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
+++ b/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c
@@ -3,6 +3,11 @@
  
  #include "test_progs.h"
  
+#define TCP_REPAIR             19      /* TCP sock is under repair right now */
+
+#define TCP_REPAIR_ON          1
+#define TCP_REPAIR_OFF_NO_WP   -1      /* Turn off without window probes */
+
  static int connected_socket_v4(void)
  {
         struct sockaddr_in addr = {
diff --git a/tools/testing/selftests/ftrace/Makefile b/tools/testing/selftests/ftrace/Makefile

index cd1f5b3a777461b89b25c9783d54aeadcc4d66d0..d6e106fbce11c7706e061cf23a920b8d0cb03459 100644 (file)
--- a/tools/testing/selftests/ftrace/Makefile
+++ b/tools/testing/selftests/ftrace/Makefile
@@ -2,7 +2,7 @@
  all:
  
  TEST_PROGS := ftracetest
-TEST_FILES := test.d
+TEST_FILES := test.d settings
  EXTRA_CLEAN := $(OUTPUT)/logs/*
  
  include ../lib.mk
diff --git a/tools/testing/selftests/ftrace/test.d/ftrace/func-filter-pid.tc b/tools/testing/selftests/ftrace/test.d/ftrace/func-filter-pid.tc

index 64cfcc75e3c10dc8016f2678f3d06f14863c9f81..f2ee1e889e1350e2c77c0c33b713de96afabc86d 100644 (file)
--- a/tools/testing/selftests/ftrace/test.d/ftrace/func-filter-pid.tc
+++ b/tools/testing/selftests/ftrace/test.d/ftrace/func-filter-pid.tc
@@ -1,6 +1,7 @@
  #!/bin/sh
  # SPDX-License-Identifier: GPL-2.0
  # description: ftrace - function pid filters
+# flags: instance
  
  # Make sure that function pid matching filter works.
  # Also test it on an instance directory
@@ -96,13 +97,6 @@ do_test() {
  }
  
  do_test
-
-mkdir instances/foo
-cd instances/foo
-do_test
-cd ../../
-rmdir instances/foo
-
  do_reset
  
  exit 0
diff --git a/tools/testing/selftests/futex/functional/Makefile b/tools/testing/selftests/futex/functional/Makefile

index 30996306cabcfe89a47977643e529b122893bb7e..23207829ec752b52a56577a679f5baf3f3f51a46 100644 (file)
--- a/tools/testing/selftests/futex/functional/Makefile
+++ b/tools/testing/selftests/futex/functional/Makefile
@@ -1,7 +1,7 @@
  # SPDX-License-Identifier: GPL-2.0
  INCLUDES := -I../include -I../../
  CFLAGS := $(CFLAGS) -g -O2 -Wall -D_GNU_SOURCE -pthread $(INCLUDES)
-LDFLAGS := $(LDFLAGS) -pthread -lrt
+LDLIBS := -lpthread -lrt
  
  HEADERS := \
         ../include/futextest.h \
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile

index 67abc1dd50ee6aeaf082445a9fe78ad103dacf90..d91c53b726e60c3f36698930c4e59a4388da74c5 100644 (file)
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -8,7 +8,7 @@ KSFT_KHDR_INSTALL := 1
  UNAME_M := $(shell uname -m)
  
  LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/sparsebit.c
-LIBKVM_x86_64 = lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/ucall.c
+LIBKVM_x86_64 = lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c lib/x86_64/ucall.c
  LIBKVM_aarch64 = lib/aarch64/processor.c lib/aarch64/ucall.c
  LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c
  
@@ -26,6 +26,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/vmx_dirty_log_test
  TEST_GEN_PROGS_x86_64 += x86_64/vmx_set_nested_state_test
  TEST_GEN_PROGS_x86_64 += x86_64/vmx_tsc_adjust_test
  TEST_GEN_PROGS_x86_64 += x86_64/xss_msr_test
+TEST_GEN_PROGS_x86_64 += x86_64/svm_vmcall_test
  TEST_GEN_PROGS_x86_64 += clear_dirty_log_test
  TEST_GEN_PROGS_x86_64 += dirty_log_test
  TEST_GEN_PROGS_x86_64 += kvm_create_max_vcpus
diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h

index aa6451b3f740baa24a2afb90e85365af99f02e81..7428513a4c687f0061d0feb310f6af5abc8dcc9d 100644 (file)
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -36,24 +36,24 @@
  #define X86_CR4_SMAP           (1ul << 21)
  #define X86_CR4_PKE            (1ul << 22)
  
-/* The enum values match the intruction encoding of each register */
-enum x86_register {
-       RAX = 0,
-       RCX,
-       RDX,
-       RBX,
-       RSP,
-       RBP,
-       RSI,
-       RDI,
-       R8,
-       R9,
-       R10,
-       R11,
-       R12,
-       R13,
-       R14,
-       R15,
+/* General Registers in 64-Bit Mode */
+struct gpr64_regs {
+       u64 rax;
+       u64 rcx;
+       u64 rdx;
+       u64 rbx;
+       u64 rsp;
+       u64 rbp;
+       u64 rsi;
+       u64 rdi;
+       u64 r8;
+       u64 r9;
+       u64 r10;
+       u64 r11;
+       u64 r12;
+       u64 r13;
+       u64 r14;
+       u64 r15;
  };
  
  struct desc64 {
@@ -220,20 +220,20 @@ static inline void set_cr4(uint64_t val)
         __asm__ __volatile__("mov %0, %%cr4" : : "r" (val) : "memory");
  }
  
-static inline uint64_t get_gdt_base(void)
+static inline struct desc_ptr get_gdt(void)
  {
         struct desc_ptr gdt;
         __asm__ __volatile__("sgdt %[gdt]"
                              : /* output */ [gdt]"=m"(gdt));
-       return gdt.address;
+       return gdt;
  }
  
-static inline uint64_t get_idt_base(void)
+static inline struct desc_ptr get_idt(void)
  {
         struct desc_ptr idt;
         __asm__ __volatile__("sidt %[idt]"
                              : /* output */ [idt]"=m"(idt));
-       return idt.address;
+       return idt;
  }
  
  #define SET_XMM(__var, __xmm) \
diff --git a/tools/testing/selftests/kvm/include/x86_64/svm.h b/tools/testing/selftests/kvm/include/x86_64/svm.h

new file mode 100644 (file)

index 0000000..f4ea235
--- /dev/null
+++ b/tools/testing/selftests/kvm/include/x86_64/svm.h
@@ -0,0 +1,297 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * tools/testing/selftests/kvm/include/x86_64/svm.h
+ * This is a copy of arch/x86/include/asm/svm.h
+ *
+ */
+
+#ifndef SELFTEST_KVM_SVM_H
+#define SELFTEST_KVM_SVM_H
+
+enum {
+       INTERCEPT_INTR,
+       INTERCEPT_NMI,
+       INTERCEPT_SMI,
+       INTERCEPT_INIT,
+       INTERCEPT_VINTR,
+       INTERCEPT_SELECTIVE_CR0,
+       INTERCEPT_STORE_IDTR,
+       INTERCEPT_STORE_GDTR,
+       INTERCEPT_STORE_LDTR,
+       INTERCEPT_STORE_TR,
+       INTERCEPT_LOAD_IDTR,
+       INTERCEPT_LOAD_GDTR,
+       INTERCEPT_LOAD_LDTR,
+       INTERCEPT_LOAD_TR,
+       INTERCEPT_RDTSC,
+       INTERCEPT_RDPMC,
+       INTERCEPT_PUSHF,
+       INTERCEPT_POPF,
+       INTERCEPT_CPUID,
+       INTERCEPT_RSM,
+       INTERCEPT_IRET,
+       INTERCEPT_INTn,
+       INTERCEPT_INVD,
+       INTERCEPT_PAUSE,
+       INTERCEPT_HLT,
+       INTERCEPT_INVLPG,
+       INTERCEPT_INVLPGA,
+       INTERCEPT_IOIO_PROT,
+       INTERCEPT_MSR_PROT,
+       INTERCEPT_TASK_SWITCH,
+       INTERCEPT_FERR_FREEZE,
+       INTERCEPT_SHUTDOWN,
+       INTERCEPT_VMRUN,
+       INTERCEPT_VMMCALL,
+       INTERCEPT_VMLOAD,
+       INTERCEPT_VMSAVE,
+       INTERCEPT_STGI,
+       INTERCEPT_CLGI,
+       INTERCEPT_SKINIT,
+       INTERCEPT_RDTSCP,
+       INTERCEPT_ICEBP,
+       INTERCEPT_WBINVD,
+       INTERCEPT_MONITOR,
+       INTERCEPT_MWAIT,
+       INTERCEPT_MWAIT_COND,
+       INTERCEPT_XSETBV,
+       INTERCEPT_RDPRU,
+};
+
+
+struct __attribute__ ((__packed__)) vmcb_control_area {
+       u32 intercept_cr;
+       u32 intercept_dr;
+       u32 intercept_exceptions;
+       u64 intercept;
+       u8 reserved_1[40];
+       u16 pause_filter_thresh;
+       u16 pause_filter_count;
+       u64 iopm_base_pa;
+       u64 msrpm_base_pa;
+       u64 tsc_offset;
+       u32 asid;
+       u8 tlb_ctl;
+       u8 reserved_2[3];
+       u32 int_ctl;
+       u32 int_vector;
+       u32 int_state;
+       u8 reserved_3[4];
+       u32 exit_code;
+       u32 exit_code_hi;
+       u64 exit_info_1;
+       u64 exit_info_2;
+       u32 exit_int_info;
+       u32 exit_int_info_err;
+       u64 nested_ctl;
+       u64 avic_vapic_bar;
+       u8 reserved_4[8];
+       u32 event_inj;
+       u32 event_inj_err;
+       u64 nested_cr3;
+       u64 virt_ext;
+       u32 clean;
+       u32 reserved_5;
+       u64 next_rip;
+       u8 insn_len;
+       u8 insn_bytes[15];
+       u64 avic_backing_page;  /* Offset 0xe0 */
+       u8 reserved_6[8];       /* Offset 0xe8 */
+       u64 avic_logical_id;    /* Offset 0xf0 */
+       u64 avic_physical_id;   /* Offset 0xf8 */
+       u8 reserved_7[768];
+};
+
+
+#define TLB_CONTROL_DO_NOTHING 0
+#define TLB_CONTROL_FLUSH_ALL_ASID 1
+#define TLB_CONTROL_FLUSH_ASID 3
+#define TLB_CONTROL_FLUSH_ASID_LOCAL 7
+
+#define V_TPR_MASK 0x0f
+
+#define V_IRQ_SHIFT 8
+#define V_IRQ_MASK (1 << V_IRQ_SHIFT)
+
+#define V_GIF_SHIFT 9
+#define V_GIF_MASK (1 << V_GIF_SHIFT)
+
+#define V_INTR_PRIO_SHIFT 16
+#define V_INTR_PRIO_MASK (0x0f << V_INTR_PRIO_SHIFT)
+
+#define V_IGN_TPR_SHIFT 20
+#define V_IGN_TPR_MASK (1 << V_IGN_TPR_SHIFT)
+
+#define V_INTR_MASKING_SHIFT 24
+#define V_INTR_MASKING_MASK (1 << V_INTR_MASKING_SHIFT)
+
+#define V_GIF_ENABLE_SHIFT 25
+#define V_GIF_ENABLE_MASK (1 << V_GIF_ENABLE_SHIFT)
+
+#define AVIC_ENABLE_SHIFT 31
+#define AVIC_ENABLE_MASK (1 << AVIC_ENABLE_SHIFT)
+
+#define LBR_CTL_ENABLE_MASK BIT_ULL(0)
+#define VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK BIT_ULL(1)
+
+#define SVM_INTERRUPT_SHADOW_MASK 1
+
+#define SVM_IOIO_STR_SHIFT 2
+#define SVM_IOIO_REP_SHIFT 3
+#define SVM_IOIO_SIZE_SHIFT 4
+#define SVM_IOIO_ASIZE_SHIFT 7
+
+#define SVM_IOIO_TYPE_MASK 1
+#define SVM_IOIO_STR_MASK (1 << SVM_IOIO_STR_SHIFT)
+#define SVM_IOIO_REP_MASK (1 << SVM_IOIO_REP_SHIFT)
+#define SVM_IOIO_SIZE_MASK (7 << SVM_IOIO_SIZE_SHIFT)
+#define SVM_IOIO_ASIZE_MASK (7 << SVM_IOIO_ASIZE_SHIFT)
+
+#define SVM_VM_CR_VALID_MASK   0x001fULL
+#define SVM_VM_CR_SVM_LOCK_MASK 0x0008ULL
+#define SVM_VM_CR_SVM_DIS_MASK  0x0010ULL
+
+#define SVM_NESTED_CTL_NP_ENABLE       BIT(0)
+#define SVM_NESTED_CTL_SEV_ENABLE      BIT(1)
+
+struct __attribute__ ((__packed__)) vmcb_seg {
+       u16 selector;
+       u16 attrib;
+       u32 limit;
+       u64 base;
+};
+
+struct __attribute__ ((__packed__)) vmcb_save_area {
+       struct vmcb_seg es;
+       struct vmcb_seg cs;
+       struct vmcb_seg ss;
+       struct vmcb_seg ds;
+       struct vmcb_seg fs;
+       struct vmcb_seg gs;
+       struct vmcb_seg gdtr;
+       struct vmcb_seg ldtr;
+       struct vmcb_seg idtr;
+       struct vmcb_seg tr;
+       u8 reserved_1[43];
+       u8 cpl;
+       u8 reserved_2[4];
+       u64 efer;
+       u8 reserved_3[112];
+       u64 cr4;
+       u64 cr3;
+       u64 cr0;
+       u64 dr7;
+       u64 dr6;
+       u64 rflags;
+       u64 rip;
+       u8 reserved_4[88];
+       u64 rsp;
+       u8 reserved_5[24];
+       u64 rax;
+       u64 star;
+       u64 lstar;
+       u64 cstar;
+       u64 sfmask;
+       u64 kernel_gs_base;
+       u64 sysenter_cs;
+       u64 sysenter_esp;
+       u64 sysenter_eip;
+       u64 cr2;
+       u8 reserved_6[32];
+       u64 g_pat;
+       u64 dbgctl;
+       u64 br_from;
+       u64 br_to;
+       u64 last_excp_from;
+       u64 last_excp_to;
+};
+
+struct __attribute__ ((__packed__)) vmcb {
+       struct vmcb_control_area control;
+       struct vmcb_save_area save;
+};
+
+#define SVM_CPUID_FUNC 0x8000000a
+
+#define SVM_VM_CR_SVM_DISABLE 4
+
+#define SVM_SELECTOR_S_SHIFT 4
+#define SVM_SELECTOR_DPL_SHIFT 5
+#define SVM_SELECTOR_P_SHIFT 7
+#define SVM_SELECTOR_AVL_SHIFT 8
+#define SVM_SELECTOR_L_SHIFT 9
+#define SVM_SELECTOR_DB_SHIFT 10
+#define SVM_SELECTOR_G_SHIFT 11
+
+#define SVM_SELECTOR_TYPE_MASK (0xf)
+#define SVM_SELECTOR_S_MASK (1 << SVM_SELECTOR_S_SHIFT)
+#define SVM_SELECTOR_DPL_MASK (3 << SVM_SELECTOR_DPL_SHIFT)
+#define SVM_SELECTOR_P_MASK (1 << SVM_SELECTOR_P_SHIFT)
+#define SVM_SELECTOR_AVL_MASK (1 << SVM_SELECTOR_AVL_SHIFT)
+#define SVM_SELECTOR_L_MASK (1 << SVM_SELECTOR_L_SHIFT)
+#define SVM_SELECTOR_DB_MASK (1 << SVM_SELECTOR_DB_SHIFT)
+#define SVM_SELECTOR_G_MASK (1 << SVM_SELECTOR_G_SHIFT)
+
+#define SVM_SELECTOR_WRITE_MASK (1 << 1)
+#define SVM_SELECTOR_READ_MASK SVM_SELECTOR_WRITE_MASK
+#define SVM_SELECTOR_CODE_MASK (1 << 3)
+
+#define INTERCEPT_CR0_READ     0
+#define INTERCEPT_CR3_READ     3
+#define INTERCEPT_CR4_READ     4
+#define INTERCEPT_CR8_READ     8
+#define INTERCEPT_CR0_WRITE    (16 + 0)
+#define INTERCEPT_CR3_WRITE    (16 + 3)
+#define INTERCEPT_CR4_WRITE    (16 + 4)
+#define INTERCEPT_CR8_WRITE    (16 + 8)
+
+#define INTERCEPT_DR0_READ     0
+#define INTERCEPT_DR1_READ     1
+#define INTERCEPT_DR2_READ     2
+#define INTERCEPT_DR3_READ     3
+#define INTERCEPT_DR4_READ     4
+#define INTERCEPT_DR5_READ     5
+#define INTERCEPT_DR6_READ     6
+#define INTERCEPT_DR7_READ     7
+#define INTERCEPT_DR0_WRITE    (16 + 0)
+#define INTERCEPT_DR1_WRITE    (16 + 1)
+#define INTERCEPT_DR2_WRITE    (16 + 2)
+#define INTERCEPT_DR3_WRITE    (16 + 3)
+#define INTERCEPT_DR4_WRITE    (16 + 4)
+#define INTERCEPT_DR5_WRITE    (16 + 5)
+#define INTERCEPT_DR6_WRITE    (16 + 6)
+#define INTERCEPT_DR7_WRITE    (16 + 7)
+
+#define SVM_EVTINJ_VEC_MASK 0xff
+
+#define SVM_EVTINJ_TYPE_SHIFT 8
+#define SVM_EVTINJ_TYPE_MASK (7 << SVM_EVTINJ_TYPE_SHIFT)
+
+#define SVM_EVTINJ_TYPE_INTR (0 << SVM_EVTINJ_TYPE_SHIFT)
+#define SVM_EVTINJ_TYPE_NMI (2 << SVM_EVTINJ_TYPE_SHIFT)
+#define SVM_EVTINJ_TYPE_EXEPT (3 << SVM_EVTINJ_TYPE_SHIFT)
+#define SVM_EVTINJ_TYPE_SOFT (4 << SVM_EVTINJ_TYPE_SHIFT)
+
+#define SVM_EVTINJ_VALID (1 << 31)
+#define SVM_EVTINJ_VALID_ERR (1 << 11)
+
+#define SVM_EXITINTINFO_VEC_MASK SVM_EVTINJ_VEC_MASK
+#define SVM_EXITINTINFO_TYPE_MASK SVM_EVTINJ_TYPE_MASK
+
+#define        SVM_EXITINTINFO_TYPE_INTR SVM_EVTINJ_TYPE_INTR
+#define        SVM_EXITINTINFO_TYPE_NMI SVM_EVTINJ_TYPE_NMI
+#define        SVM_EXITINTINFO_TYPE_EXEPT SVM_EVTINJ_TYPE_EXEPT
+#define        SVM_EXITINTINFO_TYPE_SOFT SVM_EVTINJ_TYPE_SOFT
+
+#define SVM_EXITINTINFO_VALID SVM_EVTINJ_VALID
+#define SVM_EXITINTINFO_VALID_ERR SVM_EVTINJ_VALID_ERR
+
+#define SVM_EXITINFOSHIFT_TS_REASON_IRET 36
+#define SVM_EXITINFOSHIFT_TS_REASON_JMP 38
+#define SVM_EXITINFOSHIFT_TS_HAS_ERROR_CODE 44
+
+#define SVM_EXITINFO_REG_MASK 0x0F
+
+#define SVM_CR0_SELECTIVE_MASK (X86_CR0_TS | X86_CR0_MP)
+
+#endif /* SELFTEST_KVM_SVM_H */
diff --git a/tools/testing/selftests/kvm/include/x86_64/svm_util.h b/tools/testing/selftests/kvm/include/x86_64/svm_util.h

new file mode 100644 (file)

index 0000000..cd03791
--- /dev/null
+++ b/tools/testing/selftests/kvm/include/x86_64/svm_util.h
@@ -0,0 +1,38 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * tools/testing/selftests/kvm/include/x86_64/svm_util.h
+ * Header for nested SVM testing
+ *
+ * Copyright (C) 2020, Red Hat, Inc.
+ */
+
+#ifndef SELFTEST_KVM_SVM_UTILS_H
+#define SELFTEST_KVM_SVM_UTILS_H
+
+#include <stdint.h>
+#include "svm.h"
+#include "processor.h"
+
+#define CPUID_SVM_BIT          2
+#define CPUID_SVM              BIT_ULL(CPUID_SVM_BIT)
+
+#define SVM_EXIT_VMMCALL       0x081
+
+struct svm_test_data {
+       /* VMCB */
+       struct vmcb *vmcb; /* gva */
+       void *vmcb_hva;
+       uint64_t vmcb_gpa;
+
+       /* host state-save area */
+       struct vmcb_save_area *save_area; /* gva */
+       void *save_area_hva;
+       uint64_t save_area_gpa;
+};
+
+struct svm_test_data *vcpu_alloc_svm(struct kvm_vm *vm, vm_vaddr_t *p_svm_gva);
+void generic_svm_setup(struct svm_test_data *svm, void *guest_rip, void *guest_rsp);
+void run_guest(struct vmcb *vmcb, uint64_t vmcb_gpa);
+void nested_svm_check_supported(void);
+
+#endif /* SELFTEST_KVM_SVM_UTILS_H */
diff --git a/tools/testing/selftests/kvm/lib/x86_64/svm.c b/tools/testing/selftests/kvm/lib/x86_64/svm.c

new file mode 100644 (file)

index 0000000..6e05a8f
--- /dev/null
+++ b/tools/testing/selftests/kvm/lib/x86_64/svm.c
@@ -0,0 +1,161 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * tools/testing/selftests/kvm/lib/x86_64/svm.c
+ * Helpers used for nested SVM testing
+ * Largely inspired by KVM unit test svm.c
+ *
+ * Copyright (C) 2020, Red Hat, Inc.
+ */
+
+#include "test_util.h"
+#include "kvm_util.h"
+#include "../kvm_util_internal.h"
+#include "processor.h"
+#include "svm_util.h"
+
+struct gpr64_regs guest_regs;
+u64 rflags;
+
+/* Allocate memory regions for nested SVM tests.
+ *
+ * Input Args:
+ *   vm - The VM to allocate guest-virtual addresses in.
+ *
+ * Output Args:
+ *   p_svm_gva - The guest virtual address for the struct svm_test_data.
+ *
+ * Return:
+ *   Pointer to structure with the addresses of the SVM areas.
+ */
+struct svm_test_data *
+vcpu_alloc_svm(struct kvm_vm *vm, vm_vaddr_t *p_svm_gva)
+{
+       vm_vaddr_t svm_gva = vm_vaddr_alloc(vm, getpagesize(),
+                                           0x10000, 0, 0);
+       struct svm_test_data *svm = addr_gva2hva(vm, svm_gva);
+
+       svm->vmcb = (void *)vm_vaddr_alloc(vm, getpagesize(),
+                                          0x10000, 0, 0);
+       svm->vmcb_hva = addr_gva2hva(vm, (uintptr_t)svm->vmcb);
+       svm->vmcb_gpa = addr_gva2gpa(vm, (uintptr_t)svm->vmcb);
+
+       svm->save_area = (void *)vm_vaddr_alloc(vm, getpagesize(),
+                                               0x10000, 0, 0);
+       svm->save_area_hva = addr_gva2hva(vm, (uintptr_t)svm->save_area);
+       svm->save_area_gpa = addr_gva2gpa(vm, (uintptr_t)svm->save_area);
+
+       *p_svm_gva = svm_gva;
+       return svm;
+}
+
+static void vmcb_set_seg(struct vmcb_seg *seg, u16 selector,
+                        u64 base, u32 limit, u32 attr)
+{
+       seg->selector = selector;
+       seg->attrib = attr;
+       seg->limit = limit;
+       seg->base = base;
+}
+
+void generic_svm_setup(struct svm_test_data *svm, void *guest_rip, void *guest_rsp)
+{
+       struct vmcb *vmcb = svm->vmcb;
+       uint64_t vmcb_gpa = svm->vmcb_gpa;
+       struct vmcb_save_area *save = &vmcb->save;
+       struct vmcb_control_area *ctrl = &vmcb->control;
+       u32 data_seg_attr = 3 | SVM_SELECTOR_S_MASK | SVM_SELECTOR_P_MASK
+             | SVM_SELECTOR_DB_MASK | SVM_SELECTOR_G_MASK;
+       u32 code_seg_attr = 9 | SVM_SELECTOR_S_MASK | SVM_SELECTOR_P_MASK
+               | SVM_SELECTOR_L_MASK | SVM_SELECTOR_G_MASK;
+       uint64_t efer;
+
+       efer = rdmsr(MSR_EFER);
+       wrmsr(MSR_EFER, efer | EFER_SVME);
+       wrmsr(MSR_VM_HSAVE_PA, svm->save_area_gpa);
+
+       memset(vmcb, 0, sizeof(*vmcb));
+       asm volatile ("vmsave\n\t" : : "a" (vmcb_gpa) : "memory");
+       vmcb_set_seg(&save->es, get_es(), 0, -1U, data_seg_attr);
+       vmcb_set_seg(&save->cs, get_cs(), 0, -1U, code_seg_attr);
+       vmcb_set_seg(&save->ss, get_ss(), 0, -1U, data_seg_attr);
+       vmcb_set_seg(&save->ds, get_ds(), 0, -1U, data_seg_attr);
+       vmcb_set_seg(&save->gdtr, 0, get_gdt().address, get_gdt().size, 0);
+       vmcb_set_seg(&save->idtr, 0, get_idt().address, get_idt().size, 0);
+
+       ctrl->asid = 1;
+       save->cpl = 0;
+       save->efer = rdmsr(MSR_EFER);
+       asm volatile ("mov %%cr4, %0" : "=r"(save->cr4) : : "memory");
+       asm volatile ("mov %%cr3, %0" : "=r"(save->cr3) : : "memory");
+       asm volatile ("mov %%cr0, %0" : "=r"(save->cr0) : : "memory");
+       asm volatile ("mov %%dr7, %0" : "=r"(save->dr7) : : "memory");
+       asm volatile ("mov %%dr6, %0" : "=r"(save->dr6) : : "memory");
+       asm volatile ("mov %%cr2, %0" : "=r"(save->cr2) : : "memory");
+       save->g_pat = rdmsr(MSR_IA32_CR_PAT);
+       save->dbgctl = rdmsr(MSR_IA32_DEBUGCTLMSR);
+       ctrl->intercept = (1ULL << INTERCEPT_VMRUN) |
+                               (1ULL << INTERCEPT_VMMCALL);
+
+       vmcb->save.rip = (u64)guest_rip;
+       vmcb->save.rsp = (u64)guest_rsp;
+       guest_regs.rdi = (u64)svm;
+}
+
+/*
+ * save/restore 64-bit general registers except rax, rip, rsp
+ * which are directly handed through the VMCB guest processor state
+ */
+#define SAVE_GPR_C                             \
+       "xchg %%rbx, guest_regs+0x20\n\t"       \
+       "xchg %%rcx, guest_regs+0x10\n\t"       \
+       "xchg %%rdx, guest_regs+0x18\n\t"       \
+       "xchg %%rbp, guest_regs+0x30\n\t"       \
+       "xchg %%rsi, guest_regs+0x38\n\t"       \
+       "xchg %%rdi, guest_regs+0x40\n\t"       \
+       "xchg %%r8,  guest_regs+0x48\n\t"       \
+       "xchg %%r9,  guest_regs+0x50\n\t"       \
+       "xchg %%r10, guest_regs+0x58\n\t"       \
+       "xchg %%r11, guest_regs+0x60\n\t"       \
+       "xchg %%r12, guest_regs+0x68\n\t"       \
+       "xchg %%r13, guest_regs+0x70\n\t"       \
+       "xchg %%r14, guest_regs+0x78\n\t"       \
+       "xchg %%r15, guest_regs+0x80\n\t"
+
+#define LOAD_GPR_C      SAVE_GPR_C
+
+/*
+ * Selftests do not use interrupts, so clgi/sti/cli/stgi are dropped
+ * for now. The registers involved in LOAD/SAVE_GPR_C end up
+ * unmodified, so they do not need to be in the clobber list.
+ */
+void run_guest(struct vmcb *vmcb, uint64_t vmcb_gpa)
+{
+       asm volatile (
+               "vmload\n\t"
+               "mov rflags, %%r15\n\t" // rflags
+               "mov %%r15, 0x170(%[vmcb])\n\t"
+               "mov guest_regs, %%r15\n\t"     // rax
+               "mov %%r15, 0x1f8(%[vmcb])\n\t"
+               LOAD_GPR_C
+               "vmrun\n\t"
+               SAVE_GPR_C
+               "mov 0x170(%[vmcb]), %%r15\n\t" // rflags
+               "mov %%r15, rflags\n\t"
+               "mov 0x1f8(%[vmcb]), %%r15\n\t" // rax
+               "mov %%r15, guest_regs\n\t"
+               "vmsave\n\t"
+               : : [vmcb] "r" (vmcb), [vmcb_gpa] "a" (vmcb_gpa)
+               : "r15", "memory");
+}
+
+void nested_svm_check_supported(void)
+{
+       struct kvm_cpuid_entry2 *entry =
+               kvm_get_supported_cpuid_entry(0x80000001);
+
+       if (!(entry->ecx & CPUID_SVM)) {
+               fprintf(stderr, "nested SVM not enabled, skipping test\n");
+               exit(KSFT_SKIP);
+       }
+}
+
diff --git a/tools/testing/selftests/kvm/lib/x86_64/vmx.c b/tools/testing/selftests/kvm/lib/x86_64/vmx.c

index 85064baf5e97c2fe363c4f662ca2e0f274e4fa21..7aaa99ca4dbc3a34aabfbcda41239742be46f3b6 100644 (file)
--- a/tools/testing/selftests/kvm/lib/x86_64/vmx.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/vmx.c
@@ -288,9 +288,9 @@ static inline void init_vmcs_host_state(void)
         vmwrite(HOST_FS_BASE, rdmsr(MSR_FS_BASE));
         vmwrite(HOST_GS_BASE, rdmsr(MSR_GS_BASE));
         vmwrite(HOST_TR_BASE,
-               get_desc64_base((struct desc64 *)(get_gdt_base() + get_tr())));
-       vmwrite(HOST_GDTR_BASE, get_gdt_base());
-       vmwrite(HOST_IDTR_BASE, get_idt_base());
+               get_desc64_base((struct desc64 *)(get_gdt().address + get_tr())));
+       vmwrite(HOST_GDTR_BASE, get_gdt().address);
+       vmwrite(HOST_IDTR_BASE, get_idt().address);
         vmwrite(HOST_IA32_SYSENTER_ESP, rdmsr(MSR_IA32_SYSENTER_ESP));
         vmwrite(HOST_IA32_SYSENTER_EIP, rdmsr(MSR_IA32_SYSENTER_EIP));
  }
diff --git a/tools/testing/selftests/kvm/x86_64/svm_vmcall_test.c b/tools/testing/selftests/kvm/x86_64/svm_vmcall_test.c

new file mode 100644 (file)

index 0000000..e280f68
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/svm_vmcall_test.c
@@ -0,0 +1,79 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * svm_vmcall_test
+ *
+ * Copyright (C) 2020, Red Hat, Inc.
+ *
+ * Nested SVM testing: VMCALL
+ */
+
+#include "test_util.h"
+#include "kvm_util.h"
+#include "processor.h"
+#include "svm_util.h"
+
+#define VCPU_ID                5
+
+static struct kvm_vm *vm;
+
+static void l2_guest_code(struct svm_test_data *svm)
+{
+       __asm__ __volatile__("vmcall");
+}
+
+static void l1_guest_code(struct svm_test_data *svm)
+{
+       #define L2_GUEST_STACK_SIZE 64
+       unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE];
+       struct vmcb *vmcb = svm->vmcb;
+
+       /* Prepare for L2 execution. */
+       generic_svm_setup(svm, l2_guest_code,
+                         &l2_guest_stack[L2_GUEST_STACK_SIZE]);
+
+       run_guest(vmcb, svm->vmcb_gpa);
+
+       GUEST_ASSERT(vmcb->control.exit_code == SVM_EXIT_VMMCALL);
+       GUEST_DONE();
+}
+
+int main(int argc, char *argv[])
+{
+       vm_vaddr_t svm_gva;
+
+       nested_svm_check_supported();
+
+       vm = vm_create_default(VCPU_ID, 0, (void *) l1_guest_code);
+       vcpu_set_cpuid(vm, VCPU_ID, kvm_get_supported_cpuid());
+
+       vcpu_alloc_svm(vm, &svm_gva);
+       vcpu_args_set(vm, VCPU_ID, 1, svm_gva);
+
+       for (;;) {
+               volatile struct kvm_run *run = vcpu_state(vm, VCPU_ID);
+               struct ucall uc;
+
+               vcpu_run(vm, VCPU_ID);
+               TEST_ASSERT(run->exit_reason == KVM_EXIT_IO,
+                           "Got exit_reason other than KVM_EXIT_IO: %u (%s)\n",
+                           run->exit_reason,
+                           exit_reason_str(run->exit_reason));
+
+               switch (get_ucall(vm, VCPU_ID, &uc)) {
+               case UCALL_ABORT:
+                       TEST_ASSERT(false, "%s",
+                                   (const char *)uc.args[0]);
+                       /* NOT REACHED */
+               case UCALL_SYNC:
+                       break;
+               case UCALL_DONE:
+                       goto done;
+               default:
+                       TEST_ASSERT(false,
+                                   "Unknown ucall 0x%x.", uc.cmd);
+               }
+       }
+done:
+       kvm_vm_free(vm);
+       return 0;
+}
diff --git a/tools/testing/selftests/lib.mk b/tools/testing/selftests/lib.mk

index 1c8a1963d03f8acf349ae209e5db03ab79e2614e..3ed0134a764d494f6b451fddfa45fee58f11611a 100644 (file)
--- a/tools/testing/selftests/lib.mk
+++ b/tools/testing/selftests/lib.mk
@@ -83,17 +83,20 @@ else
         $(call RUN_TESTS, $(TEST_GEN_PROGS) $(TEST_CUSTOM_PROGS) $(TEST_PROGS))
  endif
  
+define INSTALL_SINGLE_RULE
+       $(if $(INSTALL_LIST),@mkdir -p $(INSTALL_PATH))
+       $(if $(INSTALL_LIST),@echo rsync -a $(INSTALL_LIST) $(INSTALL_PATH)/)
+       $(if $(INSTALL_LIST),@rsync -a $(INSTALL_LIST) $(INSTALL_PATH)/)
+endef
+
  define INSTALL_RULE
-       @if [ "X$(TEST_PROGS)$(TEST_PROGS_EXTENDED)$(TEST_FILES)" != "X" ]; then                                        \
-               mkdir -p ${INSTALL_PATH};                                                                               \
-               echo "rsync -a $(TEST_PROGS) $(TEST_PROGS_EXTENDED) $(TEST_FILES) $(INSTALL_PATH)/";    \
-               rsync -a $(TEST_PROGS) $(TEST_PROGS_EXTENDED) $(TEST_FILES) $(INSTALL_PATH)/;           \
-       fi
-       @if [ "X$(TEST_GEN_PROGS)$(TEST_CUSTOM_PROGS)$(TEST_GEN_PROGS_EXTENDED)$(TEST_GEN_FILES)" != "X" ]; then                                        \
-               mkdir -p ${INSTALL_PATH};                                                                               \
-               echo "rsync -a $(TEST_GEN_PROGS) $(TEST_CUSTOM_PROGS) $(TEST_GEN_PROGS_EXTENDED) $(TEST_GEN_FILES) $(INSTALL_PATH)/";   \
-               rsync -a $(TEST_GEN_PROGS) $(TEST_CUSTOM_PROGS) $(TEST_GEN_PROGS_EXTENDED) $(TEST_GEN_FILES) $(INSTALL_PATH)/;          \
-       fi
+       $(eval INSTALL_LIST = $(TEST_PROGS)) $(INSTALL_SINGLE_RULE)
+       $(eval INSTALL_LIST = $(TEST_PROGS_EXTENDED)) $(INSTALL_SINGLE_RULE)
+       $(eval INSTALL_LIST = $(TEST_FILES)) $(INSTALL_SINGLE_RULE)
+       $(eval INSTALL_LIST = $(TEST_GEN_PROGS)) $(INSTALL_SINGLE_RULE)
+       $(eval INSTALL_LIST = $(TEST_CUSTOM_PROGS)) $(INSTALL_SINGLE_RULE)
+       $(eval INSTALL_LIST = $(TEST_GEN_PROGS_EXTENDED)) $(INSTALL_SINGLE_RULE)
+       $(eval INSTALL_LIST = $(TEST_GEN_FILES)) $(INSTALL_SINGLE_RULE)
  endef
  
  install: all
diff --git a/tools/testing/selftests/livepatch/Makefile b/tools/testing/selftests/livepatch/Makefile

index 3876d8d62494443297b0da55cb88bbe5f2da5210..1acc9e1fa3fbca78db5fca858f01c042a2fe62ff 100644 (file)
--- a/tools/testing/selftests/livepatch/Makefile
+++ b/tools/testing/selftests/livepatch/Makefile
@@ -8,4 +8,6 @@ TEST_PROGS := \
         test-state.sh \
         test-ftrace.sh
  
+TEST_FILES := settings
+
  include ../lib.mk
diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile

index b5694196430ae2334b83cdeb465bf3e7d36080f5..287ae916ec0b4b8e22b886ec65ac0c21fdedc72a 100644 (file)
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -27,5 +27,5 @@ KSFT_KHDR_INSTALL := 1
  include ../lib.mk
  
  $(OUTPUT)/reuseport_bpf_numa: LDLIBS += -lnuma
-$(OUTPUT)/tcp_mmap: LDFLAGS += -lpthread
-$(OUTPUT)/tcp_inq: LDFLAGS += -lpthread
+$(OUTPUT)/tcp_mmap: LDLIBS += -lpthread
+$(OUTPUT)/tcp_inq: LDLIBS += -lpthread
diff --git a/tools/testing/selftests/net/fib_tests.sh b/tools/testing/selftests/net/fib_tests.sh

index 6dd4031038008bef871a66ea1e3ebfdf7d8b964f..60273f1bc7d9c0cfe8324d8bd4b256cc88ad2fff 100755 (executable)
--- a/tools/testing/selftests/net/fib_tests.sh
+++ b/tools/testing/selftests/net/fib_tests.sh
@@ -910,6 +910,12 @@ ipv6_rt_replace_mpath()
         check_route6 "2001:db8:104::/64 via 2001:db8:101::3 dev veth1 metric 1024"
         log_test $? 0 "Multipath with single path via multipath attribute"
  
+       # multipath with dev-only
+       add_initial_route6 "nexthop via 2001:db8:101::2 nexthop via 2001:db8:103::2"
+       run_cmd "$IP -6 ro replace 2001:db8:104::/64 dev veth1"
+       check_route6 "2001:db8:104::/64 dev veth1 metric 1024"
+       log_test $? 0 "Multipath with dev-only"
+
         # route replace fails - invalid nexthop 1
         add_initial_route6 "nexthop via 2001:db8:101::2 nexthop via 2001:db8:103::2"
         run_cmd "$IP -6 ro replace 2001:db8:104::/64 nexthop via 2001:db8:111::3 nexthop via 2001:db8:103::3"
diff --git a/tools/testing/selftests/net/forwarding/mirror_gre.sh b/tools/testing/selftests/net/forwarding/mirror_gre.sh

index e6fd7a18c655ff32e4f4d5ffd033acd866a70c96..0266443601bc0bb90504132a1c1b57aced95089b 100755 (executable)
--- a/tools/testing/selftests/net/forwarding/mirror_gre.sh
+++ b/tools/testing/selftests/net/forwarding/mirror_gre.sh
@@ -63,22 +63,23 @@ test_span_gre_mac()
  {
         local tundev=$1; shift
         local direction=$1; shift
-       local prot=$1; shift
         local what=$1; shift
  
-       local swp3mac=$(mac_get $swp3)
-       local h3mac=$(mac_get $h3)
+       case "$direction" in
+       ingress) local src_mac=$(mac_get $h1); local dst_mac=$(mac_get $h2)
+               ;;
+       egress) local src_mac=$(mac_get $h2); local dst_mac=$(mac_get $h1)
+               ;;
+       esac
  
         RET=0
  
         mirror_install $swp1 $direction $tundev "matchall $tcflags"
-       tc filter add dev $h3 ingress pref 77 prot $prot \
-               flower ip_proto 0x2f src_mac $swp3mac dst_mac $h3mac \
-               action pass
+       icmp_capture_install h3-${tundev} "src_mac $src_mac dst_mac $dst_mac"
  
-       mirror_test v$h1 192.0.2.1 192.0.2.2 $h3 77 10
+       mirror_test v$h1 192.0.2.1 192.0.2.2 h3-${tundev} 100 10
  
-       tc filter del dev $h3 ingress pref 77
+       icmp_capture_uninstall h3-${tundev}
         mirror_uninstall $swp1 $direction
  
         log_test "$direction $what: envelope MAC ($tcflags)"
@@ -120,14 +121,14 @@ test_ip6gretap()
  
  test_gretap_mac()
  {
-       test_span_gre_mac gt4 ingress ip "mirror to gretap"
-       test_span_gre_mac gt4 egress ip "mirror to gretap"
+       test_span_gre_mac gt4 ingress "mirror to gretap"
+       test_span_gre_mac gt4 egress "mirror to gretap"
  }
  
  test_ip6gretap_mac()
  {
-       test_span_gre_mac gt6 ingress ipv6 "mirror to ip6gretap"
-       test_span_gre_mac gt6 egress ipv6 "mirror to ip6gretap"
+       test_span_gre_mac gt6 ingress "mirror to ip6gretap"
+       test_span_gre_mac gt6 egress "mirror to ip6gretap"
  }
  
  test_all()
diff --git a/tools/testing/selftests/net/forwarding/vxlan_bridge_1d.sh b/tools/testing/selftests/net/forwarding/vxlan_bridge_1d.sh

index bb10e33690b25a763d3de18cce207a098baf7351..ce6bea9675c074587ca1ebc51a3402cdcf782ee0 100755 (executable)
--- a/tools/testing/selftests/net/forwarding/vxlan_bridge_1d.sh
+++ b/tools/testing/selftests/net/forwarding/vxlan_bridge_1d.sh
@@ -516,9 +516,9 @@ test_tos()
         RET=0
  
         tc filter add dev v1 egress pref 77 prot ip \
-               flower ip_tos 0x40 action pass
-       vxlan_ping_test $h1 192.0.2.3 "-Q 0x40" v1 egress 77 10
-       vxlan_ping_test $h1 192.0.2.3 "-Q 0x30" v1 egress 77 0
+               flower ip_tos 0x14 action pass
+       vxlan_ping_test $h1 192.0.2.3 "-Q 0x14" v1 egress 77 10
+       vxlan_ping_test $h1 192.0.2.3 "-Q 0x18" v1 egress 77 0
         tc filter del dev v1 egress pref 77 prot ip
  
         log_test "VXLAN: envelope TOS inheritance"
diff --git a/tools/testing/selftests/net/mptcp/Makefile b/tools/testing/selftests/net/mptcp/Makefile

index 93de52016ddee400e64ae3ed9eceac8a3b10834b..ba450e62dc5be5e5bccd5f36753ca81c1a74dff7 100644 (file)
--- a/tools/testing/selftests/net/mptcp/Makefile
+++ b/tools/testing/selftests/net/mptcp/Makefile
@@ -8,6 +8,8 @@ TEST_PROGS := mptcp_connect.sh
  
  TEST_GEN_FILES = mptcp_connect
  
+TEST_FILES := settings
+
  EXTRA_CLEAN := *.pcap
  
  include ../../lib.mk
diff --git a/tools/testing/selftests/netfilter/nft_concat_range.sh b/tools/testing/selftests/netfilter/nft_concat_range.sh

index aca21dde102abbb05294594d8b3f7483873b87f5..5a4938d6dcf25a3f8137be799091f4f0d16ad7fd 100755 (executable)
--- a/tools/testing/selftests/netfilter/nft_concat_range.sh
+++ b/tools/testing/selftests/netfilter/nft_concat_range.sh
@@ -13,11 +13,12 @@
  KSELFTEST_SKIP=4
  
  # Available test groups:
+# - reported_issues: check for issues that were reported in the past
  # - correctness: check that packets match given entries, and only those
  # - concurrency: attempt races between insertion, deletion and lookup
  # - timeout: check that packets match entries until they expire
  # - performance: estimate matching rate, compare with rbtree and hash baselines
-TESTS="correctness concurrency timeout"
+TESTS="reported_issues correctness concurrency timeout"
  [ "${quicktest}" != "1" ] && TESTS="${TESTS} performance"
  
  # Set types, defined by TYPE_ variables below
@@ -25,6 +26,9 @@ TYPES="net_port port_net net6_port port_proto net6_port_mac net6_port_mac_proto
         net_port_net net_mac net_mac_icmp net6_mac_icmp net6_port_net6_port
         net_port_mac_proto_net"
  
+# Reported bugs, also described by TYPE_ variables below
+BUGS="flush_remove_add"
+
  # List of possible paths to pktgen script from kernel tree for performance tests
  PKTGEN_SCRIPT_PATHS="
         ../../../samples/pktgen/pktgen_bench_xmit_mode_netif_receive.sh
@@ -327,6 +331,12 @@ flood_spec ip daddr . tcp dport . meta l4proto . ip saddr
  perf_duration  0
  "
  
+# Definition of tests for bugs reported in the past:
+# display      display text for test report
+TYPE_flush_remove_add="
+display                Add two elements, flush, re-add
+"
+
  # Set template for all tests, types and rules are filled in depending on test
  set_template='
  flush ruleset
@@ -440,6 +450,8 @@ setup_set() {
  
  # Check that at least one of the needed tools is available
  check_tools() {
+       [ -z "${tools}" ] && return 0
+
         __tools=
         for tool in ${tools}; do
                 if [ "${tool}" = "nc" ] && [ "${proto}" = "udp6" ] && \
@@ -1025,7 +1037,7 @@ format_noconcat() {
  add() {
         if ! nft add element inet filter test "${1}"; then
                 err "Failed to add ${1} given ruleset:"
-               err "$(nft list ruleset -a)"
+               err "$(nft -a list ruleset)"
                 return 1
         fi
  }
@@ -1045,7 +1057,7 @@ add_perf() {
  add_perf_norange() {
         if ! nft add element netdev perf norange "${1}"; then
                 err "Failed to add ${1} given ruleset:"
-               err "$(nft list ruleset -a)"
+               err "$(nft -a list ruleset)"
                 return 1
         fi
  }
@@ -1054,7 +1066,7 @@ add_perf_norange() {
  add_perf_noconcat() {
         if ! nft add element netdev perf noconcat "${1}"; then
                 err "Failed to add ${1} given ruleset:"
-               err "$(nft list ruleset -a)"
+               err "$(nft -a list ruleset)"
                 return 1
         fi
  }
@@ -1063,7 +1075,7 @@ add_perf_noconcat() {
  del() {
         if ! nft delete element inet filter test "${1}"; then
                 err "Failed to delete ${1} given ruleset:"
-               err "$(nft list ruleset -a)"
+               err "$(nft -a list ruleset)"
                 return 1
         fi
  }
@@ -1134,7 +1146,7 @@ send_match() {
                 err "  $(for f in ${src}; do
                          eval format_\$f "${2}"; printf ' '; done)"
                 err "should have matched ruleset:"
-               err "$(nft list ruleset -a)"
+               err "$(nft -a list ruleset)"
                 return 1
         fi
         nft reset counter inet filter test >/dev/null
@@ -1160,7 +1172,7 @@ send_nomatch() {
                 err "  $(for f in ${src}; do
                          eval format_\$f "${2}"; printf ' '; done)"
                 err "should not have matched ruleset:"
-               err "$(nft list ruleset -a)"
+               err "$(nft -a list ruleset)"
                 return 1
         fi
  }
@@ -1430,6 +1442,23 @@ test_performance() {
         kill "${perf_pid}"
  }
  
+test_bug_flush_remove_add() {
+       set_cmd='{ set s { type ipv4_addr . inet_service; flags interval; }; }'
+       elem1='{ 10.0.0.1 . 22-25, 10.0.0.1 . 10-20 }'
+       elem2='{ 10.0.0.1 . 10-20, 10.0.0.1 . 22-25 }'
+       for i in `seq 1 100`; do
+               nft add table t ${set_cmd}      || return ${KSELFTEST_SKIP}
+               nft add element t s ${elem1}    2>/dev/null || return 1
+               nft flush set t s               2>/dev/null || return 1
+               nft add element t s ${elem2}    2>/dev/null || return 1
+       done
+       nft flush ruleset
+}
+
+test_reported_issues() {
+       eval test_bug_"${subtest}"
+}
+
  # Run everything in a separate network namespace
  [ "${1}" != "run" ] && { unshare -n "${0}" run; exit $?; }
  tmp="$(mktemp)"
@@ -1438,9 +1467,15 @@ trap cleanup EXIT
  # Entry point for test runs
  passed=0
  for name in ${TESTS}; do
-       printf "TEST: %s\n" "${name}"
-       for type in ${TYPES}; do
-               eval desc=\$TYPE_"${type}"
+       printf "TEST: %s\n" "$(echo ${name} | tr '_' ' ')"
+       if [ "${name}" = "reported_issues" ]; then
+               SUBTESTS="${BUGS}"
+       else
+               SUBTESTS="${TYPES}"
+       fi
+
+       for subtest in ${SUBTESTS}; do
+               eval desc=\$TYPE_"${subtest}"
                 IFS='
  '
                 for __line in ${desc}; do
diff --git a/tools/testing/selftests/openat2/helpers.c b/tools/testing/selftests/openat2/helpers.c

index e9a6557ab16f3b347d98d84d80d008f2488c7bdd..5074681ffdc995dbe3cc903550ef63f3385ba25a 100644 (file)
--- a/tools/testing/selftests/openat2/helpers.c
+++ b/tools/testing/selftests/openat2/helpers.c
@@ -46,7 +46,7 @@ int sys_renameat2(int olddirfd, const char *oldpath,
  
  int touchat(int dfd, const char *path)
  {
-       int fd = openat(dfd, path, O_CREAT);
+       int fd = openat(dfd, path, O_CREAT, 0700);
         if (fd >= 0)
                 close(fd);
         return fd;
diff --git a/tools/testing/selftests/openat2/resolve_test.c b/tools/testing/selftests/openat2/resolve_test.c

index 7a94b1da8e7bcf2a8413bad5abb3a2cb2de6ed8d..bbafad440893cba427168d42fe3e1620a8dc00e5 100644 (file)
--- a/tools/testing/selftests/openat2/resolve_test.c
+++ b/tools/testing/selftests/openat2/resolve_test.c
@@ -230,7 +230,7 @@ void test_openat2_opath_tests(void)
                 { .name = "[in_root] garbage link to /root",
                   .path = "cheeky/garbageself", .how.resolve = RESOLVE_IN_ROOT,
                   .out.path = "root",           .pass = true },
-               { .name = "[in_root] chainged garbage links to /root",
+               { .name = "[in_root] chained garbage links to /root",
                   .path = "abscheeky/garbageself", .how.resolve = RESOLVE_IN_ROOT,
                   .out.path = "root",           .pass = true },
                 { .name = "[in_root] relative path to 'root'",
diff --git a/tools/testing/selftests/rseq/Makefile b/tools/testing/selftests/rseq/Makefile

index d6469535630af888051a9c50e8059a8a2abfac8a..2af9d39a97168c14dc4c008d59017d45146ff5bf 100644 (file)
--- a/tools/testing/selftests/rseq/Makefile
+++ b/tools/testing/selftests/rseq/Makefile
@@ -4,7 +4,7 @@ ifneq ($(shell $(CC) --version 2>&1 | head -n 1 | grep clang),)
  CLANG_FLAGS += -no-integrated-as
  endif
  
-CFLAGS += -O2 -Wall -g -I./ -I../../../../usr/include/ -L./ -Wl,-rpath=./ \
+CFLAGS += -O2 -Wall -g -I./ -I../../../../usr/include/ -L$(OUTPUT) -Wl,-rpath=./ \
           $(CLANG_FLAGS)
  LDLIBS += -lpthread
  
@@ -19,6 +19,8 @@ TEST_GEN_PROGS_EXTENDED = librseq.so
  
  TEST_PROGS = run_param_test.sh
  
+TEST_FILES := settings
+
  include ../lib.mk
  
  $(OUTPUT)/librseq.so: rseq.c rseq.h rseq-*.h
diff --git a/tools/testing/selftests/rtc/Makefile b/tools/testing/selftests/rtc/Makefile

index de9c8566672ae7e4bd7d6d7c321455ff5525b5a2..55198ecc04dbea93c4fb358ebac715527e2755d4 100644 (file)
--- a/tools/testing/selftests/rtc/Makefile
+++ b/tools/testing/selftests/rtc/Makefile
@@ -1,9 +1,11 @@
  # SPDX-License-Identifier: GPL-2.0
  CFLAGS += -O3 -Wl,-no-as-needed -Wall
-LDFLAGS += -lrt -lpthread -lm
+LDLIBS += -lrt -lpthread -lm
  
  TEST_GEN_PROGS = rtctest
  
  TEST_GEN_PROGS_EXTENDED = setdate
  
+TEST_FILES := settings
+
  include ../lib.mk
diff --git a/tools/testing/selftests/timens/Makefile b/tools/testing/selftests/timens/Makefile

index e9fb30bd8aeb3c0ed846ccdd2aeaf18cb6723592..b4fd9a9346547c71e8331d0ccb165ada9200d15e 100644 (file)
--- a/tools/testing/selftests/timens/Makefile
+++ b/tools/testing/selftests/timens/Makefile
@@ -2,6 +2,6 @@ TEST_GEN_PROGS := timens timerfd timer clock_nanosleep procfs exec
  TEST_GEN_PROGS_EXTENDED := gettime_perf
  
  CFLAGS := -Wall -Werror -pthread
-LDFLAGS := -lrt -ldl
+LDLIBS := -lrt -ldl
  
  include ../lib.mk
diff --git a/tools/testing/selftests/tpm2/test_smoke.sh b/tools/testing/selftests/tpm2/test_smoke.sh

index 8155c2ea7ccbb6ed1b9685f3c602105eb2c26173..b630c7b5950a960b95a9bb0aeda79e41924ca989 100755 (executable)
--- a/tools/testing/selftests/tpm2/test_smoke.sh
+++ b/tools/testing/selftests/tpm2/test_smoke.sh
@@ -1,8 +1,17 @@
  #!/bin/bash
  # SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause)
+self.flags = flags
  
-python -m unittest -v tpm2_tests.SmokeTest
-python -m unittest -v tpm2_tests.AsyncTest
+# Kselftest framework requirement - SKIP code is 4.
+ksft_skip=4
+
+
+if [ -f /dev/tpm0 ] ; then
+       python -m unittest -v tpm2_tests.SmokeTest
+       python -m unittest -v tpm2_tests.AsyncTest
+else
+       exit $ksft_skip
+fi
  
  CLEAR_CMD=$(which tpm2_clear)
  if [ -n $CLEAR_CMD ]; then
diff --git a/tools/testing/selftests/tpm2/test_space.sh b/tools/testing/selftests/tpm2/test_space.sh

index a6f5e346635e560db7c7767b22f32f760922600a..180b469c53b47d4e0e2e3d7c6bb2ecc7331065e2 100755 (executable)
--- a/tools/testing/selftests/tpm2/test_space.sh
+++ b/tools/testing/selftests/tpm2/test_space.sh
@@ -1,4 +1,11 @@
  #!/bin/bash
  # SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause)
  
-python -m unittest -v tpm2_tests.SpaceTest
+# Kselftest framework requirement - SKIP code is 4.
+ksft_skip=4
+
+if [ -f /dev/tpmrm0 ] ; then
+       python -m unittest -v tpm2_tests.SpaceTest
+else
+       exit $ksft_skip
+fi
diff --git a/tools/testing/selftests/vm/run_vmtests b/tools/testing/selftests/vm/run_vmtests

index a692ea8283177dc8716ae6af24c2b7684471e4c4..f337148431980b821b058a87fb73bd3679d0b316 100755 (executable)
--- a/tools/testing/selftests/vm/run_vmtests
+++ b/tools/testing/selftests/vm/run_vmtests
@@ -112,6 +112,17 @@ echo "NOTE: The above hugetlb tests provide minimal coverage.  Use"
  echo "      https://github.com/libhugetlbfs/libhugetlbfs.git for"
  echo "      hugetlb regression testing."
  
+echo "---------------------------"
+echo "running map_fixed_noreplace"
+echo "---------------------------"
+./map_fixed_noreplace
+if [ $? -ne 0 ]; then
+       echo "[FAIL]"
+       exitcode=1
+else
+       echo "[PASS]"
+fi
+
  echo "-------------------"
  echo "running userfaultfd"
  echo "-------------------"
@@ -186,6 +197,17 @@ else
         echo "[PASS]"
  fi
  
+echo "-------------------------"
+echo "running mlock-random-test"
+echo "-------------------------"
+./mlock-random-test
+if [ $? -ne 0 ]; then
+       echo "[FAIL]"
+       exitcode=1
+else
+       echo "[PASS]"
+fi
+
  echo "--------------------"
  echo "running mlock2-tests"
  echo "--------------------"
@@ -197,6 +219,17 @@ else
         echo "[PASS]"
  fi
  
+echo "-----------------"
+echo "running thuge-gen"
+echo "-----------------"
+./thuge-gen
+if [ $? -ne 0 ]; then
+       echo "[FAIL]"
+       exitcode=1
+else
+       echo "[PASS]"
+fi
+
  if [ $VADDR64 -ne 0 ]; then
  echo "-----------------------------"
  echo "running virtual_address_range"
diff --git a/tools/testing/selftests/wireguard/netns.sh b/tools/testing/selftests/wireguard/netns.sh

index f5ab1cda8bb55c49ae30ae91919b33922d5f240e..138d46b3f3306bce463da910cd2f37066655f24b 100755 (executable)
--- a/tools/testing/selftests/wireguard/netns.sh
+++ b/tools/testing/selftests/wireguard/netns.sh
@@ -24,6 +24,7 @@
  set -e
  
  exec 3>&1
+export LANG=C
  export WG_HIDE_KEYS=never
  netns0="wg-test-$$-0"
  netns1="wg-test-$$-1"
@@ -297,7 +298,17 @@ ip1 -4 rule add table main suppress_prefixlength 0
  n1 ping -W 1 -c 100 -f 192.168.99.7
  n1 ping -W 1 -c 100 -f abab::1111
  
+# Have ns2 NAT into wg0 packets from ns0, but return an icmp error along the right route.
+n2 iptables -t nat -A POSTROUTING -s 10.0.0.0/24 -d 192.168.241.0/24 -j SNAT --to 192.168.241.2
+n0 iptables -t filter -A INPUT \! -s 10.0.0.0/24 -i vethrs -j DROP # Manual rpfilter just to be explicit.
+n2 bash -c 'printf 1 > /proc/sys/net/ipv4/ip_forward'
+ip0 -4 route add 192.168.241.1 via 10.0.0.100
+n2 wg set wg0 peer "$pub1" remove
+[[ $(! n0 ping -W 1 -c 1 192.168.241.1 || false) == *"From 10.0.0.100 icmp_seq=1 Destination Host Unreachable"* ]]
+
  n0 iptables -t nat -F
+n0 iptables -t filter -F
+n2 iptables -t nat -F
  ip0 link del vethrc
  ip0 link del vethrs
  ip1 link del wg0
diff --git a/tools/testing/selftests/wireguard/qemu/Makefile b/tools/testing/selftests/wireguard/qemu/Makefile

index f10aa3590adc425b4d564eb91adfe4a77250e1bd..28d477683e8abe3c492274f7a71c8b48f9bda00e 100644 (file)
--- a/tools/testing/selftests/wireguard/qemu/Makefile
+++ b/tools/testing/selftests/wireguard/qemu/Makefile
@@ -38,19 +38,17 @@ endef
  define file_download =
  $(DISTFILES_PATH)/$(1):
         mkdir -p $(DISTFILES_PATH)
-       flock -x $$@.lock -c '[ -f $$@ ] && exit 0; wget -O $$@.tmp $(MIRROR)$(1) || wget -O $$@.tmp $(2)$(1) || rm -f $$@.tmp'
-       if echo "$(3)  $$@.tmp" | sha256sum -c -; then mv $$@.tmp $$@; else rm -f $$@.tmp; exit 71; fi
+       flock -x $$@.lock -c '[ -f $$@ ] && exit 0; wget -O $$@.tmp $(MIRROR)$(1) || wget -O $$@.tmp $(2)$(1) || rm -f $$@.tmp; [ -f $$@.tmp ] || exit 1; if echo "$(3)  $$@.tmp" | sha256sum -c -; then mv $$@.tmp $$@; else rm -f $$@.tmp; exit 71; fi'
  endef
  
  $(eval $(call tar_download,MUSL,musl,1.1.24,.tar.gz,https://www.musl-libc.org/releases/,1370c9a812b2cf2a7d92802510cca0058cc37e66a7bedd70051f0a34015022a3))
-$(eval $(call tar_download,LIBMNL,libmnl,1.0.4,.tar.bz2,https://www.netfilter.org/projects/libmnl/files/,171f89699f286a5854b72b91d06e8f8e3683064c5901fb09d954a9ab6f551f81))
  $(eval $(call tar_download,IPERF,iperf,3.7,.tar.gz,https://downloads.es.net/pub/iperf/,d846040224317caf2f75c843d309a950a7db23f9b44b94688ccbe557d6d1710c))
  $(eval $(call tar_download,BASH,bash,5.0,.tar.gz,https://ftp.gnu.org/gnu/bash/,b4a80f2ac66170b2913efbfb9f2594f1f76c7b1afd11f799e22035d63077fb4d))
  $(eval $(call tar_download,IPROUTE2,iproute2,5.4.0,.tar.xz,https://www.kernel.org/pub/linux/utils/net/iproute2/,fe97aa60a0d4c5ac830be18937e18dc3400ca713a33a89ad896ff1e3d46086ae))
  $(eval $(call tar_download,IPTABLES,iptables,1.8.4,.tar.bz2,https://www.netfilter.org/projects/iptables/files/,993a3a5490a544c2cbf2ef15cf7e7ed21af1845baf228318d5c36ef8827e157c))
  $(eval $(call tar_download,NMAP,nmap,7.80,.tar.bz2,https://nmap.org/dist/,fcfa5a0e42099e12e4bf7a68ebe6fde05553383a682e816a7ec9256ab4773faa))
  $(eval $(call tar_download,IPUTILS,iputils,s20190709,.tar.gz,https://github.com/iputils/iputils/archive/s20190709.tar.gz/#,a15720dd741d7538dd2645f9f516d193636ae4300ff7dbc8bfca757bf166490a))
-$(eval $(call tar_download,WIREGUARD_TOOLS,wireguard-tools,1.0.20191226,.tar.xz,https://git.zx2c4.com/wireguard-tools/snapshot/,aa8af0fdc9872d369d8c890a84dbc2a2466b55795dccd5b47721b2d97644b04f))
+$(eval $(call tar_download,WIREGUARD_TOOLS,wireguard-tools,1.0.20200206,.tar.xz,https://git.zx2c4.com/wireguard-tools/snapshot/,f5207248c6a3c3e3bfc9ab30b91c1897b00802ed861e1f9faaed873366078c64))
  
  KERNEL_BUILD_PATH := $(BUILD_PATH)/kernel$(if $(findstring yes,$(DEBUG_KERNEL)),-debug)
  rwildcard=$(foreach d,$(wildcard $1*),$(call rwildcard,$d/,$2) $(filter $(subst *,%,$2),$d))
@@ -295,21 +293,13 @@ $(IPERF_PATH)/src/iperf3: | $(IPERF_PATH)/.installed $(USERSPACE_DEPS)
         $(MAKE) -C $(IPERF_PATH)
         $(STRIP) -s $@
  
-$(LIBMNL_PATH)/.installed: $(LIBMNL_TAR)
-       flock -s $<.lock tar -C $(BUILD_PATH) -xf $<
-       touch $@
-
-$(LIBMNL_PATH)/src/.libs/libmnl.a: | $(LIBMNL_PATH)/.installed $(USERSPACE_DEPS)
-       cd $(LIBMNL_PATH) && ./configure --prefix=/ $(CROSS_COMPILE_FLAG) --enable-static --disable-shared
-       $(MAKE) -C $(LIBMNL_PATH)
-       sed -i 's:prefix=.*:prefix=$(LIBMNL_PATH):' $(LIBMNL_PATH)/libmnl.pc
-
  $(WIREGUARD_TOOLS_PATH)/.installed: $(WIREGUARD_TOOLS_TAR)
+       mkdir -p $(BUILD_PATH)
         flock -s $<.lock tar -C $(BUILD_PATH) -xf $<
         touch $@
  
-$(WIREGUARD_TOOLS_PATH)/src/wg: | $(WIREGUARD_TOOLS_PATH)/.installed $(LIBMNL_PATH)/src/.libs/libmnl.a $(USERSPACE_DEPS)
-       LDFLAGS="$(LDFLAGS) -L$(LIBMNL_PATH)/src/.libs" $(MAKE) -C $(WIREGUARD_TOOLS_PATH)/src LIBMNL_CFLAGS="-I$(LIBMNL_PATH)/include" LIBMNL_LDLIBS="-lmnl" wg
+$(WIREGUARD_TOOLS_PATH)/src/wg: | $(WIREGUARD_TOOLS_PATH)/.installed $(USERSPACE_DEPS)
+       $(MAKE) -C $(WIREGUARD_TOOLS_PATH)/src wg
         $(STRIP) -s $@
  
  $(BUILD_PATH)/init: init.c | $(USERSPACE_DEPS)
@@ -340,17 +330,17 @@ $(BASH_PATH)/bash: | $(BASH_PATH)/.installed $(USERSPACE_DEPS)
  $(IPROUTE2_PATH)/.installed: $(IPROUTE2_TAR)
         mkdir -p $(BUILD_PATH)
         flock -s $<.lock tar -C $(BUILD_PATH) -xf $<
-       printf 'CC:=$(CC)\nPKG_CONFIG:=pkg-config\nTC_CONFIG_XT:=n\nTC_CONFIG_ATM:=n\nTC_CONFIG_IPSET:=n\nIP_CONFIG_SETNS:=y\nHAVE_ELF:=n\nHAVE_MNL:=y\nHAVE_BERKELEY_DB:=n\nHAVE_LATEX:=n\nHAVE_PDFLATEX:=n\nCFLAGS+=-DHAVE_SETNS -DHAVE_LIBMNL -I$(LIBMNL_PATH)/include\nLDLIBS+=-lmnl' > $(IPROUTE2_PATH)/config.mk
+       printf 'CC:=$(CC)\nPKG_CONFIG:=pkg-config\nTC_CONFIG_XT:=n\nTC_CONFIG_ATM:=n\nTC_CONFIG_IPSET:=n\nIP_CONFIG_SETNS:=y\nHAVE_ELF:=n\nHAVE_MNL:=n\nHAVE_BERKELEY_DB:=n\nHAVE_LATEX:=n\nHAVE_PDFLATEX:=n\nCFLAGS+=-DHAVE_SETNS\n' > $(IPROUTE2_PATH)/config.mk
         printf 'lib: snapshot\n\t$$(MAKE) -C lib\nip/ip: lib\n\t$$(MAKE) -C ip ip\nmisc/ss: lib\n\t$$(MAKE) -C misc ss\n' >> $(IPROUTE2_PATH)/Makefile
         touch $@
  
-$(IPROUTE2_PATH)/ip/ip: | $(IPROUTE2_PATH)/.installed $(LIBMNL_PATH)/src/.libs/libmnl.a $(USERSPACE_DEPS)
-       LDFLAGS="$(LDFLAGS) -L$(LIBMNL_PATH)/src/.libs" PKG_CONFIG_LIBDIR="$(LIBMNL_PATH)" $(MAKE) -C $(IPROUTE2_PATH) PREFIX=/ ip/ip
-       $(STRIP) -s $(IPROUTE2_PATH)/ip/ip
+$(IPROUTE2_PATH)/ip/ip: | $(IPROUTE2_PATH)/.installed $(USERSPACE_DEPS)
+       $(MAKE) -C $(IPROUTE2_PATH) PREFIX=/ ip/ip
+       $(STRIP) -s $@
  
-$(IPROUTE2_PATH)/misc/ss: | $(IPROUTE2_PATH)/.installed $(LIBMNL_PATH)/src/.libs/libmnl.a $(USERSPACE_DEPS)
-       LDFLAGS="$(LDFLAGS) -L$(LIBMNL_PATH)/src/.libs" PKG_CONFIG_LIBDIR="$(LIBMNL_PATH)" $(MAKE) -C $(IPROUTE2_PATH) PREFIX=/ misc/ss
-       $(STRIP) -s $(IPROUTE2_PATH)/misc/ss
+$(IPROUTE2_PATH)/misc/ss: | $(IPROUTE2_PATH)/.installed $(USERSPACE_DEPS)
+       $(MAKE) -C $(IPROUTE2_PATH) PREFIX=/ misc/ss
+       $(STRIP) -s $@
  
  $(IPTABLES_PATH)/.installed: $(IPTABLES_TAR)
         mkdir -p $(BUILD_PATH)
@@ -358,8 +348,8 @@ $(IPTABLES_PATH)/.installed: $(IPTABLES_TAR)
         sed -i -e "/nfnetlink=[01]/s:=[01]:=0:" -e "/nfconntrack=[01]/s:=[01]:=0:" $(IPTABLES_PATH)/configure
         touch $@
  
-$(IPTABLES_PATH)/iptables/xtables-legacy-multi: | $(IPTABLES_PATH)/.installed $(LIBMNL_PATH)/src/.libs/libmnl.a $(USERSPACE_DEPS)
-       cd $(IPTABLES_PATH) && PKG_CONFIG_LIBDIR="$(LIBMNL_PATH)" ./configure --prefix=/ $(CROSS_COMPILE_FLAG) --enable-static --disable-shared --disable-nftables --disable-bpf-compiler --disable-nfsynproxy --disable-libipq --with-kernel=$(BUILD_PATH)/include
+$(IPTABLES_PATH)/iptables/xtables-legacy-multi: | $(IPTABLES_PATH)/.installed $(USERSPACE_DEPS)
+       cd $(IPTABLES_PATH) && ./configure --prefix=/ $(CROSS_COMPILE_FLAG) --enable-static --disable-shared --disable-nftables --disable-bpf-compiler --disable-nfsynproxy --disable-libipq --disable-connlabel --with-kernel=$(BUILD_PATH)/include
         $(MAKE) -C $(IPTABLES_PATH)
         $(STRIP) -s $@
  
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c

index d65a0faa46d8977e4ecfa3e36a722bb506ab3924..eda7b624eab8c46250789267a2118c50affab453 100644 (file)
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -742,9 +742,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
                 guest_enter_irqoff();
  
                 if (has_vhe()) {
-                       kvm_arm_vhe_guest_enter();
                         ret = kvm_vcpu_run_vhe(vcpu);
-                       kvm_arm_vhe_guest_exit();
                 } else {
                         ret = kvm_call_hyp_ret(__kvm_vcpu_run_nvhe, vcpu);
                 }
diff --git a/virt/kvm/arm/trace.h b/virt/kvm/arm/trace.h

index 204d210d01c29a3282e16ec7c6ed21d25dc315c3..cc94ccc688217c7fa337e01d74ca1378f31699c1 100644 (file)
--- a/virt/kvm/arm/trace.h
+++ b/virt/kvm/arm/trace.h
@@ -4,6 +4,7 @@
  
  #include <kvm/arm_arch_timer.h>
  #include <linux/tracepoint.h>
+#include <asm/kvm_arm.h>
  
  #undef TRACE_SYSTEM
  #define TRACE_SYSTEM kvm
diff --git a/virt/kvm/arm/vgic/vgic-mmio.c b/virt/kvm/arm/vgic/vgic-mmio.c

index d656ebd5f9d4614deab7668c2c091c8ab644c038..97fb2a40e6ba193efc51fa1cf0c05bb0815db648 100644 (file)
--- a/virt/kvm/arm/vgic/vgic-mmio.c
+++ b/virt/kvm/arm/vgic/vgic-mmio.c
@@ -179,18 +179,6 @@ unsigned long vgic_mmio_read_pending(struct kvm_vcpu *vcpu,
         return value;
  }
  
-/*
- * This function will return the VCPU that performed the MMIO access and
- * trapped from within the VM, and will return NULL if this is a userspace
- * access.
- *
- * We can disable preemption locally around accessing the per-CPU variable,
- * and use the resolved vcpu pointer after enabling preemption again, because
- * even if the current thread is migrated to another CPU, reading the per-CPU
- * value later will give us the same value as we update the per-CPU variable
- * in the preempt notifier handlers.
- */
-
  /* Must be called with irq->irq_lock held */
  static void vgic_hw_irq_spending(struct kvm_vcpu *vcpu, struct vgic_irq *irq,
                                  bool is_uaccess)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c

index 67ae2d5c37b238749aa153485d6bbbe855f8f108..70f03ce0e5c1d547beaa2d206ce6687f292a3b48 100644 (file)
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -4409,12 +4409,22 @@ static void kvm_sched_out(struct preempt_notifier *pn,
  
  /**
   * kvm_get_running_vcpu - get the vcpu running on the current CPU.
- * Thanks to preempt notifiers, this can also be called from
- * preemptible context.
+ *
+ * We can disable preemption locally around accessing the per-CPU variable,
+ * and use the resolved vcpu pointer after enabling preemption again,
+ * because even if the current thread is migrated to another CPU, reading
+ * the per-CPU value later will give us the same value as we update the
+ * per-CPU variable in the preempt notifier handlers.
   */
  struct kvm_vcpu *kvm_get_running_vcpu(void)
  {
-        return __this_cpu_read(kvm_running_vcpu);
+       struct kvm_vcpu *vcpu;
+
+       preempt_disable();
+       vcpu = __this_cpu_read(kvm_running_vcpu);
+       preempt_enable();
+
+       return vcpu;
  }
  
  /**