sfrench/cifs-2.6.git
5 years agophy: ocelot-serdes: fix out-of-bounds read
Gustavo A. R. Silva [Fri, 19 Oct 2018 09:21:38 +0000 (11:21 +0200)]
phy: ocelot-serdes: fix out-of-bounds read

Currently, there is an out-of-bounds read on array ctrl->phys,
once variable i reaches the maximum array size of SERDES_MAX
in the for loop.

Fix this by changing the condition in the for loop from
i <= SERDES_MAX to i < SERDES_MAX.

Addresses-Coverity-ID: 1473966 ("Out-of-bounds read")
Addresses-Coverity-ID: 1473959 ("Out-of-bounds read")
Fixes: 51f6b410fc22 ("phy: add driver for Microsemi Ocelot SerDes muxing")
Reviewed-by: Quentin Schulz <quentin.schulz@bootlin.com>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodt-bindings: phy: Update SERDES_MAX to be SERDES_MAX + 1
Gustavo A. R. Silva [Fri, 19 Oct 2018 09:19:13 +0000 (11:19 +0200)]
dt-bindings: phy: Update SERDES_MAX to be SERDES_MAX + 1

SERDES_MAX is a valid value to index ctrl->phys in
drivers/phy/mscc/phy-ocelot-serdes.c. But, currently,
there is an out-of-bounds bug in the mentioned driver
when reading from ctrl->phys, because the size of
array ctrl->phys is SERDES_MAX.

Partially fix this by updating SERDES_MAX to be SERDES6G_MAX + 1.

Notice that this is the first part of the solution to
the out-of-bounds bug mentioned above. Although this
change is not dependent on any other one.

Suggested-by: Quentin Schulz <quentin.schulz@bootlin.com>
Reviewed-by: Quentin Schulz <quentin.schulz@bootlin.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotipc: use destination length for copy string
Guoqing Jiang [Fri, 19 Oct 2018 04:08:22 +0000 (12:08 +0800)]
tipc: use destination length for copy string

Got below warning with gcc 8.2 compiler.

net/tipc/topsrv.c: In function ‘tipc_topsrv_start’:
net/tipc/topsrv.c:660:2: warning: ‘strncpy’ specified bound depends on the length of the source argument [-Wstringop-overflow=]
  strncpy(srv->name, name, strlen(name) + 1);
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
net/tipc/topsrv.c:660:27: note: length computed here
  strncpy(srv->name, name, strlen(name) + 1);
                           ^~~~~~~~~~~~
So change it to correct length and use strscpy.

Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoisdn: hfc_{pci,sx}: Avoid empty body if statements
Nathan Chancellor [Fri, 19 Oct 2018 01:11:04 +0000 (18:11 -0700)]
isdn: hfc_{pci,sx}: Avoid empty body if statements

Clang warns:

drivers/isdn/hisax/hfc_pci.c:131:34: error: if statement has empty body
[-Werror,-Wempty-body]
        if (Read_hfc(cs, HFCPCI_INT_S1));
                                        ^
drivers/isdn/hisax/hfc_pci.c:131:34: note: put the semicolon on a
separate line to silence this warning

In my attempt to hide the warnings because I thought they didn't serve
any purpose[1], Masahiro Yamada pointed out that {Read,Write}_hfc in
hci_pci.c should be using a standard register access method; otherwise,
the compiler will just remove the if statements.

For hfc_pci, use the versions of {Read,Write}_hfc found in
drivers/isdn/hardware/mISDN/hfc_pCI.h while converting pci_io to be
'void __iomem *' (and clean up ioremap) then remove the empty if
statements.

For hfc_sx, {Read,Write}_hfc are already use a proper register accessor
(inb, outb) so just remove the unnecessary if statements.

[1]: https://lore.kernel.org/lkml/20181016021454.11953-1-natechancellor@gmail.com/

Link: https://github.com/ClangBuiltLinux/linux/issues/66
Suggested-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
David S. Miller [Mon, 22 Oct 2018 04:11:46 +0000 (21:11 -0700)]
Merge git://git./linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2018-10-21

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Implement two new kind of BPF maps, that is, queue and stack
   map along with new peek, push and pop operations, from Mauricio.

2) Add support for MSG_PEEK flag when redirecting into an ingress
   psock sk_msg queue, and add a new helper bpf_msg_push_data() for
   insert data into the message, from John.

3) Allow for BPF programs of type BPF_PROG_TYPE_CGROUP_SKB to use
   direct packet access for __skb_buff, from Song.

4) Use more lightweight barriers for walking perf ring buffer for
   libbpf and perf tool as well. Also, various fixes and improvements
   from verifier side, from Daniel.

5) Add per-symbol visibility for DSO in libbpf and hide by default
   global symbols such as netlink related functions, from Andrey.

6) Two improvements to nfp's BPF offload to check vNIC capabilities
   in case prog is shared with multiple vNICs and to protect against
   mis-initializing atomic counters, from Jakub.

7) Fix for bpftool to use 4 context mode for the nfp disassembler,
   also from Jakub.

8) Fix a return value comparison in test_libbpf.sh and add several
   bpftool improvements in bash completion, documentation of bpf fs
   restrictions and batch mode summary print, from Quentin.

9) Fix a file resource leak in BPF selftest's load_kallsyms()
   helper, from Peng.

10) Fix an unused variable warning in map_lookup_and_delete_elem(),
    from Alexei.

11) Fix bpf_skb_adjust_room() signature in BPF UAPI helper doc,
    from Nicolas.

12) Add missing executables to .gitignore in BPF selftests, from Anders.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'net-simplify-getting-driver_data'
David S. Miller [Mon, 22 Oct 2018 04:10:12 +0000 (21:10 -0700)]
Merge branch 'net-simplify-getting-driver_data'

Wolfram Sang says:

====================
net: simplify getting .driver_data

I got tired of fixing this in Renesas drivers manually, so I took the big
hammer. Remove this cumbersome code pattern which got copy-pasted too much
already:

- struct platform_device *pdev = to_platform_device(dev);
- struct ep93xx_keypad *keypad = platform_get_drvdata(pdev);
+ struct ep93xx_keypad *keypad = dev_get_drvdata(dev);

A branch, tested by buildbot, can be found here:

git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux.git coccinelle/get_drvdata

I have been asked if it couldn't be done for dev_set_drvdata as well. I checked
it and did not find one occasion where it could be simplified like this. Not
much of a surprise because driver_data is usually set in probe() functions
which access struct platform_device in many other ways.

I am open for other comments, suggestions, too, of course.

Here is the cocci-script I created:

@@
struct device* d;
identifier pdev;
expression *ptr;
@@
(
- struct platform_device *pdev = to_platform_device(d);
|
- struct platform_device *pdev;
...
- pdev = to_platform_device(d);
)
<... when != pdev
- &pdev->dev
+ d
...>

ptr =
- platform_get_drvdata(pdev)
+ dev_get_drvdata(d)

<... when != pdev
- &pdev->dev
+ d
...>
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: mdio-mux-bcm-iproc: simplify getting .driver_data
Wolfram Sang [Sun, 21 Oct 2018 20:00:20 +0000 (22:00 +0200)]
net: phy: mdio-mux-bcm-iproc: simplify getting .driver_data

We should get 'driver_data' from 'struct device' directly. Going via
platform_device is an unneeded step back and forth.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ethernet: wiznet: w5300: simplify getting .driver_data
Wolfram Sang [Sun, 21 Oct 2018 20:00:19 +0000 (22:00 +0200)]
net: ethernet: wiznet: w5300: simplify getting .driver_data

We should get 'driver_data' from 'struct device' directly. Going via
platform_device is an unneeded step back and forth.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ethernet: ti: davinci_emac: simplify getting .driver_data
Wolfram Sang [Sun, 21 Oct 2018 20:00:18 +0000 (22:00 +0200)]
net: ethernet: ti: davinci_emac: simplify getting .driver_data

We should get 'driver_data' from 'struct device' directly. Going via
platform_device is an unneeded step back and forth.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ethernet: ti: cpsw: simplify getting .driver_data
Wolfram Sang [Sun, 21 Oct 2018 20:00:17 +0000 (22:00 +0200)]
net: ethernet: ti: cpsw: simplify getting .driver_data

We should get 'driver_data' from 'struct device' directly. Going via
platform_device is an unneeded step back and forth.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ethernet: smsc: smc91x: simplify getting .driver_data
Wolfram Sang [Sun, 21 Oct 2018 20:00:16 +0000 (22:00 +0200)]
net: ethernet: smsc: smc91x: simplify getting .driver_data

We should get 'driver_data' from 'struct device' directly. Going via
platform_device is an unneeded step back and forth.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ethernet: davicom: dm9000: simplify getting .driver_data
Wolfram Sang [Sun, 21 Oct 2018 20:00:15 +0000 (22:00 +0200)]
net: ethernet: davicom: dm9000: simplify getting .driver_data

We should get 'driver_data' from 'struct device' directly. Going via
platform_device is an unneeded step back and forth.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ethernet: cadence: macb_main: simplify getting .driver_data
Wolfram Sang [Sun, 21 Oct 2018 20:00:14 +0000 (22:00 +0200)]
net: ethernet: cadence: macb_main: simplify getting .driver_data

We should get 'driver_data' from 'struct device' directly. Going via
platform_device is an unneeded step back and forth.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: qca8k: simplify getting .driver_data
Wolfram Sang [Sun, 21 Oct 2018 20:00:13 +0000 (22:00 +0200)]
net: dsa: qca8k: simplify getting .driver_data

We should get 'driver_data' from 'struct device' directly. Going via
platform_device is an unneeded step back and forth.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: dsa: bcm_sf2: simplify getting .driver_data
Wolfram Sang [Sun, 21 Oct 2018 20:00:12 +0000 (22:00 +0200)]
net: dsa: bcm_sf2: simplify getting .driver_data

We should get 'driver_data' from 'struct device' directly. Going via
platform_device is an unneeded step back and forth.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
David S. Miller [Sun, 21 Oct 2018 18:54:28 +0000 (11:54 -0700)]
Merge git://git./linux/kernel/git/davem/net

David Ahern's dump indexing bug fix in 'net' overlapped the
change of the function signature of inet6_fill_ifaddr() in
'net-next'.  Trivially resolved.

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotools: bpftool: fix completion for "bpftool map update"
Quentin Monnet [Sat, 20 Oct 2018 22:01:50 +0000 (23:01 +0100)]
tools: bpftool: fix completion for "bpftool map update"

When trying to complete "bpftool map update" commands, the call to
printf would print an error message that would show on the command line
if no map is found to complete the command line.

Fix it by making sure we have map ids to complete the line with, before
we try to print something.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
5 years agotools: bpftool: print nb of cmds to stdout (not stderr) for batch mode
Quentin Monnet [Sat, 20 Oct 2018 22:01:49 +0000 (23:01 +0100)]
tools: bpftool: print nb of cmds to stdout (not stderr) for batch mode

When batch mode is used and all commands succeeds, bpftool prints the
number of commands processed to stderr. There is no particular reason to
use stderr for this, we could as well use stdout. It would avoid getting
unnecessary output on stderr if the standard ouptut is redirected, for
example.

Reported-by: David Beckett <david.beckett@netronome.com>
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
5 years agotools: bpftool: document restriction on '.' in names to pin in bpffs
Quentin Monnet [Sat, 20 Oct 2018 22:01:48 +0000 (23:01 +0100)]
tools: bpftool: document restriction on '.' in names to pin in bpffs

Names used to pin eBPF programs and maps under the eBPF virtual file
system cannot contain a dot character, which is reserved for future
extensions of this file system.

Document this in bpftool man pages to avoid users getting confused if
pinning fails because of a dot.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
5 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Greg Kroah-Hartman [Sun, 21 Oct 2018 08:08:38 +0000 (10:08 +0200)]
Merge git://git./linux/kernel/git/davem/net

David writes:
  "Networking:

   A few straggler bug fixes:

   1) Fix indexing of multi-pass dumps of ipv6 addresses, from David
      Ahern.

   2) Revert RCU locking change for bonding netpoll, causes worse
      problems than it solves.

   3) pskb_trim_rcsum_slow() doesn't handle odd trim offsets, resulting
      in erroneous bad hw checksum triggers with CHECKSUM_COMPLETE
      devices.  From Dimitris Michailidis.

   4) a revert to some neighbour code changes that adjust notifications
      in a way that confuses some apps."

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  Revert "neighbour: force neigh_invalidate when NUD_FAILED update is from admin"
  net/ipv6: Fix index counter for unicast addresses in in6_dump_addrs
  net: fix pskb_trim_rcsum_slow() with odd trim offset
  Revert "bond: take rcu lock in netpoll_send_skb_on_dev"

5 years agoselftests/bpf: fix return value comparison for tests in test_libbpf.sh
Quentin Monnet [Sat, 20 Oct 2018 21:58:44 +0000 (22:58 +0100)]
selftests/bpf: fix return value comparison for tests in test_libbpf.sh

The return value for each test in test_libbpf.sh is compared with

    if (( $? == 0 )) ; then ...

This works well with bash, but not with dash, that /bin/sh is aliased to
on some systems (such as Ubuntu).

Let's replace this comparison by something that works on both shells.

Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agoMerge branch 'misc-improvements'
Alexei Starovoitov [Sun, 21 Oct 2018 06:13:33 +0000 (23:13 -0700)]
Merge branch 'misc-improvements'

Daniel Borkmann says:

====================
Last batch of misc patches I had in queue: first one removes some left-over
bits from ULP, second is a fix in the verifier where we wrongly use register
number as type to fetch the string for the dump, third disables xadd on flow
keys and subsequent one removes the flow key type from check_helper_mem_access()
as they cannot be passed into any helper as of today. Next one lets map push,
pop, peek avoid having to go through retpoline, and last one has a couple of
minor fixes and cleanups for the ring buffer walk.
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agobpf, libbpf: simplify and cleanup perf ring buffer walk
Daniel Borkmann [Sun, 21 Oct 2018 00:09:28 +0000 (02:09 +0200)]
bpf, libbpf: simplify and cleanup perf ring buffer walk

Simplify bpf_perf_event_read_simple() a bit and fix up some minor
things along the way: the return code in the header is not of type
int but enum bpf_perf_event_ret instead. Once callback indicated
to break the loop walking event data, it also needs to be consumed
in data_tail since it has been processed already.

Moreover, bpf_perf_event_print_t callback should avoid void * as
we actually get a pointer to struct perf_event_header and thus
applications can make use of container_of() to have type checks.
The walk also doesn't have to use modulo op since the ring size is
required to be power of two.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agobpf, verifier: avoid retpoline for map push/pop/peek operation
Daniel Borkmann [Sun, 21 Oct 2018 00:09:27 +0000 (02:09 +0200)]
bpf, verifier: avoid retpoline for map push/pop/peek operation

Extend prior work from 09772d92cd5a ("bpf: avoid retpoline for
lookup/update/delete calls on maps") to also apply to the recently
added map helpers that perform push/pop/peek operations so that
the indirect call can be avoided.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agobpf, verifier: remove unneeded flow key in check_helper_mem_access
Daniel Borkmann [Sun, 21 Oct 2018 00:09:26 +0000 (02:09 +0200)]
bpf, verifier: remove unneeded flow key in check_helper_mem_access

They PTR_TO_FLOW_KEYS is not used today to be passed into a helper
as memory, so it can be removed from check_helper_mem_access().

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agobpf, verifier: reject xadd on flow key memory
Daniel Borkmann [Sun, 21 Oct 2018 00:09:25 +0000 (02:09 +0200)]
bpf, verifier: reject xadd on flow key memory

We should not enable xadd operation for flow key memory if not
needed there anyway. There is no such issue as described in the
commit f37a8cb84cce ("bpf: reject stores into ctx via st and xadd")
since there's no context rewriter for flow keys today, but it
also shouldn't become part of the user facing behavior to allow
for it. After patch:

  0: (79) r7 = *(u64 *)(r1 +144)
  1: (b7) r3 = 4096
  2: (db) lock *(u64 *)(r7 +0) += r3
  BPF_XADD stores into R7 flow_keys is not allowed

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agobpf, verifier: fix register type dump in xadd and st
Daniel Borkmann [Sun, 21 Oct 2018 00:09:24 +0000 (02:09 +0200)]
bpf, verifier: fix register type dump in xadd and st

Using reg_type_str[insn->dst_reg] is incorrect since insn->dst_reg
contains the register number but not the actual register type. Add
a small reg_state() helper and use it to get to the type. Also fix
up the test_verifier test cases that have an incorrect errstr.

Fixes: 9d2be44a7f33 ("bpf: Reuse canonical string formatter for ctx errs")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agoulp: remove uid and user_visible members
Daniel Borkmann [Sun, 21 Oct 2018 00:09:23 +0000 (02:09 +0200)]
ulp: remove uid and user_visible members

They are not used anymore and therefore should be removed.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agoRevert "neighbour: force neigh_invalidate when NUD_FAILED update is from admin"
Roopa Prabhu [Sun, 21 Oct 2018 01:09:31 +0000 (18:09 -0700)]
Revert "neighbour: force neigh_invalidate when NUD_FAILED update is from admin"

This reverts commit 8e326289e3069dfc9fa9c209924668dd031ab8ef.

This patch results in unnecessary netlink notification when one
tries to delete a neigh entry already in NUD_FAILED state. Found
this with a buggy app that tries to delete a NUD_FAILED entry
repeatedly. While the notification issue can be fixed with more
checks, adding more complexity here seems unnecessary. Also,
recent tests with other changes in the neighbour code have
shown that the INCOMPLETE and PROBE checks are good enough for
the original issue.

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/ipv6: Fix index counter for unicast addresses in in6_dump_addrs
David Ahern [Fri, 19 Oct 2018 17:00:19 +0000 (10:00 -0700)]
net/ipv6: Fix index counter for unicast addresses in in6_dump_addrs

The loop wants to skip previously dumped addresses, so loops until
current index >= saved index. If the message fills it wants to save
the index for the next address to dump - ie., the one that did not
fit in the current message.

Currently, it is incrementing the index counter before comparing to the
saved index, and then the saved index is off by 1 - it assumes the
current address is going to fit in the message.

Change the index handling to increment only after a succesful dump.

Fixes: 502a2ffd7376a ("ipv6: convert idev_list to list macros")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'bpf-msg-push-data'
Daniel Borkmann [Sat, 20 Oct 2018 19:37:12 +0000 (21:37 +0200)]
Merge branch 'bpf-msg-push-data'

John Fastabend says:

====================
This series adds a new helper bpf_msg_push_data to be used by
sk_msg programs. The helper can be used to insert extra bytes into
the message that can then be used by the program as metadata tags
among other things.

The first patch adds the helper, second patch the libbpf support,
and last patch updates test_sockmap to run msg_push_data tests.

v2: rebase after queue map and in filter.c convert int -> u32
====================

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
5 years agobpf: test_sockmap add options to use msg_push_data
John Fastabend [Sat, 20 Oct 2018 02:56:51 +0000 (19:56 -0700)]
bpf: test_sockmap add options to use msg_push_data

Add options to run msg_push_data, this patch creates two more flags
in test_sockmap that can be used to specify the offset and length
of bytes to be added. The new options are --txmsg_start_push to
specify where bytes should be inserted and --txmsg_end_push to
specify how many bytes. This is analagous to the options that are
used to pull data, --txmsg_start and --txmsg_end.

In addition to adding the options tests are added to the test
suit to run the tests similar to what was done for msg_pull_data.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
5 years agobpf: libbpf support for msg_push_data
John Fastabend [Sat, 20 Oct 2018 02:56:50 +0000 (19:56 -0700)]
bpf: libbpf support for msg_push_data

Add support for new bpf_msg_push_data in libbpf.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
5 years agobpf: sk_msg program helper bpf_msg_push_data
John Fastabend [Sat, 20 Oct 2018 02:56:49 +0000 (19:56 -0700)]
bpf: sk_msg program helper bpf_msg_push_data

This allows user to push data into a msg using sk_msg program types.
The format is as follows,

bpf_msg_push_data(msg, offset, len, flags)

this will insert 'len' bytes at offset 'offset'. For example to
prepend 10 bytes at the front of the message the user can,

bpf_msg_push_data(msg, 0, 10, 0);

This will invalidate data bounds so BPF user will have to then recheck
data bounds after calling this. After this the msg size will have been
updated and the user is free to write into the added bytes. We allow
any offset/len as long as it is within the (data, data_end) range.
However, a copy will be required if the ring is full and its possible
for the helper to fail with ENOMEM or EINVAL errors which need to be
handled by the BPF program.

This can be used similar to XDP metadata to pass data between sk_msg
layer and lower layers.

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
5 years agor8169: add support for Byte Queue Limits
Florian Westphal [Sat, 20 Oct 2018 10:25:27 +0000 (12:25 +0200)]
r8169: add support for Byte Queue Limits

This patch is basically a resubmit of 1e918876853a ("r8169: add support
for Byte Queue Limits") which was reverted later. The problems causing
the revert seem to have been fixed in the meantime.
Only change to the original patch is that the call to
netdev_reset_queue was moved to rtl8169_tx_clear.

The Tested-by refers to a system using the RTL8168evl chip version.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agor8169: handle all interrupt events in the hard irq handler
Heiner Kallweit [Thu, 18 Oct 2018 20:19:28 +0000 (22:19 +0200)]
r8169: handle all interrupt events in the hard irq handler

Having a separate "slow event" handler isn't needed because all
interrupt events trigger asynchronous activity. And in case of SYSErr
we have bigger problems than performance anyway.
This patch also allows to get rid of acking interrupt events in the
NAPI poll callback.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetoot...
David S. Miller [Sat, 20 Oct 2018 19:33:48 +0000 (12:33 -0700)]
Merge branch 'for-upstream' of git://git./linux/kernel/git/bluetooth/bluetooth-next

Johan Hedberg says:

====================
pull request: bluetooth-next 2018-10-20

Here's one more bluetooth-next pull request for the 4.20 kernel.

 - Added new USB ID for QCA_ROME controller
 - Added debug trace support from QCA wcn3990 controllers
 - Updated L2CAP to conform to latest Errata Service Release
 - Fix binding to non-removable BCM43430 devices

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
David S. Miller [Sat, 20 Oct 2018 19:32:44 +0000 (12:32 -0700)]
Merge git://git./linux/kernel/git/pablo/nf-next

Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for your net-next tree:

1) Use lockdep_is_held() in ipset_dereference_protected(), from Lance Roy.

2) Remove unused variable in cttimeout, from YueHaibing.

3) Add ttl option for nft_osf, from Fernando Fernandez Mancera.

4) Use xfrm family to deal with IPv6-in-IPv4 packets from nft_xfrm,
   from Florian Westphal.

5) Simplify xt_osf_match_packet().

6) Missing ct helper alias definition in snmp_trap helper, from Taehee Yoo.

7) Remove unnecessary parameter in nf_flow_table_cleanup(), from Taehee Yoo.

8) Remove unused variable definitions in nft_{dup,fwd}, from Weongyo Jeong.

9) Remove empty net/netfilter/nfnetlink_log.h file, from Taehee Yoo.

10) Revert xt_quota updates remain option due to problems in the listing
    path for 32-bit arches, from Maze.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Greg Kroah-Hartman [Sat, 20 Oct 2018 13:04:23 +0000 (15:04 +0200)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Ingo writes:
  "x86 fixes:

   It's 4 misc fixes, 3 build warning fixes and 3 comment fixes.

   In hindsight I'd have left out the 3 comment fixes to make the pull
   request look less scary at such a late point in the cycle. :-/"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/swiotlb: Enable swiotlb for > 4GiG RAM on 32-bit kernels
  x86/fpu: Fix i486 + no387 boot crash by only saving FPU registers on context switch if there is an FPU
  x86/fpu: Remove second definition of fpu in __fpu__restore_sig()
  x86/entry/64: Further improve paranoid_entry comments
  x86/entry/32: Clear the CS high bits
  x86/boot: Add -Wno-pointer-sign to KBUILD_CFLAGS
  x86/time: Correct the attribute on jiffies' definition
  x86/entry: Add some paranoid entry/exit CR3 handling comments
  x86/percpu: Fix this_cpu_read()
  x86/tsc: Force inlining of cyc2ns bits

5 years agoMerge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Greg Kroah-Hartman [Sat, 20 Oct 2018 13:03:45 +0000 (15:03 +0200)]
Merge branch 'sched-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Ingo writes:
  "scheduler fixes:

   Two fixes: a CFS-throttling bug fix, and an interactivity fix."

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/fair: Fix the min_vruntime update logic in dequeue_entity()
  sched/fair: Fix throttle_list starvation with low CFS quota

5 years agoMerge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Greg Kroah-Hartman [Sat, 20 Oct 2018 13:02:51 +0000 (15:02 +0200)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Ingo writes:
  "perf fixes:

   Misc perf tooling fixes."

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf tools: Stop fallbacking to kallsyms for vdso symbols lookup
  perf tools: Pass build flags to traceevent build
  perf report: Don't crash on invalid inline debug information
  perf cpu_map: Align cpu map synthesized events properly.
  perf tools: Fix tracing_path_mount proper path
  perf tools: Fix use of alternatives to find JDIR
  perf evsel: Store ids for events with their own cpus perf_event__synthesize_event_update_cpus
  perf vendor events intel: Fix wrong filter_band* values for uncore events
  Revert "perf tools: Fix PMU term format max value calculation"
  tools headers uapi: Sync kvm.h copy
  tools arch uapi: Sync the x86 kvm.h copy

5 years agonet: fix pskb_trim_rcsum_slow() with odd trim offset
Dimitris Michailidis [Sat, 20 Oct 2018 00:07:13 +0000 (17:07 -0700)]
net: fix pskb_trim_rcsum_slow() with odd trim offset

We've been getting checksum errors involving small UDP packets, usually
59B packets with 1 extra non-zero padding byte. netdev_rx_csum_fault()
has been complaining that HW is providing bad checksums. Turns out the
problem is in pskb_trim_rcsum_slow(), introduced in commit 88078d98d1bb
("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends").

The source of the problem is that when the bytes we are trimming start
at an odd address, as in the case of the 1 padding byte above,
skb_checksum() returns a byte-swapped value. We cannot just combine this
with skb->csum using csum_sub(). We need to use csum_block_sub() here
that takes into account the parity of the start address and handles the
swapping.

Matches existing code in __skb_postpull_rcsum() and esp_remove_trailer().

Fixes: 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends")
Signed-off-by: Dimitris Michailidis <dmichail@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: loopback: clear skb->tstamp before netif_rx()
Eric Dumazet [Sat, 20 Oct 2018 02:11:26 +0000 (19:11 -0700)]
net: loopback: clear skb->tstamp before netif_rx()

At least UDP / TCP stacks can now cook skbs with a tstamp using
MONOTONIC base (or arbitrary values with SCM_TXTIME)

Since loopback driver does not call (directly or indirectly)
skb_scrub_packet(), we need to clear skb->tstamp so that
net_timestamp_check() can eventually resample the time,
using ktime_get_real().

Fixes: 80b14dee2bea ("net: Add a new socket option for a future transmit time.")
Fixes: fb420d5d91c1 ("tcp/fq: move back to CLOCK_MONOTONIC")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge tag 'drm-fixes-2018-10-20-1' of git://anongit.freedesktop.org/drm/drm
Greg Kroah-Hartman [Sat, 20 Oct 2018 07:23:12 +0000 (09:23 +0200)]
Merge tag 'drm-fixes-2018-10-20-1' of git://anongit.freedesktop.org/drm/drm

Dave writes:
  "drm fixes for 4.19 final (part 2)

   Looked like two stragglers snuck in, one very urgent the pageflipping
   was missing a reference that could result in a GPF on non-i915
   drivers, the other is an overflow in the sun4i dotclock calcs
   resulting in a mode not getting set."

* tag 'drm-fixes-2018-10-20-1' of git://anongit.freedesktop.org/drm/drm:
  drm/sun4i: Fix an ulong overflow in the dotclock driver
  drm: Get ref on CRTC commit object when waiting for flip_done

5 years agoMerge tag 'trace-v4.19-rc8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rosted...
Greg Kroah-Hartman [Sat, 20 Oct 2018 07:20:48 +0000 (09:20 +0200)]
Merge tag 'trace-v4.19-rc8-2' of git://git./linux/kernel/git/rostedt/linux-trace

Steven writes:
  "tracing: A few small fixes to synthetic events

   Masami found some issues with the creation of synthetic events.  The
   first two patches fix handling of unsigned type, and handling of a
   space before an ending semi-colon.

   The third patch adds a selftest to test the processing of synthetic
   events."

* tag 'trace-v4.19-rc8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  selftests: ftrace: Add synthetic event syntax testcase
  tracing: Fix synthetic event to allow semicolon at end
  tracing: Fix synthetic event to accept unsigned modifier

5 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Greg Kroah-Hartman [Sat, 20 Oct 2018 06:42:56 +0000 (08:42 +0200)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input

Dmitry writes:
  "Input updates for 4.19-rc8

   Just an addition to elan touchpad driver ACPI table."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: elan_i2c - add ACPI ID for Lenovo IdeaPad 330-15IGM

5 years agoMerge tag 'drm-misc-fixes-2018-10-19' of git://anongit.freedesktop.org/drm/drm-misc...
Dave Airlie [Fri, 19 Oct 2018 21:18:12 +0000 (07:18 +1000)]
Merge tag 'drm-misc-fixes-2018-10-19' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

Second pull request for v4.19:
- Fix ulong overflow in sun4i
- Fix a serious GPF in waiting for flip_done from commit_tail().

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/97d1ed42-1d99-fcc5-291e-cd1dc29a4252@linux.intel.com
5 years agonet: ethernet: lpc_eth: add device and device node local variables
Vladimir Zapolskiy [Thu, 18 Oct 2018 23:25:11 +0000 (02:25 +0300)]
net: ethernet: lpc_eth: add device and device node local variables

Trivial non-functional change added to simplify getting multiple
references to device pointer in lpc_eth_drv_probe().

Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ethernet: lpc_eth: remove unused local variable
Vladimir Zapolskiy [Thu, 18 Oct 2018 23:06:53 +0000 (02:06 +0300)]
net: ethernet: lpc_eth: remove unused local variable

A trivial change which removes an unused local variable, the issue
is reported as a compile time warning:

  drivers/net/ethernet/nxp/lpc_eth.c: In function 'lpc_eth_drv_probe':
  drivers/net/ethernet/nxp/lpc_eth.c:1250:21: warning: variable 'phydev' set but not used [-Wunused-but-set-variable]
    struct phy_device *phydev;
                       ^~~~~~

Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ethernet: lpc_eth: remove CONFIG_OF guard from the driver
Vladimir Zapolskiy [Thu, 18 Oct 2018 22:58:41 +0000 (01:58 +0300)]
net: ethernet: lpc_eth: remove CONFIG_OF guard from the driver

The MAC controller device is available on NXP LPC32xx platform only,
and the LPC32xx platform supports OF builds only, so additional
checks in the device driver are not needed.

Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ethernet: lpc_eth: clean up the list of included headers
Vladimir Zapolskiy [Thu, 18 Oct 2018 22:53:25 +0000 (01:53 +0300)]
net: ethernet: lpc_eth: clean up the list of included headers

The change removes all unnecessary included headers from the driver
source code, the remaining list is sorted in alphabetical order.

Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'Microchip-Technology-KSZ9131'
David S. Miller [Sat, 20 Oct 2018 00:02:24 +0000 (17:02 -0700)]
Merge branch 'Microchip-Technology-KSZ9131'

Yuiko Oshino says:

====================
Add support for Microchip Technology KSZ9131 10/100/1000 Ethernet PHY

This is the initial driver for Microchip KSZ9131 10/100/1000 Ethernet PHY

v3:
- KSZ9131 uses picosecond units for values of devicetree properties.
- rewrite micrel.c and micrel-ksz90x1.txt to use the picosecond values.
v2:
- Creating a series from two related patches.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodt-bindings: net: add support for Microchip KSZ9131
Yuiko Oshino [Thu, 18 Oct 2018 19:06:02 +0000 (15:06 -0400)]
dt-bindings: net: add support for Microchip KSZ9131

Add support for Microchip Technology KSZ9131 10/100/1000 Ethernet PHY

Signed-off-by: Yuiko Oshino <yuiko.oshino@microchip.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: micrel: add Microchip KSZ9131 initial driver
Yuiko Oshino [Thu, 18 Oct 2018 19:06:01 +0000 (15:06 -0400)]
net: phy: micrel: add Microchip KSZ9131 initial driver

Add support for Microchip Technology KSZ9131 10/100/1000 Ethernet PHY

Signed-off-by: Yuiko Oshino <yuiko.oshino@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonetpoll: allow cleanup to be synchronous
Debabrata Banerjee [Thu, 18 Oct 2018 15:18:26 +0000 (11:18 -0400)]
netpoll: allow cleanup to be synchronous

This fixes a problem introduced by:
commit 2cde6acd49da ("netpoll: Fix __netpoll_rcu_free so that it can hold the rtnl lock")

When using netconsole on a bond, __netpoll_cleanup can asynchronously
recurse multiple times, each __netpoll_free_async call can result in
more __netpoll_free_async's. This means there is now a race between
cleanup_work queues on multiple netpoll_info's on multiple devices and
the configuration of a new netpoll. For example if a netconsole is set
to enable 0, reconfigured, and enable 1 immediately, this netconsole
will likely not work.

Given the reason for __netpoll_free_async is it can be called when rtnl
is not locked, if it is locked, we should be able to execute
synchronously. It appears to be locked everywhere it's called from.

Generalize the design pattern from the teaming driver for current
callers of __netpoll_free_async.

CC: Neil Horman <nhorman@tuxdriver.com>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Debabrata Banerjee <dbanerje@akamai.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agobpf: skmsg, fix psock create on existing kcm/tls port
John Fastabend [Thu, 18 Oct 2018 20:58:35 +0000 (13:58 -0700)]
bpf: skmsg, fix psock create on existing kcm/tls port

Before using the psock returned by sk_psock_get() when adding it to a
sockmap we need to ensure it is actually a sockmap based psock.
Previously we were only checking this after incrementing the reference
counter which was an error. This resulted in a slab-out-of-bounds
error when the psock was not actually a sockmap type.

This moves the check up so the reference counter is only used
if it is a sockmap psock.

Eric reported the following KASAN BUG,

BUG: KASAN: slab-out-of-bounds in atomic_read include/asm-generic/atomic-instrumented.h:21 [inline]
BUG: KASAN: slab-out-of-bounds in refcount_inc_not_zero_checked+0x97/0x2f0 lib/refcount.c:120
Read of size 4 at addr ffff88019548be58 by task syz-executor4/22387

CPU: 1 PID: 22387 Comm: syz-executor4 Not tainted 4.19.0-rc7+ #264
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1c4/0x2b4 lib/dump_stack.c:113
 print_address_description.cold.8+0x9/0x1ff mm/kasan/report.c:256
 kasan_report_error mm/kasan/report.c:354 [inline]
 kasan_report.cold.9+0x242/0x309 mm/kasan/report.c:412
 check_memory_region_inline mm/kasan/kasan.c:260 [inline]
 check_memory_region+0x13e/0x1b0 mm/kasan/kasan.c:267
 kasan_check_read+0x11/0x20 mm/kasan/kasan.c:272
 atomic_read include/asm-generic/atomic-instrumented.h:21 [inline]
 refcount_inc_not_zero_checked+0x97/0x2f0 lib/refcount.c:120
 sk_psock_get include/linux/skmsg.h:379 [inline]
 sock_map_link.isra.6+0x41f/0xe30 net/core/sock_map.c:178
 sock_hash_update_common+0x19b/0x11e0 net/core/sock_map.c:669
 sock_hash_update_elem+0x306/0x470 net/core/sock_map.c:738
 map_update_elem+0x819/0xdf0 kernel/bpf/syscall.c:818

Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
5 years agoselftests: ftrace: Add synthetic event syntax testcase
Masami Hiramatsu [Thu, 18 Oct 2018 13:13:02 +0000 (22:13 +0900)]
selftests: ftrace: Add synthetic event syntax testcase

Add a testcase to check the syntax and field types for
synthetic_events interface.

Link: http://lkml.kernel.org/r/153986838264.18251.16627517536956299922.stgit@devbox
Acked-by: Shuah Khan <shuah@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
5 years agotracing: Fix synthetic event to allow semicolon at end
Masami Hiramatsu [Thu, 18 Oct 2018 13:12:34 +0000 (22:12 +0900)]
tracing: Fix synthetic event to allow semicolon at end

Fix synthetic event to allow independent semicolon at end.

The synthetic_events interface accepts a semicolon after the
last word if there is no space.

 # echo "myevent u64 var;" >> synthetic_events

But if there is a space, it returns an error.

 # echo "myevent u64 var ;" > synthetic_events
 sh: write error: Invalid argument

This behavior is difficult for users to understand. Let's
allow the last independent semicolon too.

Link: http://lkml.kernel.org/r/153986835420.18251.2191216690677025744.stgit@devbox
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
Cc: stable@vger.kernel.org
Fixes: commit 4b147936fa50 ("tracing: Add support for 'synthetic' events")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
5 years agotracing: Fix synthetic event to accept unsigned modifier
Masami Hiramatsu [Thu, 18 Oct 2018 13:12:05 +0000 (22:12 +0900)]
tracing: Fix synthetic event to accept unsigned modifier

Fix synthetic event to accept unsigned modifier for its field type
correctly.

Currently, synthetic_events interface returns error for "unsigned"
modifiers as below;

 # echo "myevent unsigned long var" >> synthetic_events
 sh: write error: Invalid argument

This is because argv_split() breaks "unsigned long" into "unsigned"
and "long", but parse_synth_field() doesn't expected it.

With this fix, synthetic_events can handle the "unsigned long"
correctly like as below;

 # echo "myevent unsigned long var" >> synthetic_events
 # cat synthetic_events
 myevent unsigned long var

Link: http://lkml.kernel.org/r/153986832571.18251.8448135724590496531.stgit@devbox
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
Cc: stable@vger.kernel.org
Fixes: commit 4b147936fa50 ("tracing: Add support for 'synthetic' events")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
5 years agobpf: remove unused variable
Alexei Starovoitov [Fri, 19 Oct 2018 20:52:38 +0000 (13:52 -0700)]
bpf: remove unused variable

fix the following warning
../kernel/bpf/syscall.c: In function ‘map_lookup_and_delete_elem’:
../kernel/bpf/syscall.c:1010:22: warning: unused variable ‘ptr’ [-Wunused-variable]
  void *key, *value, *ptr;
                      ^~~

Fixes: bd513cd08f10 ("bpf: add MAP_LOOKUP_AND_DELETE_ELEM syscall")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agoMerge branch 'cg_skb_direct_pkt_access'
Alexei Starovoitov [Fri, 19 Oct 2018 20:49:35 +0000 (13:49 -0700)]
Merge branch 'cg_skb_direct_pkt_access'

Song Liu says:

====================
Changes v7 -> v8:
1. Dynamically allocate the dummy sk to avoid race conditions.

Changes v6 -> v7:
1. Make dummy sk a global variable (test_run_sk).

Changes v5 -> v6:
1. Fixed dummy sk in bpf_prog_test_run_skb() as suggested by Eric Dumazet.

Changes v4 -> v5:
1. Replaced bpf_compute_and_save_data_pointers() with
   bpf_compute_and_save_data_end();
   Replaced bpf_restore_data_pointers() with bpf_restore_data_end().
2. Fixed indentation in test_verifier.c

Changes v3 -> v4:
1. Fixed crash issue reported by Alexei.

Changes v2 -> v3:
1. Added helper function bpf_compute_and_save_data_pointers() and
   bpf_restore_data_pointers().

Changes v1 -> v2:
1. Updated the list of read-only fields, and read-write fields.
2. Added dummy sk to bpf_prog_test_run_skb().

This set enables BPF program of type BPF_PROG_TYPE_CGROUP_SKB to access
some __skb_buff data directly.
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agobpf: add tests for direct packet access from CGROUP_SKB
Song Liu [Fri, 19 Oct 2018 16:57:58 +0000 (09:57 -0700)]
bpf: add tests for direct packet access from CGROUP_SKB

Tests are added to make sure CGROUP_SKB cannot access:
  tc_classid, data_meta, flow_keys

and can read and write:
  mark, prority, and cb[0-4]

and can read other fields.

To make selftest with skb->sk work, a dummy sk is added in
bpf_prog_test_run_skb().

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agobpf: add cg_skb_is_valid_access for BPF_PROG_TYPE_CGROUP_SKB
Song Liu [Fri, 19 Oct 2018 16:57:57 +0000 (09:57 -0700)]
bpf: add cg_skb_is_valid_access for BPF_PROG_TYPE_CGROUP_SKB

BPF programs of BPF_PROG_TYPE_CGROUP_SKB need to access headers in the
skb. This patch enables direct access of skb for these programs.

Two helper functions bpf_compute_and_save_data_end() and
bpf_restore_data_end() are introduced. There are used in
__cgroup_bpf_run_filter_skb(), to compute proper data_end for the
BPF program, and restore original data afterwards.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agoMerge branch 'improve_perf_barriers'
Alexei Starovoitov [Fri, 19 Oct 2018 20:43:09 +0000 (13:43 -0700)]
Merge branch 'improve_perf_barriers'

Daniel Borkmann says:

====================
This set first adds smp_* barrier variants to tools infrastructure
and updates perf and libbpf to make use of them. For details, please
see individual patches, thanks!

Arnaldo, if there are no objections, could this be routed via bpf-next
with Acked-by's due to later dependencies in libbpf? Alternatively,
I could also get the 2nd patch out during merge window, but perhaps
it's okay to do in one go as there shouldn't be much conflict in perf
itself.

Thanks!

v1 -> v2:
  - add common helper and switch to acquire/release variants
    when possible, thanks Peter!
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agobpf, libbpf: use correct barriers in perf ring buffer walk
Daniel Borkmann [Fri, 19 Oct 2018 13:51:03 +0000 (15:51 +0200)]
bpf, libbpf: use correct barriers in perf ring buffer walk

Given libbpf is a generic library and not restricted to x86-64 only,
the compiler barrier in bpf_perf_event_read_simple() after fetching
the head needs to be replaced with smp_rmb() at minimum. Also, writing
out the tail we should use WRITE_ONCE() to avoid store tearing.

Now that we have the logic in place in ring_buffer_read_head() and
ring_buffer_write_tail() helper also used by perf tool which would
select the correct and best variant for a given architecture (e.g.
x86-64 can avoid CPU barriers entirely), make use of these in order
to fix bpf_perf_event_read_simple().

Fixes: d0cabbb021be ("tools: bpf: move the event reading loop to libbpf")
Fixes: 39111695b1b8 ("samples: bpf: add bpf_perf_event_output example")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agotools, perf: add and use optimized ring_buffer_{read_head, write_tail} helpers
Daniel Borkmann [Fri, 19 Oct 2018 13:51:02 +0000 (15:51 +0200)]
tools, perf: add and use optimized ring_buffer_{read_head, write_tail} helpers

Currently, on x86-64, perf uses LFENCE and MFENCE (rmb() and mb(),
respectively) when processing events from the perf ring buffer which
is unnecessarily expensive as we can do more lightweight in particular
given this is critical fast-path in perf.

According to Peter rmb()/mb() were added back then via a94d342b9cb0
("tools/perf: Add required memory barriers") at a time where kernel
still supported chips that needed it, but nowadays support for these
has been ditched completely, therefore we can fix them up as well.

While for x86-64, replacing rmb() and mb() with smp_*() variants would
result in just a compiler barrier for the former and LOCK + ADD for
the latter (__sync_synchronize() uses slower MFENCE by the way), Peter
suggested we can use smp_{load_acquire,store_release}() instead for
architectures where its implementation doesn't resolve in slower smp_mb().
Thus, e.g. in x86-64 we would be able to avoid CPU barrier entirely due
to TSO. For architectures where the latter needs to use smp_mb() e.g.
on arm, we stick to cheaper smp_rmb() variant for fetching the head.

This work adds helpers ring_buffer_read_head() and ring_buffer_write_tail()
for tools infrastructure that either switches to smp_load_acquire() for
architectures where it is cheaper or uses READ_ONCE() + smp_rmb() barrier
for those where it's not in order to fetch the data_head from the perf
control page, and it uses smp_store_release() to write the data_tail.
Latter is smp_mb() + WRITE_ONCE() combination or a cheaper variant if
architecture allows for it. Those that rely on smp_rmb() and smp_mb() can
further improve performance in a follow up step by implementing the two
under tools/arch/*/include/asm/barrier.h such that they don't have to
fallback to rmb() and mb() in tools/include/asm/barrier.h.

Switch perf to use ring_buffer_read_head() and ring_buffer_write_tail()
so it can make use of the optimizations. Later, we convert libbpf as
well to use the same helpers.

Side note [0]: the topic has been raised of whether one could simply use
the C11 gcc builtins [1] for the smp_load_acquire() and smp_store_release()
instead:

  __atomic_load_n(ptr, __ATOMIC_ACQUIRE);
  __atomic_store_n(ptr, val, __ATOMIC_RELEASE);

Kernel and (presumably) tooling shipped along with the kernel has a
minimum requirement of being able to build with gcc-4.6 and the latter
does not have C11 builtins. While generally the C11 memory models don't
align with the kernel's, the C11 load-acquire and store-release alone
/could/ suffice, however. Issue is that this is implementation dependent
on how the load-acquire and store-release is done by the compiler and
the mapping of supported compilers must align to be compatible with the
kernel's implementation, and thus needs to be verified/tracked on a
case by case basis whether they match (unless an architecture uses them
also from kernel side). The implementations for smp_load_acquire() and
smp_store_release() in this patch have been adapted from the kernel side
ones to have a concrete and compatible mapping in place.

  [0] http://patchwork.ozlabs.org/patch/985422/
  [1] https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agoselftests/bpf: add missing executables to .gitignore
Anders Roxell [Fri, 19 Oct 2018 14:24:36 +0000 (16:24 +0200)]
selftests/bpf: add missing executables to .gitignore

Fixes: 371e4fcc9d96 ("selftests/bpf: cgroup local storage-based network counters")
Fixes: 370920c47b26 ("selftests/bpf: Test libbpf_{prog,attach}_type_by_name")
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agoMerge branch 'queue_stack_maps'
Alexei Starovoitov [Fri, 19 Oct 2018 20:24:31 +0000 (13:24 -0700)]
Merge branch 'queue_stack_maps'

Mauricio Vasquez says:

====================
In some applications this is needed have a pool of free elements, for
example the list of free L4 ports in a SNAT.  None of the current maps allow
to do it as it is not possible to get any element without having they key
it is associated to, even if it were possible, the lack of locking mecanishms in
eBPF would do it almost impossible to be implemented without data races.

This patchset implements two new kind of eBPF maps: queue and stack.
Those maps provide to eBPF programs the peek, push and pop operations, and for
userspace applications a new bpf_map_lookup_and_delete_elem() is added.

Signed-off-by: Mauricio Vasquez B <mauricio.vasquez@polito.it>
v2 -> v3:
 - Remove "almost dead code" in syscall.c
 - Remove unnecessary copy_from_user in bpf_map_lookup_and_delete_elem
 - Rebase

v1 -> v2:
 - Put ARG_PTR_TO_UNINIT_MAP_VALUE logic into a separated patch
 - Fix missing __this_cpu_dec & preempt_enable calls in kernel/bpf/syscall.c

RFC v4 -> v1:
 - Remove roundup to power of 2 in memory allocation
 - Remove count and use a free slot to check if queue/stack is empty
 - Use if + assigment for wrapping indexes
 - Fix some minor style issues
 - Squash two patches together

RFC v3 -> RFC v4:
 - Revert renaming of kernel/bpf/stackmap.c
 - Remove restriction on value size
 - Remove len arguments from peek/pop helpers
 - Add new ARG_PTR_TO_UNINIT_MAP_VALUE

RFC v2 -> RFC v3:
 - Return elements by value instead that by reference
 - Implement queue/stack base on array and head + tail indexes
 - Rename stack trace related files to avoid confusion and conflicts

RFC v1 -> RFC v2:
 - Create two separate maps instead of single one + flags
 - Implement bpf_map_lookup_and_delete syscall
 - Support peek operation
 - Define replacement policy through flags in the update() method
 - Add eBPF side tests
====================

Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agoselftests/bpf: add test cases for queue and stack maps
Mauricio Vasquez B [Thu, 18 Oct 2018 13:16:41 +0000 (15:16 +0200)]
selftests/bpf: add test cases for queue and stack maps

test_maps:
Tests that queue/stack maps are behaving correctly even in corner cases

test_progs:
Tests new ebpf helpers

Signed-off-by: Mauricio Vasquez B <mauricio.vasquez@polito.it>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agoSync uapi/bpf.h to tools/include
Mauricio Vasquez B [Thu, 18 Oct 2018 13:16:36 +0000 (15:16 +0200)]
Sync uapi/bpf.h to tools/include

Sync both files.

Signed-off-by: Mauricio Vasquez B <mauricio.vasquez@polito.it>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agobpf: add MAP_LOOKUP_AND_DELETE_ELEM syscall
Mauricio Vasquez B [Thu, 18 Oct 2018 13:16:30 +0000 (15:16 +0200)]
bpf: add MAP_LOOKUP_AND_DELETE_ELEM syscall

The previous patch implemented a bpf queue/stack maps that
provided the peek/pop/push functions.  There is not a direct
relationship between those functions and the current maps
syscalls, hence a new MAP_LOOKUP_AND_DELETE_ELEM syscall is added,
this is mapped to the pop operation in the queue/stack maps
and it is still to implement in other kind of maps.

Signed-off-by: Mauricio Vasquez B <mauricio.vasquez@polito.it>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agobpf: add queue and stack maps
Mauricio Vasquez B [Thu, 18 Oct 2018 13:16:25 +0000 (15:16 +0200)]
bpf: add queue and stack maps

Queue/stack maps implement a FIFO/LIFO data storage for ebpf programs.
These maps support peek, pop and push operations that are exposed to eBPF
programs through the new bpf_map[peek/pop/push] helpers.  Those operations
are exposed to userspace applications through the already existing
syscalls in the following way:

BPF_MAP_LOOKUP_ELEM            -> peek
BPF_MAP_LOOKUP_AND_DELETE_ELEM -> pop
BPF_MAP_UPDATE_ELEM            -> push

Queue/stack maps are implemented using a buffer, tail and head indexes,
hence BPF_F_NO_PREALLOC is not supported.

As opposite to other maps, queue and stack do not use RCU for protecting
maps values, the bpf_map[peek/pop] have a ARG_PTR_TO_UNINIT_MAP_VALUE
argument that is a pointer to a memory zone where to save the value of a
map.  Basically the same as ARG_PTR_TO_UNINIT_MEM, but the size has not
be passed as an extra argument.

Our main motivation for implementing queue/stack maps was to keep track
of a pool of elements, like network ports in a SNAT, however we forsee
other use cases, like for exampling saving last N kernel events in a map
and then analysing from userspace.

Signed-off-by: Mauricio Vasquez B <mauricio.vasquez@polito.it>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agobpf/verifier: add ARG_PTR_TO_UNINIT_MAP_VALUE
Mauricio Vasquez B [Thu, 18 Oct 2018 13:16:20 +0000 (15:16 +0200)]
bpf/verifier: add ARG_PTR_TO_UNINIT_MAP_VALUE

ARG_PTR_TO_UNINIT_MAP_VALUE argument is a pointer to a memory zone
used to save the value of a map.  Basically the same as
ARG_PTR_TO_UNINIT_MEM, but the size has not be passed as an extra
argument.

This will be used in the following patch that implements some new
helpers that receive a pointer to be filled with a map value.

Signed-off-by: Mauricio Vasquez B <mauricio.vasquez@polito.it>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agobpf/syscall: allow key to be null in map functions
Mauricio Vasquez B [Thu, 18 Oct 2018 13:16:14 +0000 (15:16 +0200)]
bpf/syscall: allow key to be null in map functions

This commit adds the required logic to allow key being NULL
in case the key_size of the map is 0.

A new __bpf_copy_key function helper only copies the key from
userpsace when key_size != 0, otherwise it enforces that key must be
null.

Signed-off-by: Mauricio Vasquez B <mauricio.vasquez@polito.it>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agobpf: rename stack trace map operations
Mauricio Vasquez B [Thu, 18 Oct 2018 13:16:09 +0000 (15:16 +0200)]
bpf: rename stack trace map operations

In the following patches queue and stack maps (FIFO and LIFO
datastructures) will be implemented.  In order to avoid confusion and
a possible name clash rename stack_map_ops to stack_trace_map_ops

Signed-off-by: Mauricio Vasquez B <mauricio.vasquez@polito.it>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
David S. Miller [Fri, 19 Oct 2018 18:03:06 +0000 (11:03 -0700)]
Merge git://git./linux/kernel/git/davem/net

net/sched/cls_api.c has overlapping changes to a call to
nlmsg_parse(), one (from 'net') added rtm_tca_policy instead of NULL
to the 5th argument, and another (from 'net-next') added cb->extack
instead of NULL to the 6th argument.

net/ipv4/ipmr_base.c is a case of a bug fix in 'net' being done to
code which moved (to mr_table_dump)) in 'net-next'.  Thanks to David
Ahern for the heads up.

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoRevert "bond: take rcu lock in netpoll_send_skb_on_dev"
David S. Miller [Fri, 19 Oct 2018 17:45:08 +0000 (10:45 -0700)]
Revert "bond: take rcu lock in netpoll_send_skb_on_dev"

This reverts commit 6fe9487892b32cb1c8b8b0d552ed7222a527fe30.

It is causing more serious regressions than the RCU warning
it is fixing.

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agorocker: Drop pointless static qualifier
YueHaibing [Fri, 19 Oct 2018 12:02:59 +0000 (12:02 +0000)]
rocker: Drop pointless static qualifier

There is no need to have the 'struct rocker_desc_info *desc_info'
variable static since new value always be assigned before use it.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge tag 'usb-4.19-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Greg Kroah-Hartman [Fri, 19 Oct 2018 17:25:44 +0000 (19:25 +0200)]
Merge tag 'usb-4.19-final' of git://git./linux/kernel/git/gregkh/usb

I wrote:
  "USB fixes for 4.19-final

   Here are a small number of last-minute USB driver fixes

   Included here are:
     - spectre fix for usb storage gadgets
     - xhci fixes
     - cdc-acm fixes
     - usbip fixes for reported problems

   All of these have been in linux-next with no reported issues."

* tag 'usb-4.19-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: gadget: storage: Fix Spectre v1 vulnerability
  USB: fix the usbfs flag sanitization for control transfers
  usb: xhci: pci: Enable Intel USB role mux on Apollo Lake platforms
  usb: roles: intel_xhci: Fix Unbalanced pm_runtime_enable
  cdc-acm: correct counting of UART states in serial state notification
  cdc-acm: do not reset notification buffer index upon urb unlinking
  cdc-acm: fix race between reset and control messaging
  usb: usbip: Fix BUG: KASAN: slab-out-of-bounds in vhci_hub_control()
  selftests: usbip: add wait after attach and before checking port status

5 years agoMerge tag 'for-linus-20181019' of git://git.kernel.dk/linux-block
Greg Kroah-Hartman [Fri, 19 Oct 2018 16:51:07 +0000 (18:51 +0200)]
Merge tag 'for-linus-20181019' of git://git.kernel.dk/linux-block

Jens writes:
  "Block fixes for 4.19-final

   Two small fixes that should go into this release."

* tag 'for-linus-20181019' of git://git.kernel.dk/linux-block:
  block: don't deal with discard limit in blkdev_issue_discard()
  nvme: remove ns sibling before clearing path

5 years agoRevert "netfilter: xt_quota: fix the behavior of xt_quota module"
Pablo Neira Ayuso [Fri, 19 Oct 2018 09:48:24 +0000 (11:48 +0200)]
Revert "netfilter: xt_quota: fix the behavior of xt_quota module"

This reverts commit e9837e55b0200da544a095a1fca36efd7fd3ba30.

When talking to Maze and Chenbo, we agreed to keep this back by now
due to problems in the ruleset listing path with 32-bit arches.

Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
5 years agonetfilter: nfnetlink_log: remove empty nfnetlink_log.h header file
Taehee Yoo [Thu, 18 Oct 2018 13:29:59 +0000 (22:29 +0900)]
netfilter: nfnetlink_log: remove empty nfnetlink_log.h header file

/include/net/netfilter/nfnetlink_log.h file is empty.
so that it can be removed.

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
5 years agonetfilter: remove two unused variables.
Weongyo Jeong [Wed, 17 Oct 2018 12:45:17 +0000 (21:45 +0900)]
netfilter: remove two unused variables.

nft_dup_netdev_ingress_ops and nft_fwd_netdev_ingress_ops variables are
no longer used at the code.

Signed-off-by: Weongyo Jeong <weongyo.linux@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
5 years agonetfilter: nf_flow_table: remove unnecessary parameter of nf_flow_table_cleanup()
Taehee Yoo [Thu, 11 Oct 2018 18:01:54 +0000 (03:01 +0900)]
netfilter: nf_flow_table: remove unnecessary parameter of nf_flow_table_cleanup()

parameter net of nf_flow_table_cleanup() is not used.
So that it can be removed.

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
5 years agodrm/sun4i: Fix an ulong overflow in the dotclock driver
Boris Brezillon [Thu, 18 Oct 2018 10:02:50 +0000 (12:02 +0200)]
drm/sun4i: Fix an ulong overflow in the dotclock driver

The calculated ideal rate can easily overflow an unsigned long, thus
making the best div selection buggy as soon as no ideal match is found
before the overflow occurs.

Fixes: 4731a72df273 ("drm/sun4i: request exact rates to our parents")
Cc: <stable@vger.kernel.org>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Acked-by: Maxime Ripard <maxime.ripard@bootlin.com>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181018100250.12565-1-boris.brezillon@bootlin.com
5 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Greg Kroah-Hartman [Fri, 19 Oct 2018 07:16:20 +0000 (09:16 +0200)]
Merge git://git./linux/kernel/git/davem/net

David writes:
  "Networking

   1) Fix gro_cells leak in xfrm layer, from Li RongQing.

   2) BPF selftests change RLIMIT_MEMLOCK blindly, don't do that.  From
      Eric Dumazet.

   3) AF_XDP calls synchronize_net() under RCU lock, fix from Björn
      Töpel.

   4) Out of bounds packet access in _decode_session6(), from Alexei
      Starovoitov.

   5) Several ethtool bugs, where we copy a struct into the kernel twice
      and our validations of the values in the first copy can be
      invalidated by the second copy due to asynchronous updates to the
      memory by the user.  From Wenwen Wang.

   6) Missing netlink attribute validation in cls_api, from Davide
      Caratti.

   7) LLC SAP sockets neet to be SOCK_RCU FREE, from Cong Wang.

   8) rxrpc operates on wrong kvec, from Yue Haibing.

   9) A regression was introduced by the disassosciation of route
      neighbour references in rt6_probe(), causing probe for
      neighbourless routes to not be properly rate limited.  Fix from
      Sabrina Dubroca.

   10) Unsafe RCU locking in tipc, from Tung Nguyen.

   11) Use after free in inet6_mc_check(), from Eric Dumazet.

   12) PMTU from icmp packets should update the SCTP transport pathmtu,
       from Xin Long.

   13) Missing peer put on error in rxrpc, from David Howells.

   14) Fix pedit in nfp driver, from Pieter Jansen van Vuuren.

   15) Fix overflowing shift statement in qla3xxx driver, from Nathan
       Chancellor.

   16) Fix Spectre v1 in ptp code, from Gustavo A. R. Silva.

   17) udp6_unicast_rcv_skb() interprets udpv6_queue_rcv_skb() return
       value in an inverted manner, fix from Paolo Abeni.

   18) Fix missed unresolved entries in ipmr dumps, from Nikolay
       Aleksandrov.

   19) Fix NAPI handling under high load, we can completely miss events
       when NAPI has to loop more than one time in a cycle.  From Heiner
       Kallweit."

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (49 commits)
  ip6_tunnel: Fix encapsulation layout
  tipc: fix info leak from kernel tipc_event
  net: socket: fix a missing-check bug
  net: sched: Fix for duplicate class dump
  r8169: fix NAPI handling under high load
  net: ipmr: fix unresolved entry dumps
  net: mscc: ocelot: Fix comment in ocelot_vlant_wait_for_completion()
  sctp: fix the data size calculation in sctp_data_size
  virtio_net: avoid using netif_tx_disable() for serializing tx routine
  udp6: fix encap return code for resubmitting
  mlxsw: core: Fix use-after-free when flashing firmware during init
  sctp: not free the new asoc when sctp_wait_for_connect returns err
  sctp: fix race on sctp_id2asoc
  r8169: re-enable MSI-X on RTL8168g
  net: bpfilter: use get_pid_task instead of pid_task
  ptp: fix Spectre v1 vulnerability
  net: qla3xxx: Remove overflowing shift statement
  geneve, vxlan: Don't set exceptions if skb->len < mtu
  geneve, vxlan: Don't check skb_dst() twice
  sctp: get pr_assoc and pr_stream all status with SCTP_PR_SCTP_ALL instead
  ...

5 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Greg Kroah-Hartman [Fri, 19 Oct 2018 07:15:12 +0000 (09:15 +0200)]
Merge git://git./linux/kernel/git/davem/sparc

David writes:
  "Sparc fixes:

   The main bit here is fixing how fallback system calls are handled in
   the sparc vDSO.

   Unfortunately, I fat fingered the commit and some perf debugging
   hacks slipped into the vDSO fix, which I revert in the very next
   commit."

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc: Revert unintended perf changes.
  sparc: vDSO: Silence an uninitialized variable warning
  sparc: Fix syscall fallback bugs in VDSO.

5 years agoMerge tag 'drm-fixes-2018-10-19' of git://anongit.freedesktop.org/drm/drm
Greg Kroah-Hartman [Fri, 19 Oct 2018 06:31:22 +0000 (08:31 +0200)]
Merge tag 'drm-fixes-2018-10-19' of git://anongit.freedesktop.org/drm/drm

Dave writes:
  "drm fixes for 4.19 final

   Just a last set of misc core fixes for final.

   4 fixes, one use after free, one fb integration fix, one EDID fix,
   and one laptop panel quirk,"

* tag 'drm-fixes-2018-10-19' of git://anongit.freedesktop.org/drm/drm:
  drm/edid: VSDB yCBCr420 Deep Color mode bit definitions
  drm: fix use of freed memory in drm_mode_setcrtc
  drm: fb-helper: Reject all pixel format changing requests
  drm/edid: Add 6 bpc quirk for BOE panel in HP Pavilion 15-n233sl

5 years agoMerge tag 'for-gkh' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Greg Kroah-Hartman [Fri, 19 Oct 2018 06:30:35 +0000 (08:30 +0200)]
Merge tag 'for-gkh' of git://git./linux/kernel/git/rdma/rdma

Doug writes:
  "Really final for-rc pull request for 4.19

   Ok, so last week I thought we had sent our final pull request for
   4.19.  Well, wouldn't ya know someone went and found a couple Spectre
   v1 fixes were needed :-/.  So, a couple *very* small specter patches
   for this (hopefully) final -rc week."

* tag 'for-gkh' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/ucma: Fix Spectre v1 vulnerability
  IB/ucm: Fix Spectre v1 vulnerability

5 years agox86/swiotlb: Enable swiotlb for > 4GiG RAM on 32-bit kernels
Christoph Hellwig [Sun, 14 Oct 2018 07:52:08 +0000 (09:52 +0200)]
x86/swiotlb: Enable swiotlb for > 4GiG RAM on 32-bit kernels

We already build the swiotlb code for 32-bit kernels with PAE support,
but the code to actually use swiotlb has only been enabled for 64-bit
kernels for an unknown reason.

Before Linux v4.18 we paper over this fact because the networking code,
the SCSI layer and some random block drivers implemented their own
bounce buffering scheme.

[ mingo: Changelog fixes. ]

Fixes: 21e07dba9fb1 ("scsi: reduce use of block bounce buffers")
Fixes: ab74cfebafa3 ("net: remove the PCI_DMA_BUS_IS_PHYS check in illegal_highdma")
Reported-by: Matthew Whitehead <tedheadster@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Matthew Whitehead <tedheadster@gmail.com>
Cc: konrad.wilk@oracle.com
Cc: iommu@lists.linux-foundation.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20181014075208.2715-1-hch@lst.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
5 years agoMerge tag 'drm-misc-fixes-2018-10-18' of git://anongit.freedesktop.org/drm/drm-misc...
Dave Airlie [Fri, 19 Oct 2018 03:51:55 +0000 (13:51 +1000)]
Merge tag 'drm-misc-fixes-2018-10-18' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

drm-misc-fixes for v4.19:
- Fix use of freed memory in drm_mode_setcrtc.
- Reject pixel format changing requests in fb helper.
- Add 6 bpc quirk for HP Pavilion 15-n233sl
- Fix VSDB yCBCr420 Deep Color mode bit definitions

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/647fe5d0-4ec5-57cc-9f23-a4836b29e278@linux.intel.com
5 years agoqed: fix spelling mistake "transcevier" -> "transceiver"
Colin Ian King [Thu, 18 Oct 2018 21:47:10 +0000 (22:47 +0100)]
qed: fix spelling mistake "transcevier" -> "transceiver"

Trivial fix to spelling mistake in DP_INFO message.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge tag 'mlx5-updates-2018-10-18' of git://git.kernel.org/pub/scm/linux/kernel...
David S. Miller [Fri, 19 Oct 2018 00:01:12 +0000 (17:01 -0700)]
Merge tag 'mlx5-updates-2018-10-18' of git://git./linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2018-10-18

This series provides misc updates to mlx5 core and netdevice driver.

1) From Tariq Toukan: Refactor fragmented buffer struct fields and init flow.

2) From Vlad Buslov, Flow counters cache improvements and fixes follow up.
as a follow up work for the previous series of the mlx5 flow counters,
Vlad provides two fixes:
  2.1) Take fs_counters dellist before addlist
Fixes: 6e5e22839136 ("net/mlx5: Add new list to store deleted flow counters")
  2.2) Remove counter from idr after removing it from list
Fixes: 12d6066c3b29 ("net/mlx5: Add flow counters idr")
From Shay Agroskin,
3) Add FEC set/get FW commands and FEC ethtool callbacks support
4) Add new ethtool statistics to cover errors on rx, such as FEC errors.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoip6_tunnel: Fix encapsulation layout
Stefano Brivio [Thu, 18 Oct 2018 19:25:07 +0000 (21:25 +0200)]
ip6_tunnel: Fix encapsulation layout

Commit 058214a4d1df ("ip6_tun: Add infrastructure for doing
encapsulation") added the ip6_tnl_encap() call in ip6_tnl_xmit(), before
the call to ipv6_push_frag_opts() to append the IPv6 Tunnel Encapsulation
Limit option (option 4, RFC 2473, par. 5.1) to the outer IPv6 header.

As long as the option didn't actually end up in generated packets, this
wasn't an issue. Then commit 89a23c8b528b ("ip6_tunnel: Fix missing tunnel
encapsulation limit option") fixed sending of this option, and the
resulting layout, e.g. for FoU, is:

.-------------------.------------.----------.-------------------.----- - -
| Outer IPv6 Header | UDP header | Option 4 | Inner IPv6 Header | Payload
'-------------------'------------'----------'-------------------'----- - -

Needless to say, FoU and GUE (at least) won't work over IPv6. The option
is appended by default, and I couldn't find a way to disable it with the
current iproute2.

Turn this into a more reasonable:

.-------------------.----------.------------.-------------------.----- - -
| Outer IPv6 Header | Option 4 | UDP header | Inner IPv6 Header | Payload
'-------------------'----------'------------'-------------------'----- - -

With this, and with 84dad55951b0 ("udp6: fix encap return code for
resubmitting"), FoU and GUE work again over IPv6.

Fixes: 058214a4d1df ("ip6_tun: Add infrastructure for doing encapsulation")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotcp: fix TCP_REPAIR xmit queue setup
Eric Dumazet [Thu, 18 Oct 2018 16:12:19 +0000 (09:12 -0700)]
tcp: fix TCP_REPAIR xmit queue setup

Andrey reported the following warning triggered while running CRIU tests:

tcp_clean_rtx_queue()
...
last_ackt = tcp_skb_timestamp_us(skb);
WARN_ON_ONCE(last_ackt == 0);

This is caused by 5f6188a8003d ("tcp: do not change tcp_wstamp_ns
in tcp_mstamp_refresh"), as we end up having skbs in retransmit queue
with a zero skb->skb_mstamp_ns field.

We could fix this bug in different ways, like making sure
tp->tcp_wstamp_ns is not zero at socket creation, but as Neal pointed
out, we also do not want that pacing status of a repaired socket
could push tp->tcp_wstamp_ns far ahead in the future.

So we prefer changing tcp_write_xmit() to not call tcp_update_skb_after_send()
and instead do what is requested by TCP_REPAIR logic.

Fixes: 5f6188a8003d ("tcp: do not change tcp_wstamp_ns in tcp_mstamp_refresh")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotipc: fix info leak from kernel tipc_event
Jon Maloy [Thu, 18 Oct 2018 15:38:29 +0000 (17:38 +0200)]
tipc: fix info leak from kernel tipc_event

We initialize a struct tipc_event allocated on the kernel stack to
zero to avert info leak to user space.

Reported-by: syzbot+057458894bc8cada4dee@syzkaller.appspotmail.com
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet-next/hinic: add checksum offload and TSO support
Zhao Chen [Thu, 18 Oct 2018 15:02:51 +0000 (15:02 +0000)]
net-next/hinic: add checksum offload and TSO support

This patch adds checksum offload and TSO support for the HiNIC
driver. Perfomance test (Iperf) shows more than 100% improvement
in TCP streams.

Signed-off-by: Zhao Chen <zhaochen6@huawei.com>
Signed-off-by: Xue Chaojing <xuechaojing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: socket: fix a missing-check bug
Wenwen Wang [Thu, 18 Oct 2018 14:36:46 +0000 (09:36 -0500)]
net: socket: fix a missing-check bug

In ethtool_ioctl(), the ioctl command 'ethcmd' is checked through a switch
statement to see whether it is necessary to pre-process the ethtool
structure, because, as mentioned in the comment, the structure
ethtool_rxnfc is defined with padding. If yes, a user-space buffer 'rxnfc'
is allocated through compat_alloc_user_space(). One thing to note here is
that, if 'ethcmd' is ETHTOOL_GRXCLSRLALL, the size of the buffer 'rxnfc' is
partially determined by 'rule_cnt', which is actually acquired from the
user-space buffer 'compat_rxnfc', i.e., 'compat_rxnfc->rule_cnt', through
get_user(). After 'rxnfc' is allocated, the data in the original user-space
buffer 'compat_rxnfc' is then copied to 'rxnfc' through copy_in_user(),
including the 'rule_cnt' field. However, after this copy, no check is
re-enforced on 'rxnfc->rule_cnt'. So it is possible that a malicious user
race to change the value in the 'compat_rxnfc->rule_cnt' between these two
copies. Through this way, the attacker can bypass the previous check on
'rule_cnt' and inject malicious data. This can cause undefined behavior of
the kernel and introduce potential security risk.

This patch avoids the above issue via copying the value acquired by
get_user() to 'rxnfc->rule_cn', if 'ethcmd' is ETHTOOL_GRXCLSRLALL.

Signed-off-by: Wenwen Wang <wang6495@umn.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agocxgb4: fix the error path of cxgb4_uld_register()
Ganesh Goudar [Thu, 18 Oct 2018 14:04:19 +0000 (19:34 +0530)]
cxgb4: fix the error path of cxgb4_uld_register()

On multi adapter setup if the uld registration fails even on
one adapter, the allocated resources for the uld on all the
adapters are freed, rendering the functioning adapters unusable.

This commit fixes the issue by freeing the allocated resources
only for the failed adapter.

Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: sched: Fix for duplicate class dump
Phil Sutter [Thu, 18 Oct 2018 08:34:26 +0000 (10:34 +0200)]
net: sched: Fix for duplicate class dump

When dumping classes by parent, kernel would return classes twice:

| # tc qdisc add dev lo root prio
| # tc class show dev lo
| class prio 8001:1 parent 8001:
| class prio 8001:2 parent 8001:
| class prio 8001:3 parent 8001:
| # tc class show dev lo parent 8001:
| class prio 8001:1 parent 8001:
| class prio 8001:2 parent 8001:
| class prio 8001:3 parent 8001:
| class prio 8001:1 parent 8001:
| class prio 8001:2 parent 8001:
| class prio 8001:3 parent 8001:

This comes from qdisc_match_from_root() potentially returning the root
qdisc itself if its handle matched. Though in that case, root's classes
were already dumped a few lines above.

Fixes: cb395b2010879 ("net: sched: optimize class dumps")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>