summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-05-22qed: Fix mask for physical address in ILT entryShahed Shaikh
ILT entry requires 12 bit right shifted physical address. Existing mask for ILT entry of physical address i.e. ILT_ENTRY_PHY_ADDR_MASK is not sufficient to handle 64bit address because upper 8 bits of 64 bit address were getting masked which resulted in completer abort error on PCIe bus due to invalid address. Fix that mask to handle 64bit physical address. Fixes: fe56b9e6a8d9 ("qed: Add module with basic common support") Signed-off-by: Shahed Shaikh <shahed.shaikh@cavium.com> Signed-off-by: Ariel Elior <ariel.elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22ipmr: properly check rhltable_init() return valueEric Dumazet
commit 8fb472c09b9d ("ipmr: improve hash scalability") added a call to rhltable_init() without checking its return value. This problem was then later copied to IPv6 and factorized in commit 0bbbf0e7d0e7 ("ipmr, ip6mr: Unite creation of new mr_table") kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 1 PID: 31552 Comm: syz-executor7 Not tainted 4.17.0-rc5+ #60 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:rht_key_hashfn include/linux/rhashtable.h:277 [inline] RIP: 0010:__rhashtable_lookup include/linux/rhashtable.h:630 [inline] RIP: 0010:rhltable_lookup include/linux/rhashtable.h:716 [inline] RIP: 0010:mr_mfc_find_parent+0x2ad/0xbb0 net/ipv4/ipmr_base.c:63 RSP: 0018:ffff8801826aef70 EFLAGS: 00010203 RAX: 0000000000000001 RBX: 0000000000000001 RCX: ffffc90001ea0000 RDX: 0000000000000079 RSI: ffffffff8661e859 RDI: 000000000000000c RBP: ffff8801826af1c0 R08: ffff8801b2212000 R09: ffffed003b5e46c2 R10: ffffed003b5e46c2 R11: ffff8801daf23613 R12: dffffc0000000000 R13: ffff8801826af198 R14: ffff8801cf8225c0 R15: ffff8801826af658 FS: 00007ff7fa732700(0000) GS:ffff8801daf00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000003ffffff9c CR3: 00000001b0210000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ip6mr_cache_find_parent net/ipv6/ip6mr.c:981 [inline] ip6mr_mfc_delete+0x1fe/0x6b0 net/ipv6/ip6mr.c:1221 ip6_mroute_setsockopt+0x15c6/0x1d70 net/ipv6/ip6mr.c:1698 do_ipv6_setsockopt.isra.9+0x422/0x4660 net/ipv6/ipv6_sockglue.c:163 ipv6_setsockopt+0xbd/0x170 net/ipv6/ipv6_sockglue.c:922 rawv6_setsockopt+0x59/0x140 net/ipv6/raw.c:1060 sock_common_setsockopt+0x9a/0xe0 net/core/sock.c:3039 __sys_setsockopt+0x1bd/0x390 net/socket.c:1903 __do_sys_setsockopt net/socket.c:1914 [inline] __se_sys_setsockopt net/socket.c:1911 [inline] __x64_sys_setsockopt+0xbe/0x150 net/socket.c:1911 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x49/0xbe Fixes: 8fb472c09b9d ("ipmr: improve hash scalability") Fixes: 0bbbf0e7d0e7 ("ipmr, ip6mr: Unite creation of new mr_table") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Cc: Yuval Mintz <yuvalm@mellanox.com> Reported-by: syzbot <syzkaller@googlegroups.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22Merge branch 'net-ipv6-Fix-route-append-and-replace-use-cases'David S. Miller
David Ahern says: ==================== net/ipv6: Fix route append and replace use cases This patch set fixes a few append and replace uses cases for IPv6 and adds test cases that codifies the expectations of how append and replace are expected to work. In paricular it allows a multipath route to have a dev-only nexthop, something Thomas tried to accomplish with commit edd7ceb78296 ("ipv6: Allow non-gateway ECMP for IPv6") which had to be reverted because of breakage, and to replace an existing FIB entry with a reject route. There are a number of inconsistent and surprising aspects to the Linux API for adding, deleting, replacing and changing FIB entries. For example, with IPv4 NLM_F_APPEND means insert the route after any existing entries with the same key (prefix + priority + TOS for IPv4) and NLM_F_CREATE without the append flag inserts the new route before any existing entries. IPv6 on the other hand attempts to guess whether a new route should be appended to an existing one, possibly creating a multipath route, or to add a new entry after any existing ones. This applies to both the 'append' (NLM_F_CREATE + NLM_F_APPEND) and 'prepend' (NLM_F_CREATE only) cases meaning for IPv6 the NLM_F_APPEND is basically ignored. This guessing whether the route should be added to a multipath route (gateway routes) or inserted after existing entries (non-gateway based routes) means a multipath route can not have a dev only nexthop (potentially required in some cases - tunnels or VRF route leaking for example) and route 'replace' is a bit adhoc treating gateway based routes and dev-only / reject routes differently. This has led to frustration with developers working on routing suites such as FRR where workarounds such as delete and add are used instead of replace. After this patch set there are 2 differences between IPv4 and IPv6: 1. 'ip ro prepend' = NLM_F_CREATE only IPv4 adds the new route before any existing ones IPv6 adds new route after any existing ones 2. 'ip ro append' = NLM_F_CREATE|NLM_F_APPEND IPv4 adds the new route after any existing ones IPv6 adds the nexthop to existing routes converting to multipath For the former, there are cases where we want same prefix routes added after existing ones (e.g., multicast, prefix routes for macvlan when used for virtual router redundancy). Requiring the APPEND flag to add a new route to an existing one helps here but is a slight change in behavior since prepend with gateway routes now create a separate entry. For the latter IPv6 behavior is preferred - appending a route for the same prefix and metric to make a multipath route, so really IPv4 not allowing an existing route to be updated is the limiter. This will be fixed when nexthops become separate objects - a future patch set. Thank you to Thomas and Ido for testing earlier versions of this set, and to Ido for providing an update to the mlxsw driver. Changes since RFC - cleanup wording in test script; add comments about expected failures and why ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22selftests: fib_tests: Add ipv4 route add append replace testsDavid Ahern
Add IPv4 route tests covering add, append and replace permutations. Assumes the ability to add a basic single path route works; this is required for example when adding an address to an interface. $ fib_tests.sh -t ipv4_rt IPv4 route add / append tests TEST: Attempt to add duplicate route - gw [ OK ] TEST: Attempt to add duplicate route - dev only [ OK ] TEST: Attempt to add duplicate route - reject route [ OK ] TEST: Add new nexthop for existing prefix [ OK ] TEST: Append nexthop to existing route - gw [ OK ] TEST: Append nexthop to existing route - dev only [ OK ] TEST: Append nexthop to existing route - reject route [ OK ] TEST: Append nexthop to existing reject route - gw [ OK ] TEST: Append nexthop to existing reject route - dev only [ OK ] TEST: add multipath route [ OK ] TEST: Attempt to add duplicate multipath route [ OK ] TEST: Route add with different metrics [ OK ] TEST: Route delete with metric [ OK ] IPv4 route replace tests TEST: Single path with single path [ OK ] TEST: Single path with multipath [ OK ] TEST: Single path with reject route [ OK ] TEST: Single path with single path via multipath attribute [ OK ] TEST: Invalid nexthop [ OK ] TEST: Single path - replace of non-existent route [ OK ] TEST: Multipath with multipath [ OK ] TEST: Multipath with single path [ OK ] TEST: Multipath with single path via multipath attribute [ OK ] TEST: Multipath with reject route [ OK ] TEST: Multipath - invalid first nexthop [ OK ] TEST: Multipath - invalid second nexthop [ OK ] TEST: Multipath - replace of non-existent route [ OK ] Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22selftests: fib_tests: Add ipv6 route add append replace testsDavid Ahern
Add IPv6 route tests covering add, append and replace permutations. Assumes the ability to add a basic single path route works; this is required for example when adding an address to an interface. $ fib_tests.sh -t ipv6_rt IPv6 route add / append tests TEST: Attempt to add duplicate route - gw [ OK ] TEST: Attempt to add duplicate route - dev only [ OK ] TEST: Attempt to add duplicate route - reject route [ OK ] TEST: Add new route for existing prefix (w/o NLM_F_EXCL) [ OK ] TEST: Append nexthop to existing route - gw [ OK ] TEST: Append nexthop to existing route - dev only [ OK ] TEST: Append nexthop to existing route - reject route [ OK ] TEST: Append nexthop to existing reject route - gw [ OK ] TEST: Append nexthop to existing reject route - dev only [ OK ] TEST: Add multipath route [ OK ] TEST: Attempt to add duplicate multipath route [ OK ] TEST: Route add with different metrics [ OK ] TEST: Route delete with metric [ OK ] IPv6 route replace tests TEST: Single path with single path [ OK ] TEST: Single path with multipath [ OK ] TEST: Single path with reject route [ OK ] TEST: Single path with single path via multipath attribute [ OK ] TEST: Invalid nexthop [ OK ] TEST: Single path - replace of non-existent route [ OK ] TEST: Multipath with multipath [ OK ] TEST: Multipath with single path [ OK ] TEST: Multipath with single path via multipath attribute [ OK ] TEST: Multipath with reject route [ OK ] TEST: Multipath - invalid first nexthop [ OK ] TEST: Multipath - invalid second nexthop [ OK ] TEST: Multipath - replace of non-existent route [ OK ] Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22selftests: fib_tests: Add option to pause after each testDavid Ahern
Add option to pause after each test before cleanup is done. Allows user to do manual inspection or more ad-hoc testing after each test with the setup in tact. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22selftests: fib_tests: Add command line optionsDavid Ahern
Add command line options for controlling pause on fail, controlling specific tests to run and verbose mode rather than relying on environment variables. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22selftests: fib_tests: Add success-fail countsDavid Ahern
As more tests are added, it is convenient to have a tally at the end. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22net/ipv6: Simplify route replace and appending into multipath routeDavid Ahern
Bring consistency to ipv6 route replace and append semantics. Remove rt6_qualify_for_ecmp which is just guess work. It fails in 2 cases: 1. can not replace a route with a reject route. Existing code appends a new route instead of replacing the existing one. 2. can not have a multipath route where a leg uses a dev only nexthop Existing use cases affected by this change: 1. adding a route with existing prefix and metric using NLM_F_CREATE without NLM_F_APPEND or NLM_F_EXCL (ie., what iproute2 calls 'prepend'). Existing code auto-determines that the new nexthop can be appended to an existing route to create a multipath route. This change breaks that by requiring the APPEND flag for the new route to be added to an existing one. Instead the prepend just adds another route entry. 2. route replace. Existing code replaces first matching multipath route if new route is multipath capable and fallback to first matching non-ECMP route (reject or dev only route) in case one isn't available. New behavior replaces first matching route. (Thanks to Ido for spotting this one) Note: Newer iproute2 is needed to display multipath routes with a dev-only nexthop. This is due to a bug in iproute2 and parsing nexthops. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22mlxsw: spectrum_router: Add support for route appendDavid Ahern
Handle append for gateway based routes. Dev-only multipath routes will be handled by a follow on patch. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22dccp: don't free ccid2_hc_tx_sock struct in dccp_disconnect()Alexey Kodanev
Syzbot reported the use-after-free in timer_is_static_object() [1]. This can happen because the structure for the rto timer (ccid2_hc_tx_sock) is removed in dccp_disconnect(), and ccid2_hc_tx_rto_expire() can be called after that. The report [1] is similar to the one in commit 120e9dabaf55 ("dccp: defer ccid_hc_tx_delete() at dismantle time"). And the fix is the same, delay freeing ccid2_hc_tx_sock structure, so that it is freed in dccp_sk_destruct(). [1] ================================================================== BUG: KASAN: use-after-free in timer_is_static_object+0x80/0x90 kernel/time/timer.c:607 Read of size 8 at addr ffff8801bebb5118 by task syz-executor2/25299 CPU: 1 PID: 25299 Comm: syz-executor2 Not tainted 4.17.0-rc5+ #54 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1b9/0x294 lib/dump_stack.c:113 print_address_description+0x6c/0x20b mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433 timer_is_static_object+0x80/0x90 kernel/time/timer.c:607 debug_object_activate+0x2d9/0x670 lib/debugobjects.c:508 debug_timer_activate kernel/time/timer.c:709 [inline] debug_activate kernel/time/timer.c:764 [inline] __mod_timer kernel/time/timer.c:1041 [inline] mod_timer+0x4d3/0x13b0 kernel/time/timer.c:1102 sk_reset_timer+0x22/0x60 net/core/sock.c:2742 ccid2_hc_tx_rto_expire+0x587/0x680 net/dccp/ccids/ccid2.c:147 call_timer_fn+0x230/0x940 kernel/time/timer.c:1326 expire_timers kernel/time/timer.c:1363 [inline] __run_timers+0x79e/0xc50 kernel/time/timer.c:1666 run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692 __do_softirq+0x2e0/0xaf5 kernel/softirq.c:285 invoke_softirq kernel/softirq.c:365 [inline] irq_exit+0x1d1/0x200 kernel/softirq.c:405 exiting_irq arch/x86/include/asm/apic.h:525 [inline] smp_apic_timer_interrupt+0x17e/0x710 arch/x86/kernel/apic/apic.c:1052 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:863 </IRQ> ... Allocated by task 25374: save_stack+0x43/0xd0 mm/kasan/kasan.c:448 set_track mm/kasan/kasan.c:460 [inline] kasan_kmalloc+0xc4/0xe0 mm/kasan/kasan.c:553 kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490 kmem_cache_alloc+0x12e/0x760 mm/slab.c:3554 ccid_new+0x25b/0x3e0 net/dccp/ccid.c:151 dccp_hdlr_ccid+0x27/0x150 net/dccp/feat.c:44 __dccp_feat_activate+0x184/0x270 net/dccp/feat.c:344 dccp_feat_activate_values+0x3a7/0x819 net/dccp/feat.c:1538 dccp_create_openreq_child+0x472/0x610 net/dccp/minisocks.c:128 dccp_v4_request_recv_sock+0x12c/0xca0 net/dccp/ipv4.c:408 dccp_v6_request_recv_sock+0x125d/0x1f10 net/dccp/ipv6.c:415 dccp_check_req+0x455/0x6a0 net/dccp/minisocks.c:197 dccp_v4_rcv+0x7b8/0x1f3f net/dccp/ipv4.c:841 ip_local_deliver_finish+0x2e3/0xd80 net/ipv4/ip_input.c:215 NF_HOOK include/linux/netfilter.h:288 [inline] ip_local_deliver+0x1e1/0x720 net/ipv4/ip_input.c:256 dst_input include/net/dst.h:450 [inline] ip_rcv_finish+0x81b/0x2200 net/ipv4/ip_input.c:396 NF_HOOK include/linux/netfilter.h:288 [inline] ip_rcv+0xb70/0x143d net/ipv4/ip_input.c:492 __netif_receive_skb_core+0x26f5/0x3630 net/core/dev.c:4592 __netif_receive_skb+0x2c/0x1e0 net/core/dev.c:4657 process_backlog+0x219/0x760 net/core/dev.c:5337 napi_poll net/core/dev.c:5735 [inline] net_rx_action+0x7b7/0x1930 net/core/dev.c:5801 __do_softirq+0x2e0/0xaf5 kernel/softirq.c:285 Freed by task 25374: save_stack+0x43/0xd0 mm/kasan/kasan.c:448 set_track mm/kasan/kasan.c:460 [inline] __kasan_slab_free+0x11a/0x170 mm/kasan/kasan.c:521 kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528 __cache_free mm/slab.c:3498 [inline] kmem_cache_free+0x86/0x2d0 mm/slab.c:3756 ccid_hc_tx_delete+0xc3/0x100 net/dccp/ccid.c:190 dccp_disconnect+0x130/0xc66 net/dccp/proto.c:286 dccp_close+0x3bc/0xe60 net/dccp/proto.c:1045 inet_release+0x104/0x1f0 net/ipv4/af_inet.c:427 inet6_release+0x50/0x70 net/ipv6/af_inet6.c:460 sock_release+0x96/0x1b0 net/socket.c:594 sock_close+0x16/0x20 net/socket.c:1149 __fput+0x34d/0x890 fs/file_table.c:209 ____fput+0x15/0x20 fs/file_table.c:243 task_work_run+0x1e4/0x290 kernel/task_work.c:113 tracehook_notify_resume include/linux/tracehook.h:191 [inline] exit_to_usermode_loop+0x2bd/0x310 arch/x86/entry/common.c:166 prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline] syscall_return_slowpath arch/x86/entry/common.c:265 [inline] do_syscall_64+0x6ac/0x800 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe The buggy address belongs to the object at ffff8801bebb4cc0 which belongs to the cache ccid2_hc_tx_sock of size 1240 The buggy address is located 1112 bytes inside of 1240-byte region [ffff8801bebb4cc0, ffff8801bebb5198) The buggy address belongs to the page: page:ffffea0006faed00 count:1 mapcount:0 mapping:ffff8801bebb41c0 index:0xffff8801bebb5240 compound_mapcount: 0 flags: 0x2fffc0000008100(slab|head) raw: 02fffc0000008100 ffff8801bebb41c0 ffff8801bebb5240 0000000100000003 raw: ffff8801cdba3138 ffffea0007634120 ffff8801cdbaab40 0000000000000000 page dumped because: kasan: bad access detected ... ================================================================== Reported-by: syzbot+5d47e9ec91a6f15dbd6f@syzkaller.appspotmail.com Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22isdn: eicon: fix a missing-check bugWenwen Wang
In divasmain.c, the function divas_write() firstly invokes the function diva_xdi_open_adapter() to open the adapter that matches with the adapter number provided by the user, and then invokes the function diva_xdi_write() to perform the write operation using the matched adapter. The two functions diva_xdi_open_adapter() and diva_xdi_write() are located in diva.c. In diva_xdi_open_adapter(), the user command is copied to the object 'msg' from the userspace pointer 'src' through the function pointer 'cp_fn', which eventually calls copy_from_user() to do the copy. Then, the adapter number 'msg.adapter' is used to find out a matched adapter from the 'adapter_queue'. A matched adapter will be returned if it is found. Otherwise, NULL is returned to indicate the failure of the verification on the adapter number. As mentioned above, if a matched adapter is returned, the function diva_xdi_write() is invoked to perform the write operation. In this function, the user command is copied once again from the userspace pointer 'src', which is the same as the 'src' pointer in diva_xdi_open_adapter() as both of them are from the 'buf' pointer in divas_write(). Similarly, the copy is achieved through the function pointer 'cp_fn', which finally calls copy_from_user(). After the successful copy, the corresponding command processing handler of the matched adapter is invoked to perform the write operation. It is obvious that there are two copies here from userspace, one is in diva_xdi_open_adapter(), and one is in diva_xdi_write(). Plus, both of these two copies share the same source userspace pointer, i.e., the 'buf' pointer in divas_write(). Given that a malicious userspace process can race to change the content pointed by the 'buf' pointer, this can pose potential security issues. For example, in the first copy, the user provides a valid adapter number to pass the verification process and a valid adapter can be found. Then the user can modify the adapter number to an invalid number. This way, the user can bypass the verification process of the adapter number and inject inconsistent data. This patch reuses the data copied in diva_xdi_open_adapter() and passes it to diva_xdi_write(). This way, the above issues can be avoided. Signed-off-by: Wenwen Wang <wang6495@umn.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22net: fec: Add a SPDX identifierFabio Estevam
Currently there is no license information in the header of this file. The MODULE_LICENSE field contains ("GPL"), which means GNU Public License v2 or later, so add a corresponding SPDX license identifier. Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com> Acked-by: Fugang Duan <fugang.duan@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22net: fec: ptp: Switch to SPDX identifierFabio Estevam
Adopt the SPDX license identifier headers to ease license compliance management. Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com> Acked-by: Fugang Duan <fugang.duan@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22sctp: fix the issue that flags are ignored when using kernel_connectXin Long
Now sctp uses inet_dgram_connect as its proto_ops .connect, and the flags param can't be passed into its proto .connect where this flags is really needed. sctp works around it by getting flags from socket file in __sctp_connect. It works for connecting from userspace, as inherently the user sock has socket file and it passes f_flags as the flags param into the proto_ops .connect. However, the sock created by sock_create_kern doesn't have a socket file, and it passes the flags (like O_NONBLOCK) by using the flags param in kernel_connect, which calls proto_ops .connect later. So to fix it, this patch defines a new proto_ops .connect for sctp, sctp_inet_connect, which calls __sctp_connect() directly with this flags param. After this, the sctp's proto .connect can be removed. Note that sctp_inet_connect doesn't need to do some checks that are not needed for sctp, which makes thing better than with inet_dgram_connect. Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Reviewed-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-05-22arm64: fault: Don't leak data in ESR context for user fault on kernel VAPeter Maydell
If userspace faults on a kernel address, handing them the raw ESR value on the sigframe as part of the delivered signal can leak data useful to attackers who are using information about the underlying hardware fault type (e.g. translation vs permission) as a mechanism to defeat KASLR. However there are also legitimate uses for the information provided in the ESR -- notably the GCC and LLVM sanitizers use this to report whether wild pointer accesses by the application are reads or writes (since a wild write is a more serious bug than a wild read), so we don't want to drop the ESR information entirely. For faulting addresses in the kernel, sanitize the ESR. We choose to present userspace with the illusion that there is nothing mapped in the kernel's part of the address space at all, by reporting all faults as level 0 translation faults taken to EL1. These fields are safe to pass through to userspace as they depend only on the instruction that userspace used to provoke the fault: EC IL (always) ISV CM WNR (for all data aborts) All the other fields in ESR except DFSC are architecturally RES0 for an L0 translation fault taken to EL1, so can be zeroed out without confusing userspace. The illusion is not entirely perfect, as there is a tiny wrinkle where we will report an alignment fault that was not due to the memory type (for instance a LDREX to an unaligned address) as a translation fault, whereas if you do this on real unmapped memory the alignment fault takes precedence. This is not likely to trip anybody up in practice, as the only users we know of for the ESR information who care about the behaviour for kernel addresses only really want to know about the WnR bit. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-05-22i40e: use the more traditional 'i' loop variableJacob Keller
Since we no longer use i as an array index for the data variable, replace the use of 'j' with 'i' so that we match the general loop variable name. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-05-22i40e: add function doc headers for ethtool stats functionsJacob Keller
Add documentation for the i40e_get_stats_count, i40e_get_stat_strings and i40e_get_ethtool_stats explaining that the number and ordering of statistics must remain constant for a given netdevice. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-05-22i40e: update data pointer directly when copying to the bufferJacob Keller
A future patch is going to add a helper function i40e_add_ethtool_stats that will help lower the amount of boiler plate code in the i40e_get_ethtool_stats function. This conversion will take place over many patches, and the helper function will work by directly updating a reference to the data pointer. Since this would not work combined with the current method of accessing data like an array, update all the code that copies stats into the data buffer to use direct updates to the pointer instead of array accesses. This will prevent incorrect stat updates for patches in between the conversion. Similarly, when copying strings, we used a separate char *p pointer. Instead, use the data pointer directly as it's already a (u8 *) type which is the same size. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-05-22i40e: fold prefix strings directly into stat namesJacob Keller
We always prefix these stats with a fixed string, so just fold this prefix into the stat string definition. This preparatory work will make it easier to implement a helper function to copy stats and strings into the supplied buffers in a future patch. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-05-22i40e: use WARN_ONCE to replace the commented BUG_ON size checkJacob Keller
We don't really want to use BUG_ON here since that would completely crash the kernel, thus the reason we commented it out. We *can't* use BUILD_BUG_ON because at least now (a) the sizes aren't constant (we are fixing this) and (b) not all compilers are smart enough to understand that "p - data" is a constant. Instead, just use a WARN_ONCE so that the first time we end up with an incorrect size we will dump a stack trace and a message, hopefully highlighting the issues early in testing. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-05-22i40e: split i40e_get_strings() into smaller functionsJacob Keller
Split the statistic strings and private flags strings into their own separate functions to aid code readability. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-05-22i40e: always return all queue stat stringsJacob Keller
The ethtool API for obtaining device statistics is not intended to allow runtime changes in the number of statistics reported. It may *appear* this way, as there is an ability to request the number of stats using ethtool_get_set_count(). However, it is expected that this must always return the same value for invocations of the same device. If we don't satisfy this contract, and allow the number of stats to change during run time, we could cause invalid memory accesses or report the stat strings incorrectly. This is because the API for obtaining stats is to (1) get the size, (2) get the strings and finally (3) get the stats. Since these are each separate ethtool op commands, it is not possible to maintain consistency by holding the RTNL lock over the whole operation. This results in the potential for a race condition to occur where the size changed between any of the 3 calls. Avoid this issue by requiring that we always return the same value for a given device. We can check any values which remain constant for the life of the device, but must not report different sizes depending on runtime attributes. This patch specifically fixes the queue statistics to always return every queue even if it's not currently in use. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-05-22i40e: always return VEB stat stringsJacob Keller
The ethtool API for obtaining device statistics is not intended to allow runtime changes in the number of statistics reported. It may *appear* this way, as there is an ability to request the number of stats using ethtool_get_set_count(). However, it is expected that this must always return the same value for invocations of the same device. If we don't satisfy this contract, and allow the number of stats to change during run time, we could cause invalid memory accesses or report the stat strings incorrectly. This is because the API for obtaining stats is to (1) get the size, (2) get the strings and finally (3) get the stats. Since these are each separate ethtool op commands, it is not possible to maintain consistency by holding the RTNL lock over the whole operation. This results in the potential for a race condition to occur where the size changed between any of the 3 calls. Avoid this issue by requiring that we always return the same value for a given device. We can check any values which remain constant for the life of the device, but must not report different sizes depending on runtime attributes. This patch specifically fixes the VEB statistics strings to always be reported. Other issues will be fixed in future patches. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-05-22i40e: free skb after clearing lock in ptp_stopJacob Keller
Use the same logic to free the skb after clearing the Tx timestamp bit lock in i40e_ptp_stop as we use in the other locations. It is not as important here since we are not racing against a future Tx timestamp request (as we are disabling PTP at this point). However it is good to be consistent in how we approach the bit lock so that future callers don't copy the old anti-pattern. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-05-22PM / core: Fix direct_complete handling for devices with no callbacksRafael J. Wysocki
Commit 08810a4119aa (PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags) inadvertently prevented the power.direct_complete flag from being set for devices without PM callbacks and with disabled runtime PM which also prevents power.direct_complete from being set for their parents. That led to problems including a resume crash on HP ZBook 14u. Restore the previous behavior by causing power.direct_complete to be set for those devices again, but do that in a more direct way to avoid overlooking that case in the future. Link: https://bugzilla.kernel.org/show_bug.cgi?id=199693 Fixes: 08810a4119aa (PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags) Reported-by: Thomas Martitz <kugel@rockbox.org> Tested-by: Thomas Martitz <kugel@rockbox.org> Cc: 4.15+ <stable@vger.kernel.org> # 4.15+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Johan Hovold <johan@kernel.org>
2018-05-22MAINTAINERS: change Kalle as wcn36xx maintainerKalle Valo
Eugene hasn't worked on wcn36xx for some time now. Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-05-22MAINTAINERS: change Kalle as ath.ko maintainerKalle Valo
Luis hasn't worked on ath.ko for some time now. Acked-by: Luis R. Rodriguez <mcgrof@kernel.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-05-22MAINTAINERS: update Kalle's email addressKalle Valo
I switched to use my codeaurora.org address. Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-05-22cfg80211: add missing kernel-docJohannes Berg
Add the kernel-doc missed earlier. Fixes: 52539ca89f36 ("cfg80211: Expose TXQ stats and parameters to userspace") Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-05-22Merge branch 'bpf-fib-mtu-check'Daniel Borkmann
David Ahern says: ==================== Packets that exceed the egress MTU can not be forwarded in the fast path. Add IPv4 and IPv6 MTU helpers that take a FIB lookup result (versus the typical dst path) and add the calls to bpf_ipv{4,6}_fib_lookup. v2 - add ip6_mtu_from_fib6 to ipv6_stub - only call the new MTU helpers for fib lookups in XDP path; skb path uses is_skb_forwardable to determine if the packet can be sent via the egress device from the FIB lookup ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-22bpf: Add mtu checking to FIB forwarding helperDavid Ahern
Add check that egress MTU can handle packet to be forwarded. If the MTU is less than the packet length, return 0 meaning the packet is expected to continue up the stack for help - eg., fragmenting the packet or sending an ICMP. The XDP path needs to leverage the FIB entry for an MTU on the route spec or an exception entry for a given destination. The skb path lets is_skb_forwardable decide if the packet can be sent. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-22net/ipv6: Add helper to return path MTU based on fib resultDavid Ahern
Determine path MTU from a FIB lookup result. Logic is based on ip6_dst_mtu_forward plus lookup of nexthop exception. Add ip6_dst_mtu_forward to ipv6_stubs to handle access by core bpf code. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-22net/ipv4: Add helper to return path MTU based on fib resultDavid Ahern
Determine path MTU from a FIB lookup result. Logic is a distillation of ip_dst_mtu_maybe_forward. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-22Merge branch 'bpf-af-xdp-cleanups'Daniel Borkmann
Björn Töpel says: ==================== This the second follow-up set. The first four patches are uapi changes: * Removing rebind support * Getting rid of structure hole * Removing explicit cache line alignment * Stricter bind checks The last patches do some cleanups, where the umem and refcount_t changes were suggested by Daniel. * Add a missing write-barrier and use READ_ONCE for data-dependencies * Clean up umem and do proper locking * Convert atomic_t to refcount_t ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-22xsk: convert atomic_t to refcount_tBjörn Töpel
Introduce refcount_t, in favor of atomic_t. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-22xsk: simplified umem setupBjörn Töpel
As suggested by Daniel Borkmann, the umem setup code was a too defensive and complex. Here, we reduce the number of checks. Also, the memory pinning is now folded into the umem creation, and we do correct locking. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-22xsk: add missing write- and data-dependency barrierBjörn Töpel
Here, we add a missing write-barrier, and use READ_ONCE for the data-dependency barrier. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-22samples/bpf: adapt xdpsock to the new uapiBjörn Töpel
Adapt xdpsock to use the new getsockopt introduced in the previous commit. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-22xsk: remove explicit ring structure from uapiBjörn Töpel
In this commit we remove the explicit ring structure from the the uapi. It is tricky for an uapi to depend on a certain L1 cache line size, since it can differ for variants of the same architecture. Now, we let the user application determine the offsets of the producer, consumer and descriptors by asking the socket via getsockopt. A typical flow would be (Rx ring): struct xdp_mmap_offsets off; struct xdp_desc *ring; u32 *prod, *cons; void *map; ... getsockopt(fd, SOL_XDP, XDP_MMAP_OFFSETS, &off, &optlen); map = mmap(NULL, off.rx.desc + NUM_DESCS * sizeof(struct xdp_desc), PROT_READ | PROT_WRITE, MAP_SHARED | MAP_POPULATE, sfd, XDP_PGOFF_RX_RING); prod = map + off.rx.producer; cons = map + off.rx.consumer; ring = map + off.rx.desc; Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-22xsk: proper queue id check at bindMagnus Karlsson
Validate the queue id against both Rx and Tx on the netdev. Also, make sure that the queue exists at xmit time. Reported-by: Jesper Dangaard Brouer <brouer@redhat.com> Tested-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-22xsk: fill hole in struct sockaddr_xdpBjörn Töpel
Move the sxdp_flags up, avoiding a hole in the uapi structure. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-22xsk: remove rebind supportBjörn Töpel
Supporting rebind, i.e. after a successful bind the process can call bind again without closing the socket, makes the AF_XDP setup state machine more complex. Constrain the state space, by not supporting rebind. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-05-22mac80211_hwsim: Fix radio dump for radio idx 0Andrew Zaborowski
Since 6335698e24ec11e1324b916177da6721df724dd8 the radio with idx of 0 will not get dumped in HWSIM_CMD_GET_RADIO because of the last_idx checks. Offset cb->args[0] by 1 similarly to what is done in nl80211.c. Fixes: 6335698e24ec ("mac80211_hwsim: add generation count for netlink dump operation") Signed-off-by: Andrew Zaborowski <andrew.zaborowski@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-05-22cfg80211: fix NULL pointer derference when querying regdbHaim Dreyfuss
Some drivers may call this function when regdb is not initialized yet, so we need to make sure regdb is valid before trying to access it. Make sure regdb is initialized before trying to access it in reg_query_regdb_wmm() and query_regdb(). Reported-by: Eric Biggers <ebiggers3@gmail.com> Signed-off-by: Haim Dreyfuss <haim.dreyfuss@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-05-22nl80211: Fix compilationDenis Kenzior
Commit 7ea3e110f2f8ba23f330c2f702f556acd539bcb8 seems to have introduced: net/wireless/nl80211.c: In function ‘nl80211_get_station’: net/wireless/nl80211.c:4802:34: error: incompatible type for argument 1 of ‘cfg80211_sinfo_release_content’ cfg80211_sinfo_release_content(sinfo); ^~~~~ In file included from net/wireless/nl80211.c:24:0: ./include/net/cfg80211.h:5721:20: note: expected ‘struct station_info *’ but argument is of type ‘struct station_info’ static inline void cfg80211_sinfo_release_content(struct station_info *sinfo) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fixes: 7ea3e110f2f8 ("cfg80211: release station info tidstats where needed") Signed-off-by: Denis Kenzior <denkenz@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2018-05-21powerpc/64s: Add support for a store forwarding barrier at kernel entry/exitNicholas Piggin
On some CPUs we can prevent a vulnerability related to store-to-load forwarding by preventing store forwarding between privilege domains, by inserting a barrier in kernel entry and exit paths. This is known to be the case on at least Power7, Power8 and Power9 powerpc CPUs. Barriers must be inserted generally before the first load after moving to a higher privilege, and after the last store before moving to a lower privilege, HV and PR privilege transitions must be protected. Barriers are added as patch sections, with all kernel/hypervisor entry points patched, and the exit points to lower privilge levels patched similarly to the RFI flush patching. Firmware advertisement is not implemented yet, so CPU flush types are hard coded. Thanks to Michal Suchánek for bug fixes and review. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michal Suchánek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-05-22Merge branch 'drm/du/fixes' of git://linuxtv.org/pinchartl/media into drm-fixesDave Airlie
Single regression fix for rcar-du lvds * 'drm/du/fixes' of git://linuxtv.org/pinchartl/media: drm: rcar-du: lvds: Fix crash in .atomic_check when disabling connector
2018-05-21Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Two driver fixes (zfcp and target core), one information leak in sg and one build clean up" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: sg: allocate with __GFP_ZERO in sg_build_indirect() scsi: core: clean up generated file scsi_devinfo_tbl.c scsi: target: tcmu: fix error resetting qfull_time_out to default scsi: zfcp: fix infinite iteration on ERP ready list
2018-05-21Merge branch 'TI-Ethernet-driver-warnings-fixes'David S. Miller
Florian Fainelli says: ==================== TI Ethernet driver warnings fixes This patch series attempts to fix properly the warnings observed with turning on COMPILE_TEST and TI Ethernet drivers on 64-bit hosts. Since I don't have any of this hardware, please review carefully for possible breakage! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>