summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-12-29mm: add prototype for __add_to_page_cache_locked()Souptick Joarder
Otherwise it causes a gcc warning: mm/filemap.c:830:14: warning: no previous prototype for `__add_to_page_cache_locked' [-Wmissing-prototypes] A previous attempt to make this function static led to compilation errors when CONFIG_DEBUG_INFO_BTF is enabled because __add_to_page_cache_locked() is referred to by BPF code. Adding a prototype will silence the warning. Link: https://lkml.kernel.org/r/1608693702-4665-1-git-send-email-jrdr.linux@gmail.com Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Cc: Alex Shi <alex.shi@linux.alibaba.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-29checkpatch: prefer strscpy to strlcpyJoe Perches
Prefer strscpy over the deprecated strlcpy function. Link: https://lkml.kernel.org/r/19fe91084890e2c16fe56f960de6c570a93fa99b.camel@perches.com Signed-off-by: Joe Perches <joe@perches.com> Requested-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-29Revert "kbuild: avoid static_assert for genksyms"Masahiro Yamada
This reverts commit 14dc3983b5dff513a90bd5a8cc90acaf7867c3d0. Macro Elver had sent a fix proper fix earlier, and also pointed out corner cases: "I guess what you propose is simpler, but might still have corner cases where we still get warnings. In particular, if some file (for whatever reason) does not include build_bug.h and uses a raw _Static_assert(), then we still get warnings. E.g. I see 1 user of raw _Static_assert() (drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h )." I believe the raw use of _Static_assert() should be allowed, so this should be fixed in genksyms. Even after commit 14dc3983b5df ("kbuild: avoid static_assert for genksyms"), I confirmed the following test code emits the warning. ---------------->8---------------- #include <linux/export.h> _Static_assert((1 ?: 0), ""); void foo(void) { } EXPORT_SYMBOL(foo); ---------------->8---------------- WARNING: modpost: EXPORT symbol "foo" [vmlinux] version generation failed, symbol will not be versioned. Now that commit 869b91992bce ("genksyms: Ignore module scoped _Static_assert()") fixed this issue properly, the workaround should be reverted. Link: https://lkml.org/lkml/2020/12/10/845 Link: https://lkml.kernel.org/r/20201219183911.181442-1-masahiroy@kernel.org Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Cc: Marco Elver <elver@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-29mm/hugetlb: fix deadlock in hugetlb_cow error pathMike Kravetz
syzbot reported the deadlock here [1]. The issue is in hugetlb cow error handling when there are not enough huge pages for the faulting task which took the original reservation. It is possible that other (child) tasks could have consumed pages associated with the reservation. In this case, we want the task which took the original reservation to succeed. So, we unmap any associated pages in children so that they can be used by the faulting task that owns the reservation. The unmapping code needs to hold i_mmap_rwsem in write mode. However, due to commit c0d0381ade79 ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization") we are already holding i_mmap_rwsem in read mode when hugetlb_cow is called. Technically, i_mmap_rwsem does not need to be held in read mode for COW mappings as they can not share pmd's. Modifying the fault code to not take i_mmap_rwsem in read mode for COW (and other non-sharable) mappings is too involved for a stable fix. Instead, we simply drop the hugetlb_fault_mutex and i_mmap_rwsem before unmapping. This is OK as it is technically not needed. They are reacquired after unmapping as expected by calling code. Since this is done in an uncommon error path, the overhead of dropping and reacquiring mutexes is acceptable. While making changes, remove redundant BUG_ON after unmap_ref_private. [1] https://lkml.kernel.org/r/000000000000b73ccc05b5cf8558@google.com Link: https://lkml.kernel.org/r/4c5781b8-3b00-761e-c0c7-c5edebb6ec1a@oracle.com Fixes: c0d0381ade79 ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization") Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Reported-by: syzbot+5eee4145df3c15e96625@syzkaller.appspotmail.com Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Hugh Dickins <hughd@google.com> Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-29selftests/vm: fix building protection keys testHarish
Commit d8cbe8bfa7d ("tools/testing/selftests/vm: fix build error") tried to include a ARCH check for powerpc, however ARCH is not defined in the Makefile before including lib.mk. This makes test building to skip on both x86 and powerpc. Fix the arch check by replacing it using machine type as it is already defined and used in the test. Link: https://lkml.kernel.org/r/20201215100402.257376-1-harish@linux.ibm.com Fixes: d8cbe8bfa7d ("tools/testing/selftests/vm: fix build error") Signed-off-by: Harish <harish@linux.ibm.com> Reviewed-by: Sandipan Das <sandipan@linux.ibm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Sandipan Das <sandipan@linux.ibm.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-29io_uring: don't assume mm is constant across submitsJens Axboe
If we COW the identity, we assume that ->mm never changes. But this isn't true of multiple processes end up sharing the ring. Hence treat id->mm like like any other process compontent when it comes to the identity mapping. This is pretty trivial, just moving the existing grab into io_grab_identity(), and including a check for the match. Cc: stable@vger.kernel.org # 5.10 Fixes: 1e6fa5216a0e ("io_uring: COW io_identity on mismatch") Reported-by: Christian Brauner <christian.brauner@ubuntu.com>: Tested-by: Christian Brauner <christian.brauner@ubuntu.com>: Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf 2020-12-28 The following pull-request contains BPF updates for your *net* tree. There is a small merge conflict between bpf tree commit 69ca310f3416 ("bpf: Save correct stopping point in file seq iteration") and net tree commit 66ed594409a1 ("bpf/task_iter: In task_file_seq_get_next use task_lookup_next_fd_rcu"). The get_files_struct() does not exist anymore in net, so take the hunk in HEAD and add the `info->tid = curr_tid` to the error path: [...] curr_task = task_seq_get_next(ns, &curr_tid, true); if (!curr_task) { info->task = NULL; info->tid = curr_tid; return NULL; } /* set info->task and info->tid */ [...] We've added 10 non-merge commits during the last 9 day(s) which contain a total of 11 files changed, 75 insertions(+), 20 deletions(-). The main changes are: 1) Various AF_XDP fixes such as fill/completion ring leak on failed bind and fixing a race in skb mode's backpressure mechanism, from Magnus Karlsson. 2) Fix latency spikes on lockdep enabled kernels by adding a rescheduling point to BPF hashtab initialization, from Eric Dumazet. 3) Fix a splat in task iterator by saving the correct stopping point in the seq file iteration, from Jonathan Lemon. 4) Fix BPF maps selftest by adding retries in case hashtab returns EBUSY errors on update/deletes, from Andrii Nakryiko. 5) Fix BPF selftest error reporting to something more user friendly if the vmlinux BTF cannot be found, from Kamal Mostafa. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-28net: hdlc_ppp: Fix issues when mod_timer is called while timer is runningXie He
ppp_cp_event is called directly or indirectly by ppp_rx with "ppp->lock" held. It may call mod_timer to add a new timer. However, at the same time ppp_timer may be already running and waiting for "ppp->lock". In this case, there's no need for ppp_timer to continue running and it can just exit. If we let ppp_timer continue running, it may call add_timer. This causes kernel panic because add_timer can't be called with a timer pending. This patch fixes this problem. Fixes: e022c2f07ae5 ("WAN: new synchronous PPP implementation for generic HDLC.") Cc: Krzysztof Halasa <khc@pm.waw.pl> Signed-off-by: Xie He <xie.he.0141@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-28atlantic: remove architecture dependsLéo Le Bouter
This was tested on a RaptorCS Talos II with IBM POWER9 DD2.2 CPUs and an ASUS XG-C100F PCI-e card without any issue. Speeds of ~8Gbps could be attained with not-very-scientific (wget HTTP) both-ways measurements on a local network. No warning or error reported in kernel logs. The drivers seems to be portable enough for it not to be gated like such. Signed-off-by: Léo Le Bouter <lle-bout@zaclys.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-28erspan: fix version 1 check in gre_parse_header()Cong Wang
Both version 0 and version 1 use ETH_P_ERSPAN, but version 0 does not have an erspan header. So the check in gre_parse_header() is wrong, we have to distinguish version 1 from version 0. We can just check the gre header length like is_erspan_type1(). Fixes: cb73ee40b1b3 ("net: ip_gre: use erspan key field for tunnel lookup") Reported-by: syzbot+f583ce3d4ddf9836b27a@syzkaller.appspotmail.com Cc: William Tu <u9012063@gmail.com> Cc: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-28net: hns: fix return value check in __lb_other_process()Yunjian Wang
The function skb_copy() could return NULL, the return value need to be checked. Fixes: b5996f11ea54 ("net: add Hisilicon Network Subsystem basic ethernet support") Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-28net: sched: prevent invalid Scell_log shift countRandy Dunlap
Check Scell_log shift size in red_check_params() and modify all callers of red_check_params() to pass Scell_log. This prevents a shift out-of-bounds as detected by UBSAN: UBSAN: shift-out-of-bounds in ./include/net/red.h:252:22 shift exponent 72 is too large for 32-bit type 'int' Fixes: 8afa10cbe281 ("net_sched: red: Avoid illegal values") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: syzbot+97c5bd9cc81eca63d36e@syzkaller.appspotmail.com Cc: Nogah Frankel <nogahf@mellanox.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Jiri Pirko <jiri@resnulli.us> Cc: netdev@vger.kernel.org Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-28net: neighbor: fix a crash caused by mod zeroweichenchen
pneigh_enqueue() tries to obtain a random delay by mod NEIGH_VAR(p, PROXY_DELAY). However, NEIGH_VAR(p, PROXY_DELAY) migth be zero at that point because someone could write zero to /proc/sys/net/ipv4/neigh/[device]/proxy_delay after the callers check it. This patch uses prandom_u32_max() to get a random delay instead which avoids potential division by zero. Signed-off-by: weichenchen <weichen.chen@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-28ipv4: Ignore ECN bits for fib lookups in fib_compute_spec_dst()Guillaume Nault
RT_TOS() only clears one of the ECN bits. Therefore, when fib_compute_spec_dst() resorts to a fib lookup, it can return different results depending on the value of the second ECN bit. For example, ECT(0) and ECT(1) packets could be treated differently. $ ip netns add ns0 $ ip netns add ns1 $ ip link add name veth01 netns ns0 type veth peer name veth10 netns ns1 $ ip -netns ns0 link set dev lo up $ ip -netns ns1 link set dev lo up $ ip -netns ns0 link set dev veth01 up $ ip -netns ns1 link set dev veth10 up $ ip -netns ns0 address add 192.0.2.10/24 dev veth01 $ ip -netns ns1 address add 192.0.2.11/24 dev veth10 $ ip -netns ns1 address add 192.0.2.21/32 dev lo $ ip -netns ns1 route add 192.0.2.10/32 tos 4 dev veth10 src 192.0.2.21 $ ip netns exec ns1 sysctl -wq net.ipv4.icmp_echo_ignore_broadcasts=0 With TOS 4 and ECT(1), ns1 replies using source address 192.0.2.21 (ping uses -Q to set all TOS and ECN bits): $ ip netns exec ns0 ping -c 1 -b -Q 5 192.0.2.255 [...] 64 bytes from 192.0.2.21: icmp_seq=1 ttl=64 time=0.544 ms But with TOS 4 and ECT(0), ns1 replies using source address 192.0.2.11 because the "tos 4" route isn't matched: $ ip netns exec ns0 ping -c 1 -b -Q 6 192.0.2.255 [...] 64 bytes from 192.0.2.11: icmp_seq=1 ttl=64 time=0.597 ms After this patch the ECN bits don't affect the result anymore: $ ip netns exec ns0 ping -c 1 -b -Q 6 192.0.2.255 [...] 64 bytes from 192.0.2.21: icmp_seq=1 ttl=64 time=0.591 ms Fixes: 35ebf65e851c ("ipv4: Create and use fib_compute_spec_dst() helper.") Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-28net: mvpp2: fix pkt coalescing int-threshold configurationStefan Chulski
The packet coalescing interrupt threshold has separated registers for different aggregated/cpu (sw-thread). The required value should be loaded for every thread but not only for 1 current cpu. Fixes: 213f428f5056 ("net: mvpp2: add support for TX interrupts and RX queue distribution modes") Signed-off-by: Stefan Chulski <stefanc@marvell.com> Link: https://lore.kernel.org/r/1608748521-11033-1-git-send-email-stefanc@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-28Merge branch 'net-ipa-fix-some-new-build-warnings'Jakub Kicinski
Alex Elder says: ==================== net: ipa: fix some new build warnings I got a super friendly message from the Intel kernel test robot that pointed out that two patches I posted last week caused new build warnings. I already had these problems fixed in my own tree but the fix was not included in what I sent out last week. ==================== Link: https://lore.kernel.org/r/20201226213737.338928-1-elder@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-28net: ipa: don't return a value from evt_ring_command()Alex Elder
Callers of evt_ring_command() no longer care whether the command times out, and don't use what evt_ring_command() returns. Redefine that function to have void return type. Reported-by: kernel test robot <lkp@intel.com> Fixes: 428b448ee764a ("net: ipa: use state to determine event ring command success") Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-28net: ipa: don't return a value from gsi_channel_command()Alex Elder
Callers of gsi_channel_command() no longer care whether the command times out, and don't use what gsi_channel_command() returns. Redefine that function to have void return type. Reported-by: kernel test robot <lkp@intel.com> Fixes: 6ffddf3b3d182 ("net: ipa: use state to determine channel command success") Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-28Merge branch 'bnxt_en-bug-fixes'Jakub Kicinski
Michael Chan says: ==================== bnxt_en: Bug fixes. The first patch fixes recovery of fatal AER errors. The second one fixes a potential array out of bounds issue. ==================== Link: https://lore.kernel.org/r/1609096698-15009-1-git-send-email-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-28bnxt_en: Check TQM rings for maximum supported value.Michael Chan
TQM rings are hardware resources that require host context memory managed by the driver. The driver supports up to 9 TQM rings and the number of rings to use is requested by firmware during run-time. Cap this number to the maximum supported to prevent accessing beyond the array. Future firmware may request more than 9 TQM rings. Define macros to remove the magic number 9 from the C code. Fixes: ac3158cb0108 ("bnxt_en: Allocate TQM ring context memory according to fw specification.") Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-28bnxt_en: Fix AER recovery.Vasundhara Volam
A recent change skips sending firmware messages to the firmware when pci_channel_offline() is true during fatal AER error. To make this complete, we need to move the re-initialization sequence to bnxt_io_resume(), otherwise the firmware messages to re-initialize will all be skipped. In any case, it is more correct to re-initialize in bnxt_io_resume(). Also, fix the reverse x-mas tree format when defining variables in bnxt_io_slot_reset(). Fixes: b340dc680ed4 ("bnxt_en: Avoid sending firmware messages when AER error is detected.") Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-28Merge branch '1GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2020-12-23 Commit e086ba2fccda ("e1000e: disable s0ix entry and exit flows for ME systems") disabled S0ix flows for systems that have various incarnations of the i219-LM ethernet controller. This was done because of some regressions caused by an earlier commit 632fbd5eb5b0e ("e1000e: fix S0ix flows for cable connected case") with i219-LM controller. Per discussion with Intel architecture team this direction should be changed and allow S0ix flows to be used by default. This patch series includes directional changes for their conclusions in https://lkml.org/lkml/2020/12/13/15. * '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: e1000e: Export S0ix flags to ethtool Revert "e1000e: disable s0ix entry and exit flows for ME systems" e1000e: bump up timeout to wait when ME un-configures ULP mode e1000e: Only run S0ix flows if shutdown succeeded ==================== Link: https://lore.kernel.org/r/20201223233625.92519-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-28net: mptcp: cap forward allocation to 1MDavide Caratti
the following syzkaller reproducer: r0 = socket$inet_mptcp(0x2, 0x1, 0x106) bind$inet(r0, &(0x7f0000000080)={0x2, 0x4e24, @multicast2}, 0x10) connect$inet(r0, &(0x7f0000000480)={0x2, 0x4e24, @local}, 0x10) sendto$inet(r0, &(0x7f0000000100)="f6", 0xffffffe7, 0xc000, 0x0, 0x0) systematically triggers the following warning: WARNING: CPU: 2 PID: 8618 at net/core/stream.c:208 sk_stream_kill_queues+0x3fa/0x580 Modules linked in: CPU: 2 PID: 8618 Comm: syz-executor Not tainted 5.10.0+ #334 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/04 RIP: 0010:sk_stream_kill_queues+0x3fa/0x580 Code: df 48 c1 ea 03 0f b6 04 02 84 c0 74 04 3c 03 7e 40 8b ab 20 02 00 00 e9 64 ff ff ff e8 df f0 81 2 RSP: 0018:ffffc9000290fcb0 EFLAGS: 00010293 RAX: ffff888011cb8000 RBX: 0000000000000000 RCX: ffffffff86eecf0e RDX: 0000000000000000 RSI: ffffffff86eecf6a RDI: 0000000000000005 RBP: 0000000000000e28 R08: ffff888011cb8000 R09: fffffbfff1f48139 R10: ffffffff8fa409c7 R11: fffffbfff1f48138 R12: ffff8880215e6220 R13: ffffffff8fa409c0 R14: ffffc9000290fd30 R15: 1ffff92000521fa2 FS: 00007f41c78f4800(0000) GS:ffff88802d000000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f95c803d088 CR3: 0000000025ed2000 CR4: 00000000000006f0 Call Trace: __mptcp_destroy_sock+0x4f5/0x8e0 mptcp_close+0x5e2/0x7f0 inet_release+0x12b/0x270 __sock_release+0xc8/0x270 sock_close+0x18/0x20 __fput+0x272/0x8e0 task_work_run+0xe0/0x1a0 exit_to_user_mode_prepare+0x1df/0x200 syscall_exit_to_user_mode+0x19/0x50 entry_SYSCALL_64_after_hwframe+0x44/0xa9 userspace programs provide arbitrarily high values of 'len' in sendmsg(): this is causing integer overflow of 'amount'. Cap forward allocation to 1 megabyte: higher values are not really useful. Suggested-by: Paolo Abeni <pabeni@redhat.com> Fixes: e93da92896bc ("mptcp: implement wmem reservation") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Link: https://lore.kernel.org/r/3334d00d8b2faecafdfab9aa593efcbf61442756.1608584474.git.dcaratti@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-28tun: fix return value when the number of iovs exceeds MAX_SKB_FRAGSYunjian Wang
Currently the tun_napi_alloc_frags() function returns -ENOMEM when the number of iovs exceeds MAX_SKB_FRAGS + 1. However this is inappropriate, we should use -EMSGSIZE instead of -ENOMEM. The following distinctions are matters: 1. the caller need to drop the bad packet when -EMSGSIZE is returned, which means meeting a persistent failure. 2. the caller can try again when -ENOMEM is returned, which means meeting a transient failure. Fixes: 90e33d459407 ("tun: enable napi_gro_frags() for TUN/TAP driver") Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Acked-by: Willem de Bruijn <willemb@google.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/1608864736-24332-1-git-send-email-wangyunjian@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-28net: ethernet: ti: cpts: fix ethtool output when no ptp_clock registeredGrygorii Strashko
The CPTS driver registers PTP PHC clock when first netif is going up and unregister it when all netif are down. Now ethtool will show: - PTP PHC clock index 0 after boot until first netif is up; - the last assigned PTP PHC clock index even if PTP PHC clock is not registered any more after all netifs are down. This patch ensures that -1 is returned by ethtool when PTP PHC clock is not registered any more. Fixes: 8a2c9a5ab4b9 ("net: ethernet: ti: cpts: rework initialization/deinitialization") Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Link: https://lore.kernel.org/r/20201224162405.28032-1-grygorii.strashko@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-28Merge tag 'for-5.11/dm-fix' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fix from Mike Snitzer: "Revert WQ_SYSFS change that broke reencryption (and all other functionality that requires reloading a dm-crypt DM table)" * tag 'for-5.11/dm-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: Revert "dm crypt: export sysfs of kcryptd workqueue"
2020-12-28Merge branch 'net-sysfs-fix-race-conditions-in-the-xps-code'Jakub Kicinski
Antoine Tenart says: ==================== net-sysfs: fix race conditions in the xps code This series fixes race conditions in the xps code, where out of bound accesses can occur when dev->num_tc is updated, triggering oops. The root cause is linked to locking issues. An explanation is given in each of the commit logs. We had a discussion on the v1 of this series about using the xps_map mutex instead of the rtnl lock. While that seemed a better compromise, v2 showed the added complexity wasn't best for fixes. So we decided to go back to v1 and use the rtnl lock. Because of this, the only differences between v1 and v3 are improvements in the commit messages. ==================== Link: https://lore.kernel.org/r/20201223212323.3603139-1-atenart@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-28net-sysfs: take the rtnl lock when accessing xps_rxqs_map and num_tcAntoine Tenart
Accesses to dev->xps_rxqs_map (when using dev->num_tc) should be protected by the rtnl lock, like we do for netif_set_xps_queue. I didn't see an actual bug being triggered, but let's be safe here and take the rtnl lock while accessing the map in sysfs. Fixes: 8af2c06ff4b1 ("net-sysfs: Add interface for Rx queue(s) map per Tx queue") Signed-off-by: Antoine Tenart <atenart@kernel.org> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-28net-sysfs: take the rtnl lock when storing xps_rxqsAntoine Tenart
Two race conditions can be triggered when storing xps rxqs, resulting in various oops and invalid memory accesses: 1. Calling netdev_set_num_tc while netif_set_xps_queue: - netif_set_xps_queue uses dev->tc_num as one of the parameters to compute the size of new_dev_maps when allocating it. dev->tc_num is also used to access the map, and the compiler may generate code to retrieve this field multiple times in the function. - netdev_set_num_tc sets dev->tc_num. If new_dev_maps is allocated using dev->tc_num and then dev->tc_num is set to a higher value through netdev_set_num_tc, later accesses to new_dev_maps in netif_set_xps_queue could lead to accessing memory outside of new_dev_maps; triggering an oops. 2. Calling netif_set_xps_queue while netdev_set_num_tc is running: 2.1. netdev_set_num_tc starts by resetting the xps queues, dev->tc_num isn't updated yet. 2.2. netif_set_xps_queue is called, setting up the map with the *old* dev->num_tc. 2.3. netdev_set_num_tc updates dev->tc_num. 2.4. Later accesses to the map lead to out of bound accesses and oops. A similar issue can be found with netdev_reset_tc. One way of triggering this is to set an iface up (for which the driver uses netdev_set_num_tc in the open path, such as bnx2x) and writing to xps_rxqs in a concurrent thread. With the right timing an oops is triggered. Both issues have the same fix: netif_set_xps_queue, netdev_set_num_tc and netdev_reset_tc should be mutually exclusive. We do that by taking the rtnl lock in xps_rxqs_store. Fixes: 8af2c06ff4b1 ("net-sysfs: Add interface for Rx queue(s) map per Tx queue") Signed-off-by: Antoine Tenart <atenart@kernel.org> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-28net-sysfs: take the rtnl lock when accessing xps_cpus_map and num_tcAntoine Tenart
Accesses to dev->xps_cpus_map (when using dev->num_tc) should be protected by the rtnl lock, like we do for netif_set_xps_queue. I didn't see an actual bug being triggered, but let's be safe here and take the rtnl lock while accessing the map in sysfs. Fixes: 184c449f91fe ("net: Add support for XPS with QoS via traffic classes") Signed-off-by: Antoine Tenart <atenart@kernel.org> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-28net-sysfs: take the rtnl lock when storing xps_cpusAntoine Tenart
Two race conditions can be triggered when storing xps cpus, resulting in various oops and invalid memory accesses: 1. Calling netdev_set_num_tc while netif_set_xps_queue: - netif_set_xps_queue uses dev->tc_num as one of the parameters to compute the size of new_dev_maps when allocating it. dev->tc_num is also used to access the map, and the compiler may generate code to retrieve this field multiple times in the function. - netdev_set_num_tc sets dev->tc_num. If new_dev_maps is allocated using dev->tc_num and then dev->tc_num is set to a higher value through netdev_set_num_tc, later accesses to new_dev_maps in netif_set_xps_queue could lead to accessing memory outside of new_dev_maps; triggering an oops. 2. Calling netif_set_xps_queue while netdev_set_num_tc is running: 2.1. netdev_set_num_tc starts by resetting the xps queues, dev->tc_num isn't updated yet. 2.2. netif_set_xps_queue is called, setting up the map with the *old* dev->num_tc. 2.3. netdev_set_num_tc updates dev->tc_num. 2.4. Later accesses to the map lead to out of bound accesses and oops. A similar issue can be found with netdev_reset_tc. One way of triggering this is to set an iface up (for which the driver uses netdev_set_num_tc in the open path, such as bnx2x) and writing to xps_cpus in a concurrent thread. With the right timing an oops is triggered. Both issues have the same fix: netif_set_xps_queue, netdev_set_num_tc and netdev_reset_tc should be mutually exclusive. We do that by taking the rtnl lock in xps_cpus_store. Fixes: 184c449f91fe ("net: Add support for XPS with QoS via traffic classes") Signed-off-by: Antoine Tenart <atenart@kernel.org> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-28CDC-NCM: remove "connected" log messageRoland Dreier
The cdc_ncm driver passes network connection notifications up to usbnet_link_change(), which is the right place for any logging. Remove the netdev_info() duplicating this from the driver itself. This stops devices such as my "TRENDnet USB 10/100/1G/2.5G LAN" (ID 20f4:e02b) adapter from spamming the kernel log with cdc_ncm 2-2:2.0 enp0s2u2c2: network connection: connected messages every 60 msec or so. Signed-off-by: Roland Dreier <roland@kernel.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20201224032116.2453938-1-roland@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-28Revert "dm crypt: export sysfs of kcryptd workqueue"Mike Snitzer
This reverts commit a2b8b2d975673b1a50ab0bcce5d146b9335edfad. WQ_SYSFS breaks the ability to reload a DM table due to sysfs kobject collision (due to active and inactive table). Given lack of demonstrated need for exposing this workqueue via sysfs: revert exposing it. Reported-by: Ignat Korchagin <ignat@cloudflare.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-12-28libceph: add __maybe_unused to DEFINE_MSGR2_FEATUREIlya Dryomov
Avoid -Wunused-const-variable warnings for "make W=1". Reported-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-28libceph: align session_key and con_secret to 16 bytesIlya Dryomov
crypto_shash_setkey() and crypto_aead_setkey() will do a (small) GFP_ATOMIC allocation to align the key if it isn't suitably aligned. It's not a big deal, but at the same time easy to avoid. The actual alignment requirement is dynamic, queryable with crypto_shash_alignmask() and crypto_aead_alignmask(), but shouldn't be stricter than 16 bytes for our algorithms. Fixes: cd1a677cad99 ("libceph, ceph: implement msgr2.1 protocol (crc and secure modes)") Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-28libceph: fix auth_signature buffer allocation in secure modeIlya Dryomov
auth_signature frame is 68 bytes in plain mode and 96 bytes in secure mode but we are requesting 68 bytes in both modes. By luck, this doesn't actually result in any invalid memory accesses because the allocation is satisfied out of kmalloc-96 slab and so exactly 96 bytes are allocated, but KASAN rightfully complains. Fixes: cd1a677cad99 ("libceph, ceph: implement msgr2.1 protocol (crc and secure modes)") Reported-by: Luis Henriques <lhenriques@suse.de> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2020-12-28ceph: reencode gid_list when reconnectingIlya Dryomov
On reconnect, cap and dentry releases are dropped and the fields that follow must be reencoded into the freed space. Currently these are timestamp and gid_list, but gid_list isn't reencoded. This results in failed to decode message of type 24 v4: End of buffer errors on the MDS. While at it, make a change to encode gid_list unconditionally, without regard to what head/which version was used as a result of checking whether CEPH_FEATURE_FS_BTIME is supported or not. URL: https://tracker.ceph.com/issues/48618 Fixes: 4f1ddb1ea874 ("ceph: implement updated ceph_mds_request_head structure") Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Jeff Layton <jlayton@kernel.org>
2020-12-28Merge branch 'for-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds
Pull workqueue update from Tejun Heo: "The same as the cgroup tree - one commit which was scheduled for the 5.11 merge window. All the commit does is avoding spurious worker wakeups from workqueue allocation / config change path to help cpuisol use cases" * 'for-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: Kick a worker based on the actual activation of delayed works
2020-12-28Merge branch 'for-5.11' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: "These three patches were scheduled for the merge window but I forgot to send them out. Sorry about that. None of them are significant and they fit well in a fix pull request too - two are cosmetic and one fixes a memory leak in the mount option parsing path" * 'for-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: Fix memory leak when parsing multiple source parameters cgroup/cgroup.c: replace 'of->kn->priv' with of_cft() kernel: cgroup: Mundane spelling fixes throughout the file
2020-12-28netfilter: nftables: add set expression flagsPablo Neira Ayuso
The set flag NFT_SET_EXPR provides a hint to the kernel that userspace supports for multiple expressions per set element. In the same direction, NFT_DYNSET_F_EXPR specifies that dynset expression defines multiple expressions per set element. This allows new userspace software with old kernels to bail out with EOPNOTSUPP. This update is similar to ef516e8625dd ("netfilter: nf_tables: reintroduce the NFT_SET_CONCAT flag"). The NFT_SET_EXPR flag needs to be set on when the NFTA_SET_EXPRESSIONS attribute is specified. The NFT_SET_EXPR flag is not set on with NFTA_SET_EXPR to retain backward compatibility in old userspace binaries. Fixes: 48b0ae046ee9 ("netfilter: nftables: netlink support for several set element expressions") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-12-28netfilter: nft_dynset: report EOPNOTSUPP on missing set featurePablo Neira Ayuso
If userspace requests a feature which is not available the original set definition, then bail out with EOPNOTSUPP. If userspace sends unsupported dynset flags (new feature not supported by this kernel), then report EOPNOTSUPP to userspace. EINVAL should be only used to report malformed netlink messages from userspace. Fixes: 22fe54d5fefc ("netfilter: nf_tables: add support for dynamic set updates") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-12-28opp: Call the missing clk_put() on errorViresh Kumar
Fix the clock reference counting by calling the missing clk_put() in the error path. Cc: v5.10 <stable@vger.kernel.org> # v5.10 Fixes: dd461cd9183f ("opp: Allow dev_pm_opp_get_opp_table() to return -EPROBE_DEFER") Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2020-12-28opp: fix memory leak in _allocate_opp_tableQuanyang Wang
In function _allocate_opp_table, opp_dev is allocated and referenced by opp_table via _add_opp_dev. But in the case that the subsequent calls return -EPROBE_DEFER, it will jump to err label and opp_table will be freed. Then opp_dev becomes an unreferenced object to cause memory leak. So let's call _remove_opp_dev to do the cleanup. This fixes the following kmemleak report: unreferenced object 0xffff000801524a00 (size 128): comm "swapper/0", pid 1, jiffies 4294892465 (age 84.616s) hex dump (first 32 bytes): 40 00 56 01 08 00 ff ff 40 00 56 01 08 00 ff ff @.V.....@.V..... b8 52 77 7f 08 00 ff ff 00 3c 4c 00 08 00 ff ff .Rw......<L..... backtrace: [<00000000b1289fb1>] kmemleak_alloc+0x30/0x40 [<0000000056da48f0>] kmem_cache_alloc+0x3d4/0x588 [<00000000a84b3b0e>] _add_opp_dev+0x2c/0x88 [<0000000062a380cd>] _add_opp_table_indexed+0x124/0x268 [<000000008b4c8f1f>] dev_pm_opp_of_add_table+0x20/0x1d8 [<00000000e5316798>] dev_pm_opp_of_cpumask_add_table+0x48/0xf0 [<00000000db0a8ec2>] dt_cpufreq_probe+0x20c/0x448 [<0000000030a3a26c>] platform_probe+0x68/0xd8 [<00000000c618e78d>] really_probe+0xd0/0x3a0 [<00000000642e856f>] driver_probe_device+0x58/0xb8 [<00000000f10f5307>] device_driver_attach+0x74/0x80 [<0000000004f254b8>] __driver_attach+0x58/0xe0 [<0000000009d5d19e>] bus_for_each_dev+0x70/0xc8 [<0000000000d22e1c>] driver_attach+0x24/0x30 [<0000000001d4e952>] bus_add_driver+0x14c/0x1f0 [<0000000089928aaa>] driver_register+0x64/0x120 Cc: v5.10 <stable@vger.kernel.org> # v5.10 Fixes: dd461cd9183f ("opp: Allow dev_pm_opp_get_opp_table() to return -EPROBE_DEFER") Signed-off-by: Quanyang Wang <quanyang.wang@windriver.com> [ Viresh: Added the stable tag ] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2020-12-27Linux 5.11-rc1v5.11-rc1Linus Torvalds
2020-12-27proc mountinfo: make splice available againLinus Torvalds
Since commit 36e2c7421f02 ("fs: don't allow splice read/write without explicit ops") we've required that file operation structures explicitly enable splice support, rather than falling back to the default handlers. Most /proc files use the indirect 'struct proc_ops' to describe their file operations, and were fixed up to support splice earlier in commits 40be821d627c..b24c30c67863, but the mountinfo files interact with the VFS directly using their own 'struct file_operations' and got missed as a result. This adds the necessary support for splice to work for /proc/*/mountinfo and friends. Reported-by: Joan Bruguera Micó <joanbrugueram@gmail.com> Reported-by: Jussi Kivilinna <jussi.kivilinna@iki.fi> Link: https://bugzilla.kernel.org/show_bug.cgi?id=209971 Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-27Merge tag 'ntb-5.11' of git://github.com/jonmason/ntbLinus Torvalds
Pull NTB fixes from Jon Mason: "Bug fix for IDT NTB and Intel NTB LTR management support" * tag 'ntb-5.11' of git://github.com/jonmason/ntb: ntb: intel: add Intel NTB LTR vendor support for gen4 NTB ntb: idt: fix error check in ntb_hw_idt.c
2020-12-27Merge branch 'linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: "Fix a number of autobuild failures due to missing Kconfig dependencies" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: qat - add CRYPTO_AES to Kconfig dependencies crypto: keembay - Add dependency on HAS_IOMEM crypto: keembay - CRYPTO_DEV_KEEMBAY_OCS_AES_SM4 should depend on ARCH_KEEMBAY
2020-12-27Merge tag 'objtool-urgent-2020-12-27' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool fix from Ingo Molnar: "Fix a segfault that occurs when built with Clang" * tag 'objtool-urgent-2020-12-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Fix seg fault with Clang non-section symbols
2020-12-27Merge tag 'locking-urgent-2020-12-27' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Ingo Molnar: "Misc fixes/updates: - Fix static keys usage in module __init sections - Add separate MAINTAINERS entry for static branches/calls - Fix lockdep splat with CONFIG_PREEMPTIRQ_EVENTS=y tracing" * tag 'locking-urgent-2020-12-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: softirq: Avoid bad tracing / lockdep interaction jump_label/static_call: Add MAINTAINERS jump_label: Fix usage in module __init
2020-12-27Merge tag 'timers-urgent-2020-12-27' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Ingo Molnar: "Update/fix two CPU sanity checks in the hotplug and the boot code, and fix a typo in the Kconfig help text. [ Context: the first two commits are the result of an ongoing annotation+review work of (intentional) tick_do_timer_cpu() data races reported by KCSAN, but the annotations aren't fully cooked yet ]" * tag 'timers-urgent-2020-12-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timekeeping: Fix spelling mistake in Kconfig "fullfill" -> "fulfill" tick/sched: Remove bogus boot "safety" check tick: Remove pointless cpu valid check in hotplug code