summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-02-14alarmtimer: Prevent starvation by small intervals and SIG_IGNThomas Gleixner
syzbot reported a RCU stall which is caused by setting up an alarmtimer with a very small interval and ignoring the signal. The reproducer arms the alarm timer with a relative expiry of 8ns and an interval of 9ns. Not a problem per se, but that's an issue when the signal is ignored because then the timer is immediately rearmed because there is no way to delay that rearming to the signal delivery path. See posix_timer_fn() and commit 58229a189942 ("posix-timers: Prevent softirq starvation by small intervals and SIG_IGN") for details. The reproducer does not set SIG_IGN explicitely, but it sets up the timers signal with SIGCONT. That has the same effect as explicitely setting SIG_IGN for a signal as SIGCONT is ignored if there is no handler set and the task is not ptraced. The log clearly shows that: [pid 5102] --- SIGCONT {si_signo=SIGCONT, si_code=SI_TIMER, si_timerid=0, si_overrun=316014, si_int=0, si_ptr=NULL} --- It works because the tasks are traced and therefore the signal is queued so the tracer can see it, which delays the restart of the timer to the signal delivery path. But then the tracer is killed: [pid 5087] kill(-5102, SIGKILL <unfinished ...> ... ./strace-static-x86_64: Process 5107 detached and after it's gone the stall can be observed: syzkaller login: [ 79.439102][ C0] hrtimer: interrupt took 68471 ns [ 184.460538][ C1] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: ... [ 184.658237][ C1] rcu: Stack dump where RCU GP kthread last ran: [ 184.664574][ C1] Sending NMI from CPU 1 to CPUs 0: [ 184.669821][ C0] NMI backtrace for cpu 0 [ 184.669831][ C0] CPU: 0 PID: 5108 Comm: syz-executor192 Not tainted 6.2.0-rc6-next-20230203-syzkaller #0 ... [ 184.670036][ C0] Call Trace: [ 184.670041][ C0] <IRQ> [ 184.670045][ C0] alarmtimer_fired+0x327/0x670 posix_timer_fn() prevents that by checking whether the interval for timers which have the signal ignored is smaller than a jiffie and artifically delay it by shifting the next expiry out by a jiffie. That's accurate vs. the overrun accounting, but slightly inaccurate vs. timer_gettimer(2). The comment in that function says what needs to be done and there was a fix available for the regular userspace induced SIG_IGN mechanism, but that did not work due to the implicit ignore for SIGCONT and similar signals. This needs to be worked on, but for now the only available workaround is to do exactly what posix_timer_fn() does: Increase the interval of self-rearming timers, which have their signal ignored, to at least a jiffie. Interestingly this has been fixed before via commit ff86bf0c65f1 ("alarmtimer: Rate limit periodic intervals") already, but that fix got lost in a later rework. Reported-by: syzbot+b9564ba6e8e00694511b@syzkaller.appspotmail.com Fixes: f2c45807d399 ("alarmtimer: Switch over to generic set/get/rearm routine") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: John Stultz <jstultz@google.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/87k00q1no2.ffs@tglx
2023-02-14x86/mtrr: Revert 90b926e68f50 ("x86/pat: Fix pat_x_mtrr_type() for MTRR ↵Juergen Gross
disabled case") Commit 90b926e68f50 ("x86/pat: Fix pat_x_mtrr_type() for MTRR disabled case") broke the use case of running Xen dom0 kernels on machines with an external disk enclosure attached via USB, see Link tag. What this commit was originally fixing - SEV-SNP guests on Hyper-V - is a more specialized situation which has other issues at the moment anyway so reverting this now and addressing the issue properly later is the prudent thing to do. So revert it in time for the 6.2 proper release. [ bp: Rewrite commit message. ] Reported-by: Christian Kujau <lists@nerdbynature.de> Tested-by: Christian Kujau <lists@nerdbynature.de> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/4fe9541e-4d4c-2b2a-f8c8-2d34a7284930@nerdbynature.de
2023-02-14net: stmmac: Restrict warning on disabling DMA store and fwd modeCristian Ciocaltea
When setting 'snps,force_thresh_dma_mode' DT property, the following warning is always emitted, regardless the status of force_sf_dma_mode: dwmac-starfive 10020000.ethernet: force_sf_dma_mode is ignored if force_thresh_dma_mode is set. Do not print the rather misleading message when DMA store and forward mode is already disabled. Fixes: e2a240c7d3bc ("driver:net:stmmac: Disable DMA store and forward mode if platform data force_thresh_dma_mode is set.") Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com> Link: https://lore.kernel.org/r/20230210202126.877548-1-cristian.ciocaltea@collabora.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-14nvme-pci: always return an ERR_PTR from nvme_pci_alloc_devIrvin Cote
Don't mix NULL and ERR_PTR returns. Fixes: 2e87570be9d2 ("nvme-pci: factor out a nvme_pci_alloc_dev helper") Signed-off-by: Irvin Cote <irvin.cote@insa-lyon.fr> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2023-02-14nvme-pci: set the DMA mask earlierChristoph Hellwig
Set the DMA mask before calling dma_addressing_limited, which depends on it. Note that this stop checking the return value of dma_set_mask_and_coherent as this function can only fail for masks < 32-bit. Fixes: 3f30a79c2e2c ("nvme-pci: set constant paramters in nvme_pci_alloc_ctrl") Reported-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Tested-by: Michael Kelley <mikelley@microsoft.com>
2023-02-13net/sched: act_ctinfo: use percpu statsPedro Tammela
The tc action act_ctinfo was using shared stats, fix it to use percpu stats since bstats_update() must be called with locks or with a percpu pointer argument. tdc results: 1..12 ok 1 c826 - Add ctinfo action with default setting ok 2 0286 - Add ctinfo action with dscp ok 3 4938 - Add ctinfo action with valid cpmark and zone ok 4 7593 - Add ctinfo action with drop control ok 5 2961 - Replace ctinfo action zone and action control ok 6 e567 - Delete ctinfo action with valid index ok 7 6a91 - Delete ctinfo action with invalid index ok 8 5232 - List ctinfo actions ok 9 7702 - Flush ctinfo actions ok 10 3201 - Add ctinfo action with duplicate index ok 11 8295 - Add ctinfo action with invalid index ok 12 3964 - Replace ctinfo action with invalid goto_chain control Fixes: 24ec483cec98 ("net: sched: Introduce act_ctinfo action") Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://lore.kernel.org/r/20230210200824.444856-1-pctammela@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-13net: stmmac: fix order of dwmac5 FlexPPS parametrization sequenceJohannes Zink
So far changing the period by just setting new period values while running did not work. The order as indicated by the publicly available reference manual of the i.MX8MP [1] indicates a sequence: * initiate the programming sequence * set the values for PPS period and start time * start the pulse train generation. This is currently not used in dwmac5_flex_pps_config(), which instead does: * initiate the programming sequence and immediately start the pulse train generation * set the values for PPS period and start time This caused the period values written not to take effect until the FlexPPS output was disabled and re-enabled again. This patch fix the order and allows the period to be set immediately. [1] https://www.nxp.com/webapp/Download?colCode=IMX8MPRM Fixes: 9a8a02c9d46d ("net: stmmac: Add Flexible PPS support") Signed-off-by: Johannes Zink <j.zink@pengutronix.de> Link: https://lore.kernel.org/r/20230210143937.3427483-1-j.zink@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-14ata: pata_octeon_cf: drop kernel-doc notationRandy Dunlap
Fix a slew of kernel-doc warnings in pata_octeon_cf.c by changing all "/**" comments to "/*" since they are not in kernel-doc format. Fixes: 3c929c6f5aa7 ("libata: New driver for OCTEON SOC Compact Flash interface (v7).") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/all/202302101722.5O56RClE-lkp@intel.com/ Cc: David Daney <ddaney@caviumnetworks.com> Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: linux-ide@vger.kernel.org Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2023-02-14ata: ahci: Add Tiger Lake UP{3,4} AHCI controllerSimon Gaiser
Mark the Tiger Lake UP{3,4} AHCI controller as "low_power". This enables S0ix to work out of the box. Otherwise this isn't working unless the user manually sets /sys/class/scsi_host/*/link_power_management_policy. Intel lists a total of 4 SATA controller IDs in [1] for those mobile PCHs. This commit just adds the "AHCI" variant since I only tested those. [1]: https://cdrdv2.intel.com/v1/dl/getContent/631119 Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com> CC: stable@vger.kernel.org Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2023-02-14ata: libata-core: Disable READ LOG DMA EXT for Samsung MZ7LHPatrick McLean
Samsung MZ7LH drives are spewing messages like this in to dmesg with AMD SATA controllers: ata1.00: exception Emask 0x0 SAct 0x7e0000 SErr 0x0 action 0x6 frozen ata1.00: failed command: SEND FPDMA QUEUED ata1.00: cmd 64/01:88:00:00:00/00:00:00:00:00/a0 tag 17 ncq dma 512 out res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout) Since this was seen previously with SSD 840 EVO drives in https://bugzilla.kernel.org/show_bug.cgi?id=203475 let's add the same fix for these drives as the EVOs have, since they likely have very similar firmwares. Signed-off-by: Patrick McLean <chutzpah@gentoo.org> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2023-02-14mmc: jz4740: Work around bug on JZ4760(B)Paul Cercueil
On JZ4760 and JZ4760B, SD cards fail to run if the maximum clock rate is set to 50 MHz, even though the controller officially does support it. Until the actual bug is found and fixed, limit the maximum clock rate to 24 MHz. Signed-off-by: Paul Cercueil <paul@crapouillou.net> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230131210229.68129-1-paul@crapouillou.net Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-02-14mmc: mmc_spi: fix error handling in mmc_spi_probe()Yang Yingliang
If mmc_add_host() fails, it doesn't need to call mmc_remove_host(), or it will cause null-ptr-deref, because of deleting a not added device in mmc_remove_host(). To fix this, goto label 'fail_glue_init', if mmc_add_host() fails, and change the label 'fail_add_host' to 'fail_gpiod_request'. Fixes: 15a0580ced08 ("mmc_spi host driver") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Cc:stable@vger.kernel.org Link: https://lore.kernel.org/r/20230131013835.3564011-1-yangyingliang@huawei.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-02-14mmc: sdio: fix possible resource leaks in some error pathsYang Yingliang
If sdio_add_func() or sdio_init_func() fails, sdio_remove_func() can not release the resources, because the sdio function is not presented in these two cases, it won't call of_node_put() or put_device(). To fix these leaks, make sdio_func_present() only control whether device_del() needs to be called or not, then always call of_node_put() and put_device(). In error case in sdio_init_func(), the reference of 'card->dev' is not get, to avoid redundant put in sdio_free_func_cis(), move the get_device() to sdio_alloc_func() and put_device() to sdio_release_func(), it can keep the get/put function be balanced. Without this patch, while doing fault inject test, it can get the following leak reports, after this fix, the leak is gone. unreferenced object 0xffff888112514000 (size 2048): comm "kworker/3:2", pid 65, jiffies 4294741614 (age 124.774s) hex dump (first 32 bytes): 00 e0 6f 12 81 88 ff ff 60 58 8d 06 81 88 ff ff ..o.....`X...... 10 40 51 12 81 88 ff ff 10 40 51 12 81 88 ff ff .@Q......@Q..... backtrace: [<000000009e5931da>] kmalloc_trace+0x21/0x110 [<000000002f839ccb>] mmc_alloc_card+0x38/0xb0 [mmc_core] [<0000000004adcbf6>] mmc_sdio_init_card+0xde/0x170 [mmc_core] [<000000007538fea0>] mmc_attach_sdio+0xcb/0x1b0 [mmc_core] [<00000000d4fdeba7>] mmc_rescan+0x54a/0x640 [mmc_core] unreferenced object 0xffff888112511000 (size 2048): comm "kworker/3:2", pid 65, jiffies 4294741623 (age 124.766s) hex dump (first 32 bytes): 00 40 51 12 81 88 ff ff e0 58 8d 06 81 88 ff ff .@Q......X...... 10 10 51 12 81 88 ff ff 10 10 51 12 81 88 ff ff ..Q.......Q..... backtrace: [<000000009e5931da>] kmalloc_trace+0x21/0x110 [<00000000fcbe706c>] sdio_alloc_func+0x35/0x100 [mmc_core] [<00000000c68f4b50>] mmc_attach_sdio.cold.18+0xb1/0x395 [mmc_core] [<00000000d4fdeba7>] mmc_rescan+0x54a/0x640 [mmc_core] Fixes: 3d10a1ba0d37 ("sdio: fix reference counting in sdio_remove_func()") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230130125808.3471254-1-yangyingliang@huawei.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-02-13mmc: meson-gx: fix SDIO mode if cap_sdio_irq isn't setHeiner Kallweit
Some SDIO WiFi modules stopped working after SDIO interrupt mode was added if cap_sdio_irq isn't set in device tree. This patch was confirmed to fix the issue. Fixes: 066ecde6d826 ("mmc: meson-gx: add SDIO interrupt support") Reported-by: Geraldo Nascimento <geraldogabriel@gmail.com> Tested-by: Geraldo Nascimento <geraldogabriel@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/816cba9f-ff92-31a2-60f0-aca542d1d13e@gmail.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-02-13Merge tag 'mm-hotfixes-stable-2023-02-13-13-50' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "Twelve hotfixes, mostly against mm/. Five of these fixes are cc:stable" * tag 'mm-hotfixes-stable-2023-02-13-13-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: of: reserved_mem: Have kmemleak ignore dynamically allocated reserved mem scripts/gdb: fix 'lx-current' for x86 lib: parser: optimize match_NUMBER apis to use local array mm: shrinkers: fix deadlock in shrinker debugfs mm: hwpoison: support recovery from ksm_might_need_to_copy() kasan: fix Oops due to missing calls to kasan_arch_is_ready() revert "squashfs: harden sanity check in squashfs_read_xattr_id_table" fsdax: dax_unshare_iter() should return a valid length mm/gup: add folio to list when folio_isolate_lru() succeed aio: fix mremap after fork null-deref mailmap: add entry for Alexander Mikhalitsyn mm: extend max struct page size for kmsan
2023-02-13ice: fix lost multicast packets in promisc modeJesse Brandeburg
There was a problem reported to us where the addition of a VF with an IPv6 address ending with a particular sequence would cause the parent device on the PF to no longer be able to respond to neighbor discovery packets. In this case, we had an ovs-bridge device living on top of a VLAN, which was on top of a PF, and it would not be able to talk anymore (the neighbor entry would expire and couldn't be restored). The root cause of the issue is that if the PF is asked to be in IFF_PROMISC mode (promiscuous mode) and it had an ipv6 address that needed the 33:33:ff:00:00:04 multicast address to work, then when the VF was added with the need for the same multicast address, the VF would steal all the traffic destined for that address. The ice driver didn't auto-subscribe a request of IFF_PROMISC to the "multicast replication from other port's traffic" meaning that it won't get for instance, packets with an exact destination in the VF, as above. The VF's IPv6 address, which adds a "perfect filter" for 33:33:ff:00:00:04, results in no packets for that multicast address making it to the PF (which is in promisc but NOT "multicast replication"). The fix is to enable "multicast promiscuous" whenever the driver is asked to enable IFF_PROMISC, and make sure to disable it when appropriate. Fixes: e94d44786693 ("ice: Implement filter sync, NDO operations and bump version") Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-02-13ice: Fix check for weight and priority of a scheduling nodeMichal Wilczynski
Currently checks for weight and priority ranges don't check incoming value from the devlink. Instead it checks node current weight or priority. This makes those checks useless. Change range checks in ice_set_object_tx_priority() and ice_set_object_tx_weight() to check against incoming priority an weight. Fixes: 42c2eb6b1f43 ("ice: Implement devlink-rate API") Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com> Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-02-13Merge tag 'platform-drivers-x86-v6.2-5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform drivers fix from Hans de Goede: "Intel vsec driver Meteor Lake PCI ids addition" * tag 'platform-drivers-x86-v6.2-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86/intel/vsec: Add support for Meteor Lake
2023-02-13drm: Disable dynamic debug as brokenVille Syrjälä
CONFIG_DRM_USE_DYNAMIC_DEBUG breaks debug prints for (at least modular) drm drivers. The debug prints can be reinstated by manually frobbing /sys/module/drm/parameters/debug after the fact, but at that point the damage is done and all debugs from driver probe are lost. This makes drivers totally undebuggable. There's a more complete fix in progress [1], with further details, but we need this fixed in stable kernels. Mark the feature as broken and disable it by default, with hopes distros follow suit and disable it as well. [1] https://lore.kernel.org/r/20230125203743.564009-1-jim.cromie@gmail.com Fixes: 84ec67288c10 ("drm_print: wrap drm_*_dbg in dyndbg descriptor factory macro") Cc: Jim Cromie <jim.cromie@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v6.1+ Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Jim Cromie <jim.cromie@gmail.com> Acked-by: Maxime Ripard <maxime@cerno.tech> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230207143337.2126678-1-jani.nikula@intel.com
2023-02-13sched/core: Fix a missed update of user_cpus_ptrWaiman Long
Since commit 8f9ea86fdf99 ("sched: Always preserve the user requested cpumask"), a successful call to sched_setaffinity() should always save the user requested cpu affinity mask in a task's user_cpus_ptr. However, when the given cpu mask is the same as the current one, user_cpus_ptr is not updated. Fix this by saving the user mask in this case too. Fixes: 8f9ea86fdf99 ("sched: Always preserve the user requested cpumask") Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20230203181849.221943-1-longman@redhat.com
2023-02-13freezer,umh: Fix call_usermode_helper_exec() vs SIGKILLPeter Zijlstra
Tetsuo-San noted that commit f5d39b020809 ("freezer,sched: Rewrite core freezer logic") broke call_usermodehelper_exec() for the KILLABLE case. Specifically it was missed that the second, unconditional, wait_for_completion() was not optional and ensures the on-stack completion is unused before going out-of-scope. Fixes: f5d39b020809 ("freezer,sched: Rewrite core freezer logic") Reported-by: syzbot+6cd18e123583550cf469@syzkaller.appspotmail.com Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Debugged-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/Y90ar35uKQoUrLEK@hirez.programming.kicks-ass.net
2023-02-13PCI/MSI: Provide missing stubs for CONFIG_PCI_MSI=nReinette Chatre
pci_msix_alloc_irq_at() and pci_msix_free_irq() are not declared when CONFIG_PCI_MSI is disabled. Users of these two calls do not yet exist but when users do appear (shown below is an attempt to use the new API in vfio-pci) the following errors will be encountered when compiling with CONFIG_PCI_MSI disabled: drivers/vfio/pci/vfio_pci_intrs.c:461:4: error: implicit declaration of\ function 'pci_msix_free_irq' is invalid in C99\ [-Werror,-Wimplicit-function-declaration] pci_msix_free_irq(pdev, msix_map); ^ drivers/vfio/pci/vfio_pci_intrs.c:511:15: error: implicit declaration of\ function 'pci_msix_alloc_irq_at' is invalid in C99\ [-Werror,-Wimplicit-function-declaration] msix_map = pci_msix_alloc_irq_at(pdev, vector, NULL); Provide definitions for pci_msix_alloc_irq_at() and pci_msix_free_irq() in preparation for users that need to compile when CONFIG_PCI_MSI is disabled. Reported-by: kernel test robot <lkp@intel.com> Fixes: 34026364df8e ("PCI/MSI: Provide post-enable dynamic allocation interfaces for MSI-X") Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/158e40e1cfcfc58ae30ecb2bbfaf86e5bba7a1ef.1675978686.git.reinette.chatre@intel.com
2023-02-13bnxt_en: Fix mqprio and XDP ring checking logicMichael Chan
In bnxt_reserve_rings(), there is logic to check that the number of TX rings reserved is enough to cover all the mqprio TCs, but it fails to account for the TX XDP rings. So the check will always fail if there are mqprio TCs and TX XDP rings. As a result, the driver always fails to initialize after the XDP program is attached and the device will be brought down. A subsequent ifconfig up will also fail because the number of TX rings is set to an inconsistent number. Fix the check to properly account for TX XDP rings. If the check fails, set the number of TX rings back to a consistent number after calling netdev_reset_tc(). Fixes: 674f50a5b026 ("bnxt_en: Implement new method to reserve rings.") Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-13net: Fix unwanted sign extension in netdev_stats_to_stats64()Felix Riemann
When converting net_device_stats to rtnl_link_stats64 sign extension is triggered on ILP32 machines as 6c1c509778 changed the previous "ulong -> u64" conversion to "long -> u64" by accessing the net_device_stats fields through a (signed) atomic_long_t. This causes for example the received bytes counter to jump to 16EiB after having received 2^31 bytes. Casting the atomic value to "unsigned long" beforehand converting it into u64 avoids this. Fixes: 6c1c5097781f ("net: add atomic_long_t to net_device_stats fields") Signed-off-by: Felix Riemann <felix.riemann@sma.de> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-13net/usb: kalmia: Don't pass act_len in usb_bulk_msg error pathMiko Larsson
syzbot reported that act_len in kalmia_send_init_packet() is uninitialized when passing it to the first usb_bulk_msg error path. Jiri Pirko noted that it's pointless to pass it in the error path, and that the value that would be printed in the second error path would be the value of act_len from the first call to usb_bulk_msg.[1] With this in mind, let's just not pass act_len to the usb_bulk_msg error paths. 1: https://lore.kernel.org/lkml/Y9pY61y1nwTuzMOa@nanopsycho/ Fixes: d40261236e8e ("net/usb: Add Samsung Kalmia driver for Samsung GT-B3730") Reported-and-tested-by: syzbot+cd80c5ef5121bfe85b55@syzkaller.appspotmail.com Signed-off-by: Miko Larsson <mikoxyzzz@gmail.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-13net: openvswitch: fix possible memory leak in ovs_meter_cmd_set()Hangyu Hua
old_meter needs to be free after it is detached regardless of whether the new meter is successfully attached. Fixes: c7c4c44c9a95 ("net: openvswitch: expand the meters supported number") Signed-off-by: Hangyu Hua <hbh25y@gmail.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-13af_key: Fix heap information leakHyunwoo Kim
Since x->encap of pfkey_msg2xfrm_state() is not initialized to 0, kernel heap data can be leaked. Fix with kzalloc() to prevent this. Signed-off-by: Hyunwoo Kim <v4bel@theori.io> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-13ALSA: hda/realtek - fixed wrong gpio assignedKailang Yang
GPIO2 PIN use for output. Mask Dir and Data need to assign for 0x4. Not 0x3. This fixed was for Lenovo Desktop(0x17aa1056). GPIO2 use for AMP enable. Signed-off-by: Kailang Yang <kailang@realtek.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/8d02bb9ac8134f878cd08607fdf088fd@realtek.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-02-13nvme-pci: add bogus ID quirk for ADATA SX6000PNPDaniel Wagner
Yet another device which needs a quirk: nvme nvme1: globally duplicate IDs for nsid 1 nvme nvme1: VID:DID 10ec:5763 model:ADATA SX6000PNP firmware:V9002s94 Link: http://bugzilla.opensuse.org/show_bug.cgi?id=1207827 Reported-by: Gustavo Freitas <freitasmgustavo@gmail.com> Signed-off-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2023-02-12tracing: Make trace_define_field_ext() staticSteven Rostedt (Google)
trace_define_field_ext() is not used outside of trace_events.c, it should be static. Link: https://lore.kernel.org/oe-kbuild-all/202302130750.679RaRog-lkp@intel.com/ Fixes: b6c7abd1c28a ("tracing: Fix TASK_COMM_LEN in trace event format file") Reported-by: Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-02-12Linux 6.2-rc8v6.2-rc8Linus Torvalds
2023-02-12MAINTAINERS: Add myself as maintainer for arch/sh (SUPERH)John Paul Adrian Glaubitz
Both Rich Felker and Yoshinori Sato haven't done any work on arch/sh for a while. As I have been maintaining Debian's sh4 port since 2014, I am interested to keep the architecture alive. Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Acked-by: Yoshinori Sato <ysato@users.sourceforge.jp> Acked-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-02-12Merge tag 'trace-v6.2-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fix from Steven Rostedt: "Fix showing of TASK_COMM_LEN instead of its value The TASK_COMM_LEN was converted from a macro into an enum so that BTF would have access to it. But this unfortunately caused TASK_COMM_LEN to display in the format fields of trace events, as they are created by the TRACE_EVENT() macro and such, macros convert to their values, where as enums do not. To handle this, instead of using the field itself to be display, save the value of the array size as another field in the trace_event_fields structure, and use that instead. Not only does this fix the issue, but also converts the other trace events that have this same problem (but were not breaking tooling). With this change, the original work around b3bc8547d3be6 ("tracing: Have TRACE_DEFINE_ENUM affect trace event types as well") could be reverted (but that should be done in the merge window)" * tag 'trace-v6.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Fix TASK_COMM_LEN in trace event format file
2023-02-12Merge tag 'for-6.2-rc7-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - one more fix for a tree-log 'write time corruption' report, update the last dir index directly and don't keep in the log context - do VFS-level inode lock around FIEMAP to prevent a deadlock with concurrent fsync, the extent-level lock is not sufficient - don't cache a single-device filesystem device to avoid cases when a loop device is reformatted and the entry gets stale * tag 'for-6.2-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: free device in btrfs_close_devices for a single device filesystem btrfs: lock the inode in shared mode before starting fiemap btrfs: simplify update of last_dir_index_offset when logging a directory
2023-02-12Merge tag 'usb-6.2-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are 2 small USB driver fixes that resolve some reported regressions and one new device quirk. Specifically these are: - new quirk for Alcor Link AK9563 smartcard reader - revert of u_ether gadget change in 6.2-rc1 that caused problems - typec pin probe fix All of these have been in linux-next with no reported problems" * tag 'usb-6.2-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: usb: core: add quirk for Alcor Link AK9563 smartcard reader usb: typec: altmodes/displayport: Fix probe pin assign check Revert "usb: gadget: u_ether: Do not make UDC parent of the net device"
2023-02-12Merge tag 'efi-fixes-for-v6.2-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI fix from Ard Biesheuvel: "A fix from Darren to widen the SMBIOS match for detecting Ampere Altra machines with problematic firmware. In the mean time, we are working on a more precise check, but this is still work in progress" * tag 'efi-fixes-for-v6.2-4' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: arm64: efi: Force the use of SetVirtualAddressMap() on eMAG and Altra Max machines
2023-02-12Merge tag 'powerpc-6.2-5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix interrupt exit race with security mitigation switching. - Don't select ARCH_WANTS_NO_INSTR until warnings are fixed. - Build fix for CONFIG_NUMA=n. Thanks to Nicholas Piggin, Randy Dunlap, and Sachin Sant. * tag 'powerpc-6.2-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64s/interrupt: Fix interrupt exit race with security mitigation switch powerpc/kexec_file: fix implicit decl error powerpc: Don't select ARCH_WANTS_NO_INSTR
2023-02-12Fix page corruption caused by racy check in __free_pagesDavid Chen
When we upgraded our kernel, we started seeing some page corruption like the following consistently: BUG: Bad page state in process ganesha.nfsd pfn:1304ca page:0000000022261c55 refcount:0 mapcount:-128 mapping:0000000000000000 index:0x0 pfn:0x1304ca flags: 0x17ffffc0000000() raw: 0017ffffc0000000 ffff8a513ffd4c98 ffffeee24b35ec08 0000000000000000 raw: 0000000000000000 0000000000000001 00000000ffffff7f 0000000000000000 page dumped because: nonzero mapcount CPU: 0 PID: 15567 Comm: ganesha.nfsd Kdump: loaded Tainted: P B O 5.10.158-1.nutanix.20221209.el7.x86_64 #1 Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/05/2016 Call Trace: dump_stack+0x74/0x96 bad_page.cold+0x63/0x94 check_new_page_bad+0x6d/0x80 rmqueue+0x46e/0x970 get_page_from_freelist+0xcb/0x3f0 ? _cond_resched+0x19/0x40 __alloc_pages_nodemask+0x164/0x300 alloc_pages_current+0x87/0xf0 skb_page_frag_refill+0x84/0x110 ... Sometimes, it would also show up as corruption in the free list pointer and cause crashes. After bisecting the issue, we found the issue started from commit e320d3012d25 ("mm/page_alloc.c: fix freeing non-compound pages"): if (put_page_testzero(page)) free_the_page(page, order); else if (!PageHead(page)) while (order-- > 0) free_the_page(page + (1 << order), order); So the problem is the check PageHead is racy because at this point we already dropped our reference to the page. So even if we came in with compound page, the page can already be freed and PageHead can return false and we will end up freeing all the tail pages causing double free. Fixes: e320d3012d25 ("mm/page_alloc.c: fix freeing non-compound pages") Link: https://lore.kernel.org/lkml/BYAPR02MB448855960A9656EEA81141FC94D99@BYAPR02MB4488.namprd02.prod.outlook.com/ Cc: Andrew Morton <akpm@linux-foundation.org> Cc: stable@vger.kernel.org Signed-off-by: Chunwei Chen <david.chen@nutanix.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-02-12tracing: Fix TASK_COMM_LEN in trace event format fileYafang Shao
After commit 3087c61ed2c4 ("tools/testing/selftests/bpf: replace open-coded 16 with TASK_COMM_LEN"), the content of the format file under /sys/kernel/tracing/events/task/task_newtask was changed from field:char comm[16]; offset:12; size:16; signed:0; to field:char comm[TASK_COMM_LEN]; offset:12; size:16; signed:0; John reported that this change breaks older versions of perfetto. Then Mathieu pointed out that this behavioral change was caused by the use of __stringify(_len), which happens to work on macros, but not on enum labels. And he also gave the suggestion on how to fix it: :One possible solution to make this more robust would be to extend :struct trace_event_fields with one more field that indicates the length :of an array as an actual integer, without storing it in its stringified :form in the type, and do the formatting in f_show where it belongs. The result as follows after this change, $ cat /sys/kernel/tracing/events/task/task_newtask/format field:char comm[16]; offset:12; size:16; signed:0; Link: https://lore.kernel.org/lkml/Y+QaZtz55LIirsUO@google.com/ Link: https://lore.kernel.org/linux-trace-kernel/20230210155921.4610-1-laoar.shao@gmail.com/ Link: https://lore.kernel.org/linux-trace-kernel/20230212151303.12353-1-laoar.shao@gmail.com Cc: stable@vger.kernel.org Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Kajetan Puchalski <kajetan.puchalski@arm.com> CC: Qais Yousef <qyousef@layalina.io> Fixes: 3087c61ed2c4 ("tools/testing/selftests/bpf: replace open-coded 16 with TASK_COMM_LEN") Reported-by: John Stultz <jstultz@google.com> Debugged-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Suggested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-02-11Merge tag 'spi-fix-v6.2-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "A couple of hopefully final fixes for spi: one driver specific fix for an issue with very large transfers and a fix for an issue with the locking fixes in spidev merged earlier this release cycle which was missed" * tag 'spi-fix-v6.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: spidev: fix a recursive locking error spi: dw: Fix wrong FIFO level setting for long xfers
2023-02-11Merge tag 'x86-urgent-2023-02-11' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Fix a kprobes bug, plus add a new Intel model number to the upstream <asm/intel-family.h> header for drivers to use" * tag 'x86-urgent-2023-02-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu: Add Lunar Lake M x86/kprobes: Fix 1 byte conditional jump target
2023-02-11Merge tag 'locking-urgent-2023-02-11' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fix from Ingo Molnar: "Fix an rtmutex missed-wakeup bug" * tag 'locking-urgent-2023-02-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: rtmutex: Ensure that the top waiter is always woken up
2023-02-11Merge tag 'cxl-fixes-6.2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull cxl fixes from Dan Williams: "Two fixups for CXL (Compute Express Link) in presence of passthrough decoders. This primarily helps developers using the QEMU CXL emulation, but with the impending arrival of CXL switches these types of topologies will be of interest to end users. - Fix a crash when shutting down regions in the presence of passthrough decoders - Fix region creation to understand passthrough decoders instead of the narrower definition of passthrough ports" * tag 'cxl-fixes-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: cxl/region: Fix passthrough-decoder detection cxl/region: Fix null pointer dereference for resetting decoder
2023-02-11Merge tag 'libnvdimm-fixes-6.2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm fixes from Dan Williams: "A fix for an issue that could causes users to inadvertantly reserve too much capacity when debugging the KMSAN and persistent memory namespace, a lockdep fix, and a kernel-doc build warning: - Resolve the conflict between KMSAN and NVDIMM with respect to reserving pmem namespace / volume capacity for larger sizeof(struct page) - Fix a lockdep warning in the the NFIT code - Fix a kernel-doc build warning" * tag 'libnvdimm-fixes-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: nvdimm: Support sizeof(struct page) > MAX_STRUCT_PAGE_SIZE ACPI: NFIT: fix a potential deadlock during NFIT teardown dax: super.c: fix kernel-doc bad line warning
2023-02-11Merge tag 'fixes-2023-02-11' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock Pull memblock revert from Mike Rapoport: "Revert 'mm: Always release pages to the buddy allocator in memblock_free_late()' The pages being freed by memblock_free_late() have already been initialized, but if they are in the deferred init range, __free_one_page() might access nearby uninitialized pages when trying to coalesce buddies, which will cause a crash. A proper fix will be more involved so revert this change for the time being" * tag 'fixes-2023-02-11' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock: Revert "mm: Always release pages to the buddy allocator in memblock_free_late()."
2023-02-11nfsd: don't destroy global nfs4_file table in per-net shutdownJeff Layton
The nfs4_file table is global, so shutting it down when a containerized nfsd is shut down is wrong and can lead to double-frees. Tear down the nfs4_file_rhltable in nfs4_state_shutdown instead of nfs4_state_shutdown_net. Fixes: d47b295e8d76 ("NFSD: Use rhashtable for managing nfs4_file objects") Link: https://bugzilla.redhat.com/show_bug.cgi?id=2169017 Reported-by: JianHong Yin <jiyin@redhat.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-11ALSA: hda: Fix codec device field initializanCezary Rojewski
Commit f2bd1c5ae2cb ("ALSA: hda: Fix page fault in snd_hda_codec_shutdown()") relocated initialization of several codec device fields. Due to differences between codec_exec_verb() and snd_hdac_bus_exec_bus() in how they handle VERB execution - the latter does not touch PM - assigning ->exec_verb to codec_exec_verb() causes PM to be engaged before it is configured for the device. Configuration of PM for the ASoC HDAudio sound card is done with snd_hda_set_power_save() during skl_hda_audio_probe() whereas the assignment happens early, in snd_hda_codec_device_init(). Revert to previous behavior to avoid problems caused by too early PM manipulation. Suggested-by: Jason Montleon <jmontleo@redhat.com> Link: https://lore.kernel.org/regressions/CALFERdzKUodLsm6=Ub3g2+PxpNpPtPq3bGBLbff=eZr9_S=YVA@mail.gmail.com Fixes: f2bd1c5ae2cb ("ALSA: hda: Fix page fault in snd_hda_codec_shutdown()") Signed-off-by: Cezary Rojewski <cezary.rojewski@intel.com> Link: https://lore.kernel.org/r/20230210165541.3543604-1-cezary.rojewski@intel.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-02-10Merge branch 'sk-sk_forward_alloc-fixes'Jakub Kicinski
Kuniyuki Iwashima says: ==================== sk->sk_forward_alloc fixes. The first patch fixes a negative sk_forward_alloc by adding sk_rmem_schedule() before skb_set_owner_r(), and second patch removes an unnecessary WARN_ON_ONCE(). v2: https://lore.kernel.org/netdev/20230209013329.87879-1-kuniyu@amazon.com/ v1: https://lore.kernel.org/netdev/20230207183718.54520-1-kuniyu@amazon.com/ ==================== Link: https://lore.kernel.org/r/20230210002202.81442-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-10net: Remove WARN_ON_ONCE(sk->sk_forward_alloc) from sk_stream_kill_queues().Kuniyuki Iwashima
Christoph Paasch reported that commit b5fc29233d28 ("inet6: Remove inet6_destroy_sock() in sk->sk_prot->destroy().") started triggering WARN_ON_ONCE(sk->sk_forward_alloc) in sk_stream_kill_queues(). [0 - 2] Also, we can reproduce it by a program in [3]. In the commit, we delay freeing ipv6_pinfo.pktoptions from sk->destroy() to sk->sk_destruct(), so sk->sk_forward_alloc is no longer zero in inet_csk_destroy_sock(). The same check has been in inet_sock_destruct() from at least v2.6, we can just remove the WARN_ON_ONCE(). However, among the users of sk_stream_kill_queues(), only CAIF is not calling inet_sock_destruct(). Thus, we add the same WARN_ON_ONCE() to caif_sock_destructor(). [0]: https://lore.kernel.org/netdev/39725AB4-88F1-41B3-B07F-949C5CAEFF4F@icloud.com/ [1]: https://github.com/multipath-tcp/mptcp_net-next/issues/341 [2]: WARNING: CPU: 0 PID: 3232 at net/core/stream.c:212 sk_stream_kill_queues+0x2f9/0x3e0 Modules linked in: CPU: 0 PID: 3232 Comm: syz-executor.0 Not tainted 6.2.0-rc5ab24eb4698afbe147b424149c529e2a43ec24eb5 #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:sk_stream_kill_queues+0x2f9/0x3e0 Code: 03 0f b6 04 02 84 c0 74 08 3c 03 0f 8e ec 00 00 00 8b ab 08 01 00 00 e9 60 ff ff ff e8 d0 5f b6 fe 0f 0b eb 97 e8 c7 5f b6 fe <0f> 0b eb a0 e8 be 5f b6 fe 0f 0b e9 6a fe ff ff e8 02 07 e3 fe e9 RSP: 0018:ffff88810570fc68 EFLAGS: 00010293 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: ffff888101f38f40 RSI: ffffffff8285e529 RDI: 0000000000000005 RBP: 0000000000000ce0 R08: 0000000000000005 R09: 0000000000000000 R10: 0000000000000ce0 R11: 0000000000000001 R12: ffff8881009e9488 R13: ffffffff84af2cc0 R14: 0000000000000000 R15: ffff8881009e9458 FS: 00007f7fdfbd5800(0000) GS:ffff88811b600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b32923000 CR3: 00000001062fc006 CR4: 0000000000170ef0 Call Trace: <TASK> inet_csk_destroy_sock+0x1a1/0x320 __tcp_close+0xab6/0xe90 tcp_close+0x30/0xc0 inet_release+0xe9/0x1f0 inet6_release+0x4c/0x70 __sock_release+0xd2/0x280 sock_close+0x15/0x20 __fput+0x252/0xa20 task_work_run+0x169/0x250 exit_to_user_mode_prepare+0x113/0x120 syscall_exit_to_user_mode+0x1d/0x40 do_syscall_64+0x48/0x90 entry_SYSCALL_64_after_hwframe+0x72/0xdc RIP: 0033:0x7f7fdf7ae28d Code: c1 20 00 00 75 10 b8 03 00 00 00 0f 05 48 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 ee fb ff ff 48 89 04 24 b8 03 00 00 00 0f 05 <48> 8b 3c 24 48 89 c2 e8 37 fc ff ff 48 89 d0 48 83 c4 08 48 3d 01 RSP: 002b:00000000007dfbb0 EFLAGS: 00000293 ORIG_RAX: 0000000000000003 RAX: 0000000000000000 RBX: 0000000000000004 RCX: 00007f7fdf7ae28d RDX: 0000000000000000 RSI: ffffffffffffffff RDI: 0000000000000003 RBP: 0000000000000000 R08: 000000007f338e0f R09: 0000000000000e0f R10: 000000007f338e13 R11: 0000000000000293 R12: 00007f7fdefff000 R13: 00007f7fdefffcd8 R14: 00007f7fdefffce0 R15: 00007f7fdefffcd8 </TASK> [3]: https://lore.kernel.org/netdev/20230208004245.83497-1-kuniyu@amazon.com/ Fixes: b5fc29233d28 ("inet6: Remove inet6_destroy_sock() in sk->sk_prot->destroy().") Reported-by: syzbot <syzkaller@googlegroups.com> Reported-by: Christoph Paasch <christophpaasch@icloud.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-10dccp/tcp: Avoid negative sk_forward_alloc by ipv6_pinfo.pktoptions.Kuniyuki Iwashima
Eric Dumazet pointed out [0] that when we call skb_set_owner_r() for ipv6_pinfo.pktoptions, sk_rmem_schedule() has not been called, resulting in a negative sk_forward_alloc. We add a new helper which clones a skb and sets its owner only when sk_rmem_schedule() succeeds. Note that we move skb_set_owner_r() forward in (dccp|tcp)_v6_do_rcv() because tcp_send_synack() can make sk_forward_alloc negative before ipv6_opt_accepted() in the crossed SYN-ACK or self-connect() cases. [0]: https://lore.kernel.org/netdev/CANn89iK9oc20Jdi_41jb9URdF210r7d1Y-+uypbMSbOfY6jqrg@mail.gmail.com/ Fixes: 323fbd0edf3f ("net: dccp: Add handling of IPV6_PKTOPTIONS to dccp_v6_do_rcv()") Fixes: 3df80d9320bc ("[DCCP]: Introduce DCCPv6") Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>