summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-06-03igc: add correct exception tracing for XDPMagnus Karlsson
Add missing exception tracing to XDP when a number of different errors can occur. The support was only partial. Several errors where not logged which would confuse the user quite a lot not knowing where and why the packets disappeared. Fixes: 73f1071c1d29 ("igc: Add support for XDP_TX action") Fixes: 4ff320361092 ("igc: Add support for XDP_REDIRECT action") Reported-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Tested-by: Dvora Fuxbrumer <dvorax.fuxbrumer@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-06-03ixgbevf: add correct exception tracing for XDPMagnus Karlsson
Add missing exception tracing to XDP when a number of different errors can occur. The support was only partial. Several errors where not logged which would confuse the user quite a lot not knowing where and why the packets disappeared. Fixes: 21092e9ce8b1 ("ixgbevf: Add support for XDP_TX action") Reported-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Tested-by: Vishakha Jambekar <vishakha.jambekar@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-06-03igb: add correct exception tracing for XDPMagnus Karlsson
Add missing exception tracing to XDP when a number of different errors can occur. The support was only partial. Several errors where not logged which would confuse the user quite a lot not knowing where and why the packets disappeared. Fixes: 9cbc948b5a20 ("igb: add XDP support") Reported-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Tested-by: Vishakha Jambekar <vishakha.jambekar@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-06-03ixgbe: add correct exception tracing for XDPMagnus Karlsson
Add missing exception tracing to XDP when a number of different errors can occur. The support was only partial. Several errors where not logged which would confuse the user quite a lot not knowing where and why the packets disappeared. Fixes: 33fdc82f0883 ("ixgbe: add support for XDP_TX action") Fixes: d0bcacd0a130 ("ixgbe: add AF_XDP zero-copy Rx support") Reported-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Tested-by: Vishakha Jambekar <vishakha.jambekar@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-06-03ice: add correct exception tracing for XDPMagnus Karlsson
Add missing exception tracing to XDP when a number of different errors can occur. The support was only partial. Several errors where not logged which would confuse the user quite a lot not knowing where and why the packets disappeared. Fixes: efc2214b6047 ("ice: Add support for XDP") Fixes: 2d4238f55697 ("ice: Add support for AF_XDP") Reported-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-06-03i40e: add correct exception tracing for XDPMagnus Karlsson
Add missing exception tracing to XDP when a number of different errors can occur. The support was only partial. Several errors where not logged which would confuse the user quite a lot not knowing where and why the packets disappeared. Fixes: 74608d17fe29 ("i40e: add support for XDP_TX action") Fixes: 0a714186d3c0 ("i40e: add AF_XDP zero-copy Rx support") Reported-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-06-03igb: Fix XDP with PTP enabledKurt Kanzenbach
When using native XDP with the igb driver, the XDP frame data doesn't point to the beginning of the packet. It's off by 16 bytes. Everything works as expected with XDP skb mode. Actually these 16 bytes are used to store the packet timestamps. Therefore, pull the timestamp before executing any XDP operations and adjust all other code accordingly. The igc driver does it like that as well. Tested with Intel i210 card and AF_XDP sockets. Fixes: 9cbc948b5a20 ("igb: add XDP support") Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-06-03x86/alternative: Optimize single-byte NOPs at an arbitrary positionBorislav Petkov
Up until now the assumption was that an alternative patching site would have some instructions at the beginning and trailing single-byte NOPs (0x90) padding. Therefore, the patching machinery would go and optimize those single-byte NOPs into longer ones. However, this assumption is broken on 32-bit when code like hv_do_hypercall() in hyperv_init() would use the ratpoline speculation killer CALL_NOSPEC. The 32-bit version of that macro would align certain insns to 16 bytes, leading to the compiler issuing a one or more single-byte NOPs, depending on the holes it needs to fill for alignment. That would lead to the warning in optimize_nops() to fire: ------------[ cut here ]------------ Not a NOP at 0xc27fb598 WARNING: CPU: 0 PID: 0 at arch/x86/kernel/alternative.c:211 optimize_nops.isra.13 due to that function verifying whether all of the following bytes really are single-byte NOPs. Therefore, carve out the NOP padding into a separate function and call it for each NOP range beginning with a single-byte NOP. Fixes: 23c1ad538f4f ("x86/alternatives: Optimize optimize_nops()") Reported-by: Richard Narron <richard@aaazen.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://bugzilla.kernel.org/show_bug.cgi?id=213301 Link: https://lkml.kernel.org/r/20210601212125.17145-1-bp@alien8.de
2021-06-03x86/cpufeatures: Force disable X86_FEATURE_ENQCMD and remove update_pasid()Thomas Gleixner
While digesting the XSAVE-related horrors which got introduced with the supervisor/user split, the recent addition of ENQCMD-related functionality got on the radar and turned out to be similarly broken. update_pasid(), which is only required when X86_FEATURE_ENQCMD is available, is invoked from two places: 1) From switch_to() for the incoming task 2) Via a SMP function call from the IOMMU/SMV code #1 is half-ways correct as it hacks around the brokenness of get_xsave_addr() by enforcing the state to be 'present', but all the conditionals in that code are completely pointless for that. Also the invocation is just useless overhead because at that point it's guaranteed that TIF_NEED_FPU_LOAD is set on the incoming task and all of this can be handled at return to user space. #2 is broken beyond repair. The comment in the code claims that it is safe to invoke this in an IPI, but that's just wishful thinking. FPU state of a running task is protected by fregs_lock() which is nothing else than a local_bh_disable(). As BH-disabled regions run usually with interrupts enabled the IPI can hit a code section which modifies FPU state and there is absolutely no guarantee that any of the assumptions which are made for the IPI case is true. Also the IPI is sent to all CPUs in mm_cpumask(mm), but the IPI is invoked with a NULL pointer argument, so it can hit a completely unrelated task and unconditionally force an update for nothing. Worse, it can hit a kernel thread which operates on a user space address space and set a random PASID for it. The offending commit does not cleanly revert, but it's sufficient to force disable X86_FEATURE_ENQCMD and to remove the broken update_pasid() code to make this dysfunctional all over the place. Anything more complex would require more surgery and none of the related functions outside of the x86 core code are blatantly wrong, so removing those would be overkill. As nothing enables the PASID bit in the IA32_XSS MSR yet, which is required to make this actually work, this cannot result in a regression except for related out of tree train-wrecks, but they are broken already today. Fixes: 20f0afd1fb3d ("x86/mmu: Allocate/free a PASID") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/87mtsd6gr9.ffs@nanos.tec.linutronix.de
2021-06-03dmaengine: idxd: Use cpu_feature_enabled()Borislav Petkov
When testing x86 feature bits, use cpu_feature_enabled() so that build-disabled features can remain off, regardless of what CPUID says. Fixes: 8e50d392652f ("dmaengine: idxd: Add shared workqueue support") Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-By: Vinod Koul <vkoul@kernel.org> Cc: <stable@vger.kernel.org>
2021-06-03NFSv4: Fix second deadlock in nfs4_evict_inode()Trond Myklebust
If the inode is being evicted but has to return a layout first, then that too can cause a deadlock in the corner case where the server reboots. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-06-03NFSv4: Fix deadlock between nfs4_evict_inode() and nfs4_opendata_get_inode()Trond Myklebust
If the inode is being evicted, but has to return a delegation first, then it can cause a deadlock in the corner case where the server reboots before the delegreturn completes, but while the call to iget5_locked() in nfs4_opendata_get_inode() is waiting for the inode free to complete. Since the open call still holds a session slot, the reboot recovery cannot proceed. In order to break the logjam, we can turn the delegation return into a privileged operation for the case where we're evicting the inode. We know that in that case, there can be no other state recovery operation that conflicts. Reported-by: zhangxiaoxu (A) <zhangxiaoxu5@huawei.com> Fixes: 5fcdfacc01f3 ("NFSv4: Return delegations synchronously in evict_inode") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-06-03NFS: FMODE_READ and friends are C macros, not enum typesChuck Lever
Address a sparse warning: CHECK fs/nfs/nfstrace.c fs/nfs/nfstrace.c: note: in included file (through /home/cel/src/linux/rpc-over-tls/include/trace/trace_events.h, /home/cel/src/linux/rpc-over-tls/include/trace/define_trace.h, ...): fs/nfs/./nfstrace.h:424:1: warning: incorrect type in initializer (different base types) fs/nfs/./nfstrace.h:424:1: expected unsigned long eval_value fs/nfs/./nfstrace.h:424:1: got restricted fmode_t [usertype] fs/nfs/./nfstrace.h:425:1: warning: incorrect type in initializer (different base types) fs/nfs/./nfstrace.h:425:1: expected unsigned long eval_value fs/nfs/./nfstrace.h:425:1: got restricted fmode_t [usertype] fs/nfs/./nfstrace.h:426:1: warning: incorrect type in initializer (different base types) fs/nfs/./nfstrace.h:426:1: expected unsigned long eval_value fs/nfs/./nfstrace.h:426:1: got restricted fmode_t [usertype] Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-06-03NFS: Fix a potential NULL dereference in nfs_get_client()Dan Carpenter
None of the callers are expecting NULL returns from nfs_get_client() so this code will lead to an Oops. It's better to return an error pointer. I expect that this is dead code so hopefully no one is affected. Fixes: 31434f496abb ("nfs: check hostname in nfs_get_client") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-06-03NFS: Fix use-after-free in nfs4_init_client()Anna Schumaker
KASAN reports a use-after-free when attempting to mount two different exports through two different NICs that belong to the same server. Olga was able to hit this with kernels starting somewhere between 5.7 and 5.10, but I traced the patch that introduced the clear_bit() call to 4.13. So something must have changed in the refcounting of the clp pointer to make this call to nfs_put_client() the very last one. Fixes: 8dcbec6d20 ("NFSv41: Handle EXCHID4_FLAG_CONFIRMED_R during NFSv4.1 migration") Cc: stable@vger.kernel.org # 4.13+ Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-06-03NFS: Ensure the NFS_CAP_SECURITY_LABEL capability is set when appropriateScott Mayhew
Commit ce62b114bbad ("NFS: Split attribute support out from the server capabilities") removed the logic from _nfs4_server_capabilities() that sets the NFS_CAP_SECURITY_LABEL capability based on the presence of FATTR4_WORD2_SECURITY_LABEL in the attr_bitmask of the server's response. Now NFS_CAP_SECURITY_LABEL is never set, which breaks labelled NFS. This was replaced with logic that clears the NFS_ATTR_FATTR_V4_SECURITY_LABEL bit in the newly added fattr_valid field based on the absence of FATTR4_WORD2_SECURITY_LABEL in the attr_bitmask of the server's response. This essentially has no effect since there's nothing looks for that bit in fattr_supported. So revert that part of the commit, but adding the logic that sets NFS_CAP_SECURITY_LABEL near where the other capabilities are set in _nfs4_server_capabilities(). Fixes: ce62b114bbad ("NFS: Split attribute support out from the server capabilities") Signed-off-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2021-06-03sched/fair: Fix util_est UTIL_AVG_UNCHANGED handlingDietmar Eggemann
The util_est internal UTIL_AVG_UNCHANGED flag which is used to prevent unnecessary util_est updates uses the LSB of util_est.enqueued. It is exposed via _task_util_est() (and task_util_est()). Commit 92a801e5d5b7 ("sched/fair: Mask UTIL_AVG_UNCHANGED usages") mentions that the LSB is lost for util_est resolution but find_energy_efficient_cpu() checks if task_util_est() returns 0 to return prev_cpu early. _task_util_est() returns the max value of util_est.ewma and util_est.enqueued or'ed w/ UTIL_AVG_UNCHANGED. So task_util_est() returning the max of task_util() and _task_util_est() will never return 0 under the default SCHED_FEAT(UTIL_EST, true). To fix this use the MSB of util_est.enqueued instead and keep the flag util_est internal, i.e. don't export it via _task_util_est(). The maximal possible util_avg value for a task is 1024 so the MSB of 'unsigned int util_est.enqueued' isn't used to store a util value. As a caveat the code behind the util_est_se trace point has to filter UTIL_AVG_UNCHANGED to see the real util_est.enqueued value which should be easy to do. This also fixes an issue report by Xuewen Yan that util_est_update() only used UTIL_AVG_UNCHANGED for the subtrahend of the equation: last_enqueued_diff = ue.enqueued - (task_util() | UTIL_AVG_UNCHANGED) Fixes: b89997aa88f0b sched/pelt: Fix task util_est update filtering Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Xuewen Yan <xuewen.yan@unisoc.com> Reviewed-by: Vincent Donnefort <vincent.donnefort@arm.com> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Link: https://lore.kernel.org/r/20210602145808.1562603-1-dietmar.eggemann@arm.com
2021-06-03Merge tag 'nvme-5.13-2021-06-03' of git://git.infradead.org/nvme into block-5.13Jens Axboe
Pull NVMe fixes from Christoph: "nvme fixes for Linux 5.13: - fix corruption in RDMA in-capsule SGLs (Sagi Grimberg) - nvme-loop reset fixes (Hannes Reinecke) - nvmet fix for freeing unallocated p2pmem (Max Gurtovoy)" * tag 'nvme-5.13-2021-06-03' of git://git.infradead.org/nvme: nvmet: fix freeing unallocated p2pmem nvme-loop: do not warn for deleted controllers during reset nvme-loop: check for NVME_LOOP_Q_LIVE in nvme_loop_destroy_admin_queue() nvme-loop: clear NVME_LOOP_Q_LIVE when nvme_loop_configure_admin_queue() fails nvme-loop: reset queue count to 1 in nvme_loop_destroy_io_queues() nvme-rdma: fix in-casule data send for chained sgls
2021-06-03MAINTAINERS: add btrfs IRC linkDavid Sterba
We haven't had an IRC link before but now it's a good time to announce where to reach the community. Signed-off-by: David Sterba <dsterba@suse.com>
2021-06-03spi: stm32-qspi: Always wait BUSY bit to be cleared in stm32_qspi_wait_cmd()Patrice Chotard
In U-boot side, an issue has been encountered when QSPI source clock is running at low frequency (24 MHz for example), waiting for TCF bit to be set didn't ensure that all data has been send out the FIFO, we should also wait that BUSY bit is cleared. To prevent similar issue in kernel driver, we implement similar behavior by always waiting BUSY bit to be cleared. Signed-off-by: Patrice Chotard <patrice.chotard@foss.st.com> Link: https://lore.kernel.org/r/20210603073421.8441-1-patrice.chotard@foss.st.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-03regulator: hi6421v600: Fix .vsel_mask settingAxel Lin
Take ldo3_voltages as example, the ARRAY_SIZE(ldo3_voltages) is 16. i.e. the valid selector is 0 ~ 0xF. But in current code the vsel_mask is "(1 << 15) - 1", i.e. 0x7FFF. Fix it. Signed-off-by: Axel Lin <axel.lin@ingics.com> Link: https://lore.kernel.org/r/20210529013236.373847-1-axel.lin@ingics.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-03ASoC: tas2562: Fix TDM_CFG0_SAMPRATE valuesRichard Weinberger
TAS2562_TDM_CFG0_SAMPRATE_MASK starts at bit 1, not 0. So all values need to be left shifted by 1. Signed-off-by: Richard Weinberger <richard@nod.at> Link: https://lore.kernel.org/r/20210530203446.19022-1-richard@nod.at Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-03ASoC: meson: gx-card: fix sound-dai dt schemaJerome Brunet
There is a fair amount of warnings when running 'make dtbs_check' with amlogic,gx-sound-card.yaml. Ex: arch/arm64/boot/dts/amlogic/meson-gxm-q200.dt.yaml: sound: dai-link-0:sound-dai:0:1: missing phandle tag in 0 arch/arm64/boot/dts/amlogic/meson-gxm-q200.dt.yaml: sound: dai-link-0:sound-dai:0:2: missing phandle tag in 0 arch/arm64/boot/dts/amlogic/meson-gxm-q200.dt.yaml: sound: dai-link-0:sound-dai:0: [66, 0, 0] is too long The reason is that the sound-dai phandle provided has cells, and in such case the schema should use 'phandle-array' instead of 'phandle'. Fixes: fd00366b8e41 ("ASoC: meson: gx: add sound card dt-binding documentation") Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Link: https://lore.kernel.org/r/20210524093448.357140-1-jbrunet@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-03ASoC: AMD Renoir: Remove fix for DMI entry on Lenovo 2020 platformsMark Pearson
Unfortunately the previous patch to fix issues using the AMD ACP bridge has the side effect of breaking the dmic in other cases and needs to be reverted. Removing the changes while we revisit the fix and find something better. Apologies for the churn. Suggested-by: Gabriel Craciunescu <unix.or.die@gmail.com> Signed-off-by: Mark Pearson <markpearson@lenovo.com> Link: https://lore.kernel.org/r/20210602171251.3243-1-markpearson@lenovo.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-06-03platform/surface: aggregator: Fix event disable functionMaximilian Luz
Disabling events silently fails due to the wrong command ID being used. Instead of the command ID for the disable call, the command ID for the enable call was being used. This causes the disable call to enable the event instead. As the event is already enabled when we call this function, the EC silently drops this command and does nothing. Use the correct command ID for disabling the event to fix this. Fixes: c167b9c7e3d6 ("platform/surface: Add Surface Aggregator subsystem") Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com> Link: https://lore.kernel.org/r/20210603000636.568846-1-luzmaximilian@gmail.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-06-03sched/pelt: Ensure that *_sum is always synced with *_avgVincent Guittot
Rounding in PELT calculation happening when entities are attached/detached of a cfs_rq can result into situations where util/runnable_avg is not null but util/runnable_sum is. This is normally not possible so we need to ensure that util/runnable_sum stays synced with util/runnable_avg. detach_entity_load_avg() is the last place where we don't sync util/runnable_sum with util/runnbale_avg when moving some sched_entities Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210601085832.12626-1-vincent.guittot@linaro.org
2021-06-03ieee802154: fix error return code in ieee802154_llsec_getparams()Wei Yongjun
Fix to return negative error code -ENOBUFS from the error handling case instead of 0, as done elsewhere in this function. Fixes: 3e9c156e2c21 ("ieee802154: add netlink interfaces for llsec") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Link: https://lore.kernel.org/r/20210519141614.3040055-1-weiyongjun1@huawei.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
2021-06-03ieee802154: fix error return code in ieee802154_add_iface()Zhen Lei
Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: be51da0f3e34 ("ieee802154: Stop using NLA_PUT*().") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Link: https://lore.kernel.org/r/20210508062517.2574-1-thunder.leizhen@huawei.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
2021-06-03net: ieee802154: mrf24j40: Drop unneeded of_match_ptr()Andy Shevchenko
Driver can be used in different environments and moreover, when compiled with !OF, the compiler may issue a warning due to unused mrf24j40_of_match variable. Hence drop unneeded of_match_ptr() call. While at it, update headers block to reflect above changes. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20210531132226.47081-1-andriy.shevchenko@linux.intel.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
2021-06-03net/ieee802154: drop unneeded assignment in llsec_iter_devkeys()Yang Li
In order to keep the code style consistency of the whole file, redundant return value ‘rc’ and its assignments should be deleted The clang_analyzer complains as follows: net/ieee802154/nl-mac.c:1203:12: warning: Although the value stored to 'rc' is used in the enclosing expression, the value is never actually read from 'rc' No functional change, only more efficient. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Link: https://lore.kernel.org/r/1619346299-40237-1-git-send-email-yang.lee@linux.alibaba.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
2021-06-03ALSA: hda: update the power_state during the direct-completeHui Wang
The patch_realtek.c needs to check if the power_state.event equals PM_EVENT_SUSPEND, after using the direct-complete, the suspend() and resume() will be skipped if the codec is already rt_suspended, in this case, the patch_realtek.c will always get PM_EVENT_ON even the system is really resumed from S3. We could set power_state to PMSG_SUSPEND in the prepare(), if other PM functions are called before complete(), those functions will override power_state; if no other PM functions are called before complete(), we could know the suspend() and resume() are skipped since only S3 pm functions could be skipped by direct-complete, in this case set power_state to PMSG_RESUME in the complete(). This could guarantee the first time of calling hda_codec_runtime_resume() after complete() has the correct power_state. Fixes: 215a22ed31a1 ("ALSA: hda: Refactor codec PM to use direct-complete optimization") Cc: <stable@vger.kernel.org> Signed-off-by: Hui Wang <hui.wang@canonical.com> Link: https://lore.kernel.org/r/20210602145424.3132-1-hui.wang@canonical.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-06-03ALSA: timer: Fix master timer notificationTakashi Iwai
snd_timer_notify1() calls the notification to each slave for a master event, but it passes a wrong event number. It should be +10 offset, corresponding to SNDRV_TIMER_EVENT_MXXX, but it's incorrectly with +100 offset. Casually this was spotted by UBSAN check via syzkaller. Reported-by: syzbot+d102fa5b35335a7e544e@syzkaller.appspotmail.com Reviewed-by: Jaroslav Kysela <perex@perex.cz> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/000000000000e5560e05c3bd1d63@google.com Link: https://lore.kernel.org/r/20210602113823.23777-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-06-03phy: Sparx5 Eth SerDes: check return value after calling platform_get_resource()Yang Yingliang
It will cause null-ptr-deref if platform_get_resource() returns NULL, we need check the return value. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20210603051014.2674744-1-yangyingliang@huawei.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-06-03phy: ralink: phy-mt7621-pci: drop 'of_match_ptr' to fix -Wunused-const-variableSergio Paracuellos
The of_device_id is included unconditionally by of.h header and used in the driver as well. Remove of_match_ptr to fix W=1 compile test warning with !CONFIG_OF: drivers/phy/ralink/phy-mt7621-pci.c:341:34: warning: unused variable 'mt7621_pci_phy_ids' [-Wunused-const-variable] Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Sergio Paracuellos <sergio.paracuellos@gmail.com> Link: https://lore.kernel.org/r/20210603043219.32646-1-sergio.paracuellos@gmail.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-06-02ext4: fix accessing uninit percpu counter variable with fast_commitRitesh Harjani
When running generic/527 with fast_commit configuration, the following issue is seen on Power. With fast_commit, during ext4_fc_replay() (which can be called from ext4_fill_super()), if inode eviction happens then it can access an uninitialized percpu counter variable. This patch adds the check before accessing the counters in ext4_free_inode() path. [ 321.165371] run fstests generic/527 at 2021-04-29 08:38:43 [ 323.027786] EXT4-fs (dm-0): mounted filesystem with ordered data mode. Opts: block_validity. Quota mode: none. [ 323.618772] BUG: Unable to handle kernel data access on read at 0x1fbd80000 [ 323.619767] Faulting instruction address: 0xc000000000bae78c cpu 0x1: Vector: 300 (Data Access) at [c000000010706ef0] pc: c000000000bae78c: percpu_counter_add_batch+0x3c/0x100 lr: c0000000006d0bb0: ext4_free_inode+0x780/0xb90 pid = 5593, comm = mount ext4_free_inode+0x780/0xb90 ext4_evict_inode+0xa8c/0xc60 evict+0xfc/0x1e0 ext4_fc_replay+0xc50/0x20f0 do_one_pass+0xfe0/0x1350 jbd2_journal_recover+0x184/0x2e0 jbd2_journal_load+0x1c0/0x4a0 ext4_fill_super+0x2458/0x4200 mount_bdev+0x1dc/0x290 ext4_mount+0x28/0x40 legacy_get_tree+0x4c/0xa0 vfs_get_tree+0x4c/0x120 path_mount+0xcf8/0xd70 do_mount+0x80/0xd0 sys_mount+0x3fc/0x490 system_call_exception+0x384/0x3d0 system_call_common+0xec/0x278 Cc: stable@kernel.org Fixes: 8016e29f4362 ("ext4: fast commit recovery path") Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://lore.kernel.org/r/6cceb9a75c54bef8fa9696c1b08c8df5ff6169e2.1619692410.git.riteshh@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-06-02amd/display: convert DRM_DEBUG_ATOMIC to drm_dbg_atomicSimon Ser
This allows to tie the log message to a specific DRM device. Signed-off-by: Simon Ser <contact@emersion.fr> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Harry Wentland <hwentlan@amd.com> Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-02drm/amdgpu: make sure we unpin the UVD BONirmoy Das
Releasing pinned BOs is illegal now. UVD 6 was missing from: commit 2f40801dc553 ("drm/amdgpu: make sure we unpin the UVD BO") Fixes: 2f40801dc553 ("drm/amdgpu: make sure we unpin the UVD BO") Cc: stable@vger.kernel.org Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-02drm/amd/amdgpu:save psp ring wptr to avoid attackVictor Zhao
[Why] When some tools performing psp mailbox attack, the readback value of register can be a random value which may break psp. [How] Use a psp wptr cache machanism to aovid the change made by attack. v2: unify change and add detailed reason Signed-off-by: Victor Zhao <Victor.Zhao@amd.com> Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-02drm/amd/display: Fix potential memory leak in DMUB hw_initRoman Li
[Why] On resume we perform DMUB hw_init which allocates memory: dm_resume->dm_dmub_hw_init->dc_dmub_srv_create->kzalloc That results in memory leak in suspend/resume scenarios. [How] Allocate memory for the DC wrapper to DMUB only if it was not allocated before. No need to reallocate it on suspend/resume. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-02drm/amdgpu: Don't query CE and UE errorsLuben Tuikov
On QUERY2 IOCTL don't query counts of correctable and uncorrectable errors, since when RAS is enabled and supported on Vega20 server boards, this takes insurmountably long time, in O(n^3), which slows the system down to the point of it being unusable when we have GUI up. Fixes: ae363a212b14 ("drm/amdgpu: Add a new flag to AMDGPU_CTX_OP_QUERY_STATE2") Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-02drm/amd/display: Fix overlay validation by considering cursorsRodrigo Siqueira
A few weeks ago, we saw a two cursor issue in a ChromeOS system. We fixed it in the commit: drm/amd/display: Fix two cursor duplication when using overlay (read the commit message for more details) After this change, we noticed that some IGT subtests related to kms_plane and kms_plane_scaling started to fail. After investigating this issue, we noticed that all subtests that fail have a primary plane covering the overlay plane, which is currently rejected by amdgpu dm. Fail those IGT tests highlight that our verification was too broad and compromises the overlay usage in our drive. This patch fixes this issue by ensuring that we only reject commits where the primary plane is not fully covered by the overlay when the cursor hardware is enabled. With this fix, all IGT tests start to pass again, which means our overlay support works as expected. Cc: Tianci.Yin <tianci.yin@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Nicholas Choi <nicholas.choi@amd.com> Cc: Bhawanpreet Lakha <bhawanpreet.lakha@amd.com> Cc: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Cc: Mark Yacoub <markyacoub@google.com> Cc: Daniel Wheeler <daniel.wheeler@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-02drm/amdgpu: refine amdgpu_fru_get_product_infoJiansong Chen
1. eliminate potential array index out of bounds. 2. return meaningful value for failure. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Jack Gui <Jack.Gui@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-02drm/amdgpu: add judgement for dc supportAsher Song
Drop DC initialization when DCN is harvested in VBIOS. The way doesn't affect virtual display ip initialization. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Asher Song <Asher.Song@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-02drm/amd/display: Fix GPU scaling regression by FS video supportNicholas Kazlauskas
[Why] FS video support regressed GPU scaling and the scaled buffer ends up stuck in the top left of the screen at native size - full, aspect, center scaling modes do not function. This is because decide_crtc_timing_for_drm_display_mode() does not get called when scaling is enabled. [How] Split recalculate timing and scaling into two different flags. We don't want to call drm_mode_set_crtcinfo() for scaling, but we do want to call it for FS video. Optimize and move preferred_refresh calculation next to decide_crtc_timing_for_drm_display_mode() like it used to be since that's not used for FS video. We don't need to copy over the VIC or polarity in the case of FS video modes because those don't change. Fixes: 6f59f229f8ed7a ("drm/amd/display: Skip modeset for front porch change") Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-02drm/amd/display: Allow bandwidth validation for 0 streams.Bindu Ramamurthy
[Why] Bandwidth calculations are triggered for non zero streams, and in case of 0 streams, these calculations were skipped with pstate status not being updated. [How] As the pstate status is applicable for non zero streams, check added for allowing 0 streams inline with dcn internal bandwidth validations. Signed-off-by: Bindu Ramamurthy <bindu.r@amd.com> Reviewed-by: Roman Li <Roman.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-02net: stmmac: fix issue where clk is being unprepared twiceWong Vee Khee
In the case of MDIO bus registration failure due to no external PHY devices is connected to the MAC, clk_disable_unprepare() is called in stmmac_bus_clk_config() and intel_eth_pci_probe() respectively. The second call in intel_eth_pci_probe() will caused the following:- [ 16.578605] intel-eth-pci 0000:00:1e.5: No PHY found [ 16.583778] intel-eth-pci 0000:00:1e.5: stmmac_dvr_probe: MDIO bus (id: 2) registration failed [ 16.680181] ------------[ cut here ]------------ [ 16.684861] stmmac-0000:00:1e.5 already disabled [ 16.689547] WARNING: CPU: 13 PID: 2053 at drivers/clk/clk.c:952 clk_core_disable+0x96/0x1b0 [ 16.697963] Modules linked in: dwc3 iTCO_wdt mei_hdcp iTCO_vendor_support udc_core x86_pkg_temp_thermal kvm_intel marvell10g kvm sch_fq_codel nfsd irqbypass dwmac_intel(+) stmmac uio ax88179_178a pcs_xpcs phylink uhid spi_pxa2xx_platform usbnet mei_me pcspkr tpm_crb mii i2c_i801 dw_dmac dwc3_pci thermal dw_dmac_core intel_rapl_msr libphy i2c_smbus mei tpm_tis intel_th_gth tpm_tis_core tpm intel_th_acpi intel_pmc_core intel_th i915 fuse configfs snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hda_core snd_pcm snd_timer snd soundcore [ 16.746785] CPU: 13 PID: 2053 Comm: systemd-udevd Tainted: G U 5.13.0-rc3-intel-lts #76 [ 16.756134] Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-S ADP-S DRR4 CRB, BIOS ADLIFSI1.R00.1494.B00.2012031421 12/03/2020 [ 16.769465] RIP: 0010:clk_core_disable+0x96/0x1b0 [ 16.774222] Code: 00 8b 05 45 96 17 01 85 c0 7f 24 48 8b 5b 30 48 85 db 74 a5 8b 43 7c 85 c0 75 93 48 8b 33 48 c7 c7 6e 32 cc b7 e8 b2 5d 52 00 <0f> 0b 5b 5d c3 65 8b 05 76 31 18 49 89 c0 48 0f a3 05 bc 92 1a 01 [ 16.793016] RSP: 0018:ffffa44580523aa0 EFLAGS: 00010086 [ 16.798287] RAX: 0000000000000000 RBX: ffff8d7d0eb70a00 RCX: 0000000000000000 [ 16.805435] RDX: 0000000000000002 RSI: ffffffffb7c62d5f RDI: 00000000ffffffff [ 16.812610] RBP: 0000000000000287 R08: 0000000000000000 R09: ffffa445805238d0 [ 16.819759] R10: 0000000000000001 R11: 0000000000000001 R12: ffff8d7d0eb70a00 [ 16.826904] R13: ffff8d7d027370c8 R14: 0000000000000006 R15: ffffa44580523ad0 [ 16.834047] FS: 00007f9882fa2600(0000) GS:ffff8d80a0940000(0000) knlGS:0000000000000000 [ 16.842177] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 16.847966] CR2: 00007f9882bea3d8 CR3: 000000010b126001 CR4: 0000000000370ee0 [ 16.855144] Call Trace: [ 16.857614] clk_core_disable_lock+0x1b/0x30 [ 16.861941] intel_eth_pci_probe.cold+0x11d/0x136 [dwmac_intel] [ 16.867913] pci_device_probe+0xcf/0x150 [ 16.871890] really_probe+0xf5/0x3e0 [ 16.875526] driver_probe_device+0x64/0x150 [ 16.879763] device_driver_attach+0x53/0x60 [ 16.883998] __driver_attach+0x9f/0x150 [ 16.887883] ? device_driver_attach+0x60/0x60 [ 16.892288] ? device_driver_attach+0x60/0x60 [ 16.896698] bus_for_each_dev+0x77/0xc0 [ 16.900583] bus_add_driver+0x184/0x1f0 [ 16.904469] driver_register+0x6c/0xc0 [ 16.908268] ? 0xffffffffc07ae000 [ 16.911598] do_one_initcall+0x4a/0x210 [ 16.915489] ? kmem_cache_alloc_trace+0x305/0x4e0 [ 16.920247] do_init_module+0x5c/0x230 [ 16.924057] load_module+0x2894/0x2b70 [ 16.927857] ? __do_sys_finit_module+0xb5/0x120 [ 16.932441] __do_sys_finit_module+0xb5/0x120 [ 16.936845] do_syscall_64+0x42/0x80 [ 16.940476] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 16.945586] RIP: 0033:0x7f98830e5ccd [ 16.949177] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 93 31 0c 00 f7 d8 64 89 01 48 [ 16.967970] RSP: 002b:00007ffc66b60168 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [ 16.975583] RAX: ffffffffffffffda RBX: 000055885de35ef0 RCX: 00007f98830e5ccd [ 16.982725] RDX: 0000000000000000 RSI: 00007f98832541e3 RDI: 0000000000000012 [ 16.989868] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000000 [ 16.997042] R10: 0000000000000012 R11: 0000000000000246 R12: 00007f98832541e3 [ 17.004222] R13: 0000000000000000 R14: 0000000000000000 R15: 00007ffc66b60328 [ 17.011369] ---[ end trace df06a3dab26b988c ]--- [ 17.016062] ------------[ cut here ]------------ [ 17.020701] stmmac-0000:00:1e.5 already unprepared Removing the stmmac_bus_clks_config() call in stmmac_dvr_probe and let dwmac-intel to handle the unprepare and disable of the clk device. Fixes: 5ec55823438e ("net: stmmac: add clocks management for gmac driver") Cc: Joakim Zhang <qiangqing.zhang@nxp.com> Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> Reviewed-by: Joakim Zhang <qiangqing.zhang@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-02net: ipconfig: Don't override command-line hostnames or domainsJosh Triplett
If the user specifies a hostname or domain name as part of the ip= command-line option, preserve it and don't overwrite it with one supplied by DHCP/BOOTP. For instance, ip=::::myhostname::dhcp will use "myhostname" rather than ignoring and overwriting it. Fix the comment on ic_bootp_string that suggests it only copies a string "if not already set"; it doesn't have any such logic. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-02Merge tag 'mlx5-fixes-2021-06-01' of git://git.kernel.org/pub/scm/linuDavid S. Miller
x/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2021-06-01 This series introduces some fixes to mlx5 driver. Please pull and let me know if there is any problem. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-02bpf, lockdown, audit: Fix buggy SELinux lockdown permission checksDaniel Borkmann
Commit 59438b46471a ("security,lockdown,selinux: implement SELinux lockdown") added an implementation of the locked_down LSM hook to SELinux, with the aim to restrict which domains are allowed to perform operations that would breach lockdown. This is indirectly also getting audit subsystem involved to report events. The latter is problematic, as reported by Ondrej and Serhei, since it can bring down the whole system via audit: 1) The audit events that are triggered due to calls to security_locked_down() can OOM kill a machine, see below details [0]. 2) It also seems to be causing a deadlock via avc_has_perm()/slow_avc_audit() when trying to wake up kauditd, for example, when using trace_sched_switch() tracepoint, see details in [1]. Triggering this was not via some hypothetical corner case, but with existing tools like runqlat & runqslower from bcc, for example, which make use of this tracepoint. Rough call sequence goes like: rq_lock(rq) -> -------------------------+ trace_sched_switch() -> | bpf_prog_xyz() -> +-> deadlock selinux_lockdown() -> | audit_log_end() -> | wake_up_interruptible() -> | try_to_wake_up() -> | rq_lock(rq) --------------+ What's worse is that the intention of 59438b46471a to further restrict lockdown settings for specific applications in respect to the global lockdown policy is completely broken for BPF. The SELinux policy rule for the current lockdown check looks something like this: allow <who> <who> : lockdown { <reason> }; However, this doesn't match with the 'current' task where the security_locked_down() is executed, example: httpd does a syscall. There is a tracing program attached to the syscall which triggers a BPF program to run, which ends up doing a bpf_probe_read_kernel{,_str}() helper call. The selinux_lockdown() hook does the permission check against 'current', that is, httpd in this example. httpd has literally zero relation to this tracing program, and it would be nonsensical having to write an SELinux policy rule against httpd to let the tracing helper pass. The policy in this case needs to be against the entity that is installing the BPF program. For example, if bpftrace would generate a histogram of syscall counts by user space application: bpftrace -e 'tracepoint:raw_syscalls:sys_enter { @[comm] = count(); }' bpftrace would then go and generate a BPF program from this internally. One way of doing it [for the sake of the example] could be to call bpf_get_current_task() helper and then access current->comm via one of bpf_probe_read_kernel{,_str}() helpers. So the program itself has nothing to do with httpd or any other random app doing a syscall here. The BPF program _explicitly initiated_ the lockdown check. The allow/deny policy belongs in the context of bpftrace: meaning, you want to grant bpftrace access to use these helpers, but other tracers on the system like my_random_tracer _not_. Therefore fix all three issues at the same time by taking a completely different approach for the security_locked_down() hook, that is, move the check into the program verification phase where we actually retrieve the BPF func proto. This also reliably gets the task (current) that is trying to install the BPF tracing program, e.g. bpftrace/bcc/perf/systemtap/etc, and it also fixes the OOM since we're moving this out of the BPF helper's fast-path which can be called several millions of times per second. The check is then also in line with other security_locked_down() hooks in the system where the enforcement is performed at open/load time, for example, open_kcore() for /proc/kcore access or module_sig_check() for module signatures just to pick few random ones. What's out of scope in the fix as well as in other security_locked_down() hook locations /outside/ of BPF subsystem is that if the lockdown policy changes on the fly there is no retrospective action. This requires a different discussion, potentially complex infrastructure, and it's also not clear whether this can be solved generically. Either way, it is out of scope for a suitable stable fix which this one is targeting. Note that the breakage is specifically on 59438b46471a where it started to rely on 'current' as UAPI behavior, and _not_ earlier infrastructure such as 9d1f8be5cf42 ("bpf: Restrict bpf when kernel lockdown is in confidentiality mode"). [0] https://bugzilla.redhat.com/show_bug.cgi?id=1955585, Jakub Hrozek says: I starting seeing this with F-34. When I run a container that is traced with BPF to record the syscalls it is doing, auditd is flooded with messages like: type=AVC msg=audit(1619784520.593:282387): avc: denied { confidentiality } for pid=476 comm="auditd" lockdown_reason="use of bpf to read kernel RAM" scontext=system_u:system_r:auditd_t:s0 tcontext=system_u:system_r:auditd_t:s0 tclass=lockdown permissive=0 This seems to be leading to auditd running out of space in the backlog buffer and eventually OOMs the machine. [...] auditd running at 99% CPU presumably processing all the messages, eventually I get: Apr 30 12:20:42 fedora kernel: audit: backlog limit exceeded Apr 30 12:20:42 fedora kernel: audit: backlog limit exceeded Apr 30 12:20:42 fedora kernel: audit: audit_backlog=2152579 > audit_backlog_limit=64 Apr 30 12:20:42 fedora kernel: audit: audit_backlog=2152626 > audit_backlog_limit=64 Apr 30 12:20:42 fedora kernel: audit: audit_backlog=2152694 > audit_backlog_limit=64 Apr 30 12:20:42 fedora kernel: audit: audit_lost=6878426 audit_rate_limit=0 audit_backlog_limit=64 Apr 30 12:20:45 fedora kernel: oci-seccomp-bpf invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=-1000 Apr 30 12:20:45 fedora kernel: CPU: 0 PID: 13284 Comm: oci-seccomp-bpf Not tainted 5.11.12-300.fc34.x86_64 #1 Apr 30 12:20:45 fedora kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.fc32 04/01/2014 [...] [1] https://lore.kernel.org/linux-audit/CANYvDQN7H5tVp47fbYcRasv4XF07eUbsDwT_eDCHXJUj43J7jQ@mail.gmail.com/, Serhei Makarov says: Upstream kernel 5.11.0-rc7 and later was found to deadlock during a bpf_probe_read_compat() call within a sched_switch tracepoint. The problem is reproducible with the reg_alloc3 testcase from SystemTap's BPF backend testsuite on x86_64 as well as the runqlat, runqslower tools from bcc on ppc64le. Example stack trace: [...] [ 730.868702] stack backtrace: [ 730.869590] CPU: 1 PID: 701 Comm: in:imjournal Not tainted, 5.12.0-0.rc2.20210309git144c79ef3353.166.fc35.x86_64 #1 [ 730.871605] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014 [ 730.873278] Call Trace: [ 730.873770] dump_stack+0x7f/0xa1 [ 730.874433] check_noncircular+0xdf/0x100 [ 730.875232] __lock_acquire+0x1202/0x1e10 [ 730.876031] ? __lock_acquire+0xfc0/0x1e10 [ 730.876844] lock_acquire+0xc2/0x3a0 [ 730.877551] ? __wake_up_common_lock+0x52/0x90 [ 730.878434] ? lock_acquire+0xc2/0x3a0 [ 730.879186] ? lock_is_held_type+0xa7/0x120 [ 730.880044] ? skb_queue_tail+0x1b/0x50 [ 730.880800] _raw_spin_lock_irqsave+0x4d/0x90 [ 730.881656] ? __wake_up_common_lock+0x52/0x90 [ 730.882532] __wake_up_common_lock+0x52/0x90 [ 730.883375] audit_log_end+0x5b/0x100 [ 730.884104] slow_avc_audit+0x69/0x90 [ 730.884836] avc_has_perm+0x8b/0xb0 [ 730.885532] selinux_lockdown+0xa5/0xd0 [ 730.886297] security_locked_down+0x20/0x40 [ 730.887133] bpf_probe_read_compat+0x66/0xd0 [ 730.887983] bpf_prog_250599c5469ac7b5+0x10f/0x820 [ 730.888917] trace_call_bpf+0xe9/0x240 [ 730.889672] perf_trace_run_bpf_submit+0x4d/0xc0 [ 730.890579] perf_trace_sched_switch+0x142/0x180 [ 730.891485] ? __schedule+0x6d8/0xb20 [ 730.892209] __schedule+0x6d8/0xb20 [ 730.892899] schedule+0x5b/0xc0 [ 730.893522] exit_to_user_mode_prepare+0x11d/0x240 [ 730.894457] syscall_exit_to_user_mode+0x27/0x70 [ 730.895361] entry_SYSCALL_64_after_hwframe+0x44/0xae [...] Fixes: 59438b46471a ("security,lockdown,selinux: implement SELinux lockdown") Reported-by: Ondrej Mosnacek <omosnace@redhat.com> Reported-by: Jakub Hrozek <jhrozek@redhat.com> Reported-by: Serhei Makarov <smakarov@redhat.com> Reported-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Jiri Olsa <jolsa@redhat.com> Cc: Paul Moore <paul@paul-moore.com> Cc: James Morris <jamorris@linux.microsoft.com> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Frank Eigler <fche@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/bpf/01135120-8bf7-df2e-cff0-1d73f1f841c3@iogearbox.net
2021-06-02vmlinux.lds.h: Avoid orphan section with !SMPNathan Chancellor
With x86_64_defconfig and the following configs, there is an orphan section warning: CONFIG_SMP=n CONFIG_AMD_MEM_ENCRYPT=y CONFIG_HYPERVISOR_GUEST=y CONFIG_KVM=y CONFIG_PARAVIRT=y ld: warning: orphan section `.data..decrypted' from `arch/x86/kernel/cpu/vmware.o' being placed in section `.data..decrypted' ld: warning: orphan section `.data..decrypted' from `arch/x86/kernel/kvm.o' being placed in section `.data..decrypted' These sections are created with DEFINE_PER_CPU_DECRYPTED, which ultimately turns into __PCPU_ATTRS, which in turn has a section attribute with a value of PER_CPU_BASE_SECTION + the section name. When CONFIG_SMP is not set, the base section is .data and that is not currently handled in any linker script. Add .data..decrypted to PERCPU_DECRYPTED_SECTION, which is included in PERCPU_INPUT -> PERCPU_SECTION, which is include in the x86 linker script when either CONFIG_X86_64 or CONFIG_SMP is unset, taking care of the warning. Fixes: ac26963a1175 ("percpu: Introduce DEFINE_PER_CPU_DECRYPTED") Link: https://github.com/ClangBuiltLinux/linux/issues/1360 Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nick Desaulniers <ndesaulniers@google.com> # build Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210506001410.1026691-1-nathan@kernel.org