Age | Commit message | Author
2023-01-27 | io_uring: always prep_async for drain requests | Dylan Yudaken
Drain requests all go through io_drain_req, which has a quick exit in case there is nothing pending (ie the drain is not useful). In that case it can issue the request immediately. However for safety it queues it through task work. The problem is that in this case the request is run asynchronously, but the async work has not been prepared through io_req_prep_async. This has not been a problem up to now, as the task work always would run before returning to userspace, and so the user would not have a chance to race with it. However - with IORING_SETUP_DEFER_TASKRUN - this is no longer the case and the work might be deferred, giving userspace a chance to change data being referred to in the request. Instead _always_ prep_async for drain requests, which is simpler anyway and removes this issue. Cc: stable@vger.kernel.org Fixes: c0e0d6ba25f1 ("io_uring: add IORING_SETUP_DEFER_TASKRUN") Signed-off-by: Dylan Yudaken <dylany@meta.com> Link: https://lore.kernel.org/r/20230127105911.2420061-1-dylany@meta.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
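The race described in this message can be illustrated outside the kernel. The following is a minimal userspace sketch, not io_uring code: struct request, request_prep_async() and request_run_deferred() are hypothetical names invented for the illustration. It shows why deferred work should read a snapshot taken at submission time rather than memory the submitter can still modify.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical request: 'user_buf' points at memory the submitter still owns. */
struct request {
    const char *user_buf;   /* caller-owned, may change after submission */
    char        copy[32];   /* stable snapshot taken at prep time */
    int         prepared;   /* non-zero once the snapshot exists */
};

/* Analogue of "prep async": copy everything the deferred work will need. */
static void request_prep_async(struct request *req)
{
    strncpy(req->copy, req->user_buf, sizeof(req->copy) - 1);
    req->copy[sizeof(req->copy) - 1] = '\0';
    req->prepared = 1;
}

/* Deferred work: uses the snapshot if one was taken, else the live buffer. */
static const char *request_run_deferred(const struct request *req)
{
    return req->prepared ? req->copy : req->user_buf;
}
```

In this model, a request that was prepared before deferral still sees the data as it was at submission, while an unprepared request observes whatever the submitter wrote in the meantime.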
2023-01-27 | tools: gpio: fix -c option of gpio-event-mon | Ivo Borisov Shopov
The following line should listen for a rising edge and exit after the first one, since '-c 1' is provided:
    # gpio-event-mon -n gpiochip1 -o 0 -r -c 1
It works with kernel 4.19 but not with 5.10. In 5.10 the above command doesn't exit after the first rising edge; it keeps listening for events forever. The '-c 1' is not taken into account. The problem is in commit 62757c32d5db ("tools: gpio: add multi-line monitoring to gpio-event-mon"). Before this commit the iterator 'i' in monitor_device() is used for counting the events (loops). In the case of the above command (-c 1) we should start from 0, increment 'i' only once, hit the 'break' statement and exit the process. But after the above commit counting doesn't start from 0; it starts from 1 when we listen on one line. This is because 'i' is used for one more purpose, counting the lines (num_lines), and it isn't restored to 0 after the following code:
    for (i = 0; i < num_lines; i++)
        gpiotools_set_bit(&values.mask, i);
Restore the initial value of the iterator to 0 so that counting of loops works in all cases. Fixes: 62757c32d5db ("tools: gpio: add multi-line monitoring to gpio-event-mon") Signed-off-by: Ivo Borisov Shopov <ivoshopov@gmail.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> [Bartosz: tweak the commit message] Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
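The iterator reuse can be reproduced with a small stand-alone model. This is a sketch only: events_before_exit() and DEMO_EVENT_CAP are hypothetical, and the shape of the event loop (increment, then compare against the -c value) is an assumption made for illustration, not a copy of monitor_device().

```c
#define DEMO_EVENT_CAP 1000u   /* hypothetical cap so the buggy case terminates */

/*
 * Simplified stand-in for monitor_device(): returns how many events are
 * handled before the loop exits. 'loops' models the -c option and
 * 'reset_iterator' selects the fixed behaviour. num_lines must be below 32.
 */
static unsigned int events_before_exit(unsigned int num_lines,
                                       unsigned int loops,
                                       int reset_iterator)
{
    unsigned int i, handled = 0, mask = 0;

    for (i = 0; i < num_lines; i++)
        mask |= 1u << i;          /* stands in for gpiotools_set_bit() */
    (void)mask;

    if (reset_iterator)
        i = 0;                    /* the fix: restart counting at zero */

    while (handled < DEMO_EVENT_CAP) {
        handled++;                /* one event read and reported */
        i++;
        if (i == loops)
            break;
    }
    return handled;
}
```

With one line and a count of 1, the fixed variant stops after a single event, while the variant that keeps the stale iterator value skips past the exit condition and only stops at the artificial cap.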
2023-01-27 | gpio: ep93xx: remove unused variable | Arnd Bergmann
This one was left behind by a previous cleanup patch: drivers/gpio/gpio-ep93xx.c: In function 'ep93xx_gpio_add_bank': drivers/gpio/gpio-ep93xx.c:366:34: error: unused variable 'ic' [-Werror=unused-variable] Fixes: 216f37366e86 ("gpio: ep93xx: Make irqchip immutable") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2023-01-27 | Merge branch 'devlink-parama-cleanup' | David S. Miller
Jiri Pirko says: ==================== devlink: Cleanup params usage This patchset takes care of small cleanup of devlink params usage. Some of the patches (first 2/3) are cosmetic, but I would like to point out a couple of interesting ones: Patch 9 is the main one of this set and introduces devlink instance locking for params, similar to other devlink objects. That allows params to be registered/unregistered when devlink instance is registered. Patches 10-12 change mlx5 code to register non-driverinit params in the code they are related to, and thanks to patch 8 this might be when devlink instance is registered - for example during devlink reload. --- v1->v2: - Just small fix in the last patch ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | net/mlx5: Move eswitch port metadata devlink param to flow eswitch code | Jiri Pirko
Move the param registration and handling code into the eswitch offloads code as they are related to each other. No point in having the devlink param registration done in a separate file. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | net/mlx5: Move flow steering devlink param to flow steering code | Jiri Pirko
Move the param registration and handling code into the flow steering code as they are related to each other. No point in having the devlink param registration done in a separate file. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | net/mlx5: Move fw reset devlink param to fw reset code | Jiri Pirko
Move the param registration and handling code into the fw reset code as they are related to each other. No point in having the devlink param registration done in a separate file. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | devlink: protect devlink param list by instance lock | Jiri Pirko
Commit 1d18bb1a4ddd ("devlink: allow registering parameters after the instance"), as the subject implies, introduced the possibility to register devlink params even for an already registered devlink instance. This is a bit problematic, as the consistency of the params list was originally secured by the fact that it is static during the devlink lifetime. So in order to protect the params list, take the devlink instance lock during the params operations. Introduce unlocked function variants and use them in drivers in locked context. Put lock assertions in appropriate places. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | devlink: put couple of WARN_ONs in devlink_param_driverinit_value_get() | Jiri Pirko
Put a couple of WARN_ONs in the devlink_param_driverinit_value_get() function to clearly indicate that it is a driver bug if used without reload support or for a non-driverinit param. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | devlink: make devlink_param_driverinit_value_set() return void | Jiri Pirko
devlink_param_driverinit_value_set() currently returns int with possible error, but no user is checking it anyway. The only reason for a fail is a driver bug. So convert the function to return void and put WARN_ONs on error paths. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | qed: remove pointless call to devlink_param_driverinit_value_set() | Jiri Pirko
devlink_param_driverinit_value_set() call makes sense only for "driverinit" params. However here, the param is "runtime". devlink_param_driverinit_value_set() returns -EOPNOTSUPP in such case and does not do anything. So remove the pointless call to devlink_param_driverinit_value_set() entirely. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | ice: remove pointless calls to devlink_param_driverinit_value_set() | Jiri Pirko
devlink_param_driverinit_value_set() call makes sense only for "driverinit" params. However here, both params are "runtime". devlink_param_driverinit_value_set() returns -EOPNOTSUPP in such case and does not do anything. So remove the pointless calls to devlink_param_driverinit_value_set() entirely. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | devlink: don't work with possible NULL pointer in devlink_param_unregister() | Jiri Pirko
There is a WARN_ON checking the param_item for being NULL when the param is not inserted in the list. That indicates a driver BUG. Instead of continuing to work with NULL pointer with its consequences, return. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | devlink: make devlink_param_register/unregister static | Jiri Pirko
There is no user outside the devlink code, so remove the export and make the functions static. Move them before callers to avoid forward declarations. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | net/mlx5: Covert devlink params registration to use devlink_params_register/unregister() | Jiri Pirko
Since mlx5 is the only user of devlink API to register/unregister a single param, convert it to use array registration function allowing to simplify the devlink API by removing the single param registration functions. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | net/mlx5: Change devlink param register/unregister function names | Jiri Pirko
The functions are registering and unregistering devlink params, so change the names accordingly. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | Merge branch 'ethtool-netlink-next' | David S. Miller
Jakub Kicinski says: ==================== ethtool: netlink: handle SET intro/outro in the common code Factor out the boilerplate code from SET handlers to common code. I volunteered to refactor the extack in GET in a conversation with Vladimir but I gave up. The handling of failures during dump in GET handlers is a bit unclear to me. Some code uses presence of info as indication of dump and tries to avoid reporting errors altogether (including extack messages). There's also the question of whether we should have a validation callback (similar to .set_validate here) for GET. It looks like .parse_request was expected to perform the validation. It takes the extack and tb directly, not via info:
    int (*parse_request)(struct ethnl_req_info *req_info, struct nlattr **tb, struct netlink_ext_ack *extack);
    int (*prepare_data)(const struct ethnl_req_info *req_info, struct ethnl_reply_data *reply_data, struct genl_info *info);
so no crashes dereferencing info possible. But .parse_request doesn't run under rtnl nor ethnl_ops_begin(). As a result some implementations defer validation until .prepare_data where all the locks are held and they can call out to the driver. All this makes me think that maybe we should refactor GET in the same direction I'm refactoring SET. Split .prepare_data, take more locks in the core, and add a validation helper which would take extack directly:
    - ret = ops->prepare_data(req_info, reply_data, info);
    + ret = ops->prepare_data_validate(req_info, reply_data, attrs, extack);
    + if (ret < 1) // if 0 -> skip for dump; -EOPNOTSUPP in do
    +         goto err1;
    +
    + ret = ethnl_ops_begin(dev);
    + if (ret)
    +         goto err1;
    +
    + ret = ops->prepare_data(req_info, reply_data); // no extack
    + ethnl_ops_complete(dev);
I'll file that away as a TODO for posterity / older me.
v2: - invert checks for coalescing to avoid error code changes - rebase and convert MM as well v1: https://lore.kernel.org/all/20230121054430.642280-1-kuba@kernel.org/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | ethtool: netlink: convert commands to common SET | Jakub Kicinski
Convert all SET commands where new common code is applicable. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | ethtool: netlink: handle SET intro/outro in the common code | Jakub Kicinski
Most ethtool SET callbacks follow the same general structure.
    ethnl_parse_header_dev_get()
    rtnl_lock()
    ethnl_ops_begin()
    ... do stuff ...
    ethtool_notify()
    ethnl_ops_complete()
    rtnl_unlock()
    ethnl_parse_header_dev_put()
This leads to a lot of copy / pasted code and bugs when people mis-handle the error path. Add a generic implementation of this pattern with a .set callback in struct ethnl_request_ops called to "do stuff". Also add an optional .set_validate which is called before ethnl_ops_begin() -- a lot of implementations do basic request capability / sanity checking at that point. Because we want to avoid generating the notification when no change happened, adopt slightly hairy return values:
 - 0 means nothing to do (no notification)
 - 1 means done / continue
 - negative error codes on error
Reuse .hdr_attr from struct ethnl_request_ops, GET and SET use the same attr spaces in all cases. Convert pause as an example (and to avoid unused function warnings). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
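The return-value convention above can be sketched as a small userspace model. This is not the kernel implementation: struct set_ops, struct set_result, common_set() and the demo callbacks are hypothetical names, the locking steps appear only as comments, and the sketch assumes the optional validate callback follows the same 0 / 1 / negative convention as .set.

```c
#include <errno.h>

/* Hypothetical, userspace-only model of a request-ops table. */
struct set_ops {
    int (*set_validate)(int value);      /* optional; 0 = nothing to do, 1 = continue, <0 = error */
    int (*set)(int *state, int value);   /* the "do stuff" step, same convention */
};

struct set_result {
    int ret;        /* 0 on success, negative errno on failure */
    int notified;   /* 1 if a change notification would be generated */
};

/* Common handler: the boilerplate lives here once instead of in every callback. */
static struct set_result common_set(const struct set_ops *ops, int *state, int value)
{
    struct set_result res = { 0, 0 };
    int ret;

    if (ops->set_validate) {
        ret = ops->set_validate(value);
        if (ret <= 0) {           /* 0: nothing to do; negative: error */
            res.ret = ret;
            return res;
        }
    }
    /* rtnl_lock() and ethnl_ops_begin() would be taken here */
    ret = ops->set(state, value);
    if (ret > 0)
        res.notified = 1;         /* stands in for ethtool_notify() */
    /* ethnl_ops_complete() and rtnl_unlock() would run here on every path */
    res.ret = ret < 0 ? ret : 0;
    return res;
}

/* Demo callbacks: reject negative values, report a change only when one happened. */
static int demo_validate(int value)
{
    return value < 0 ? -EINVAL : 1;
}

static int demo_set(int *state, int value)
{
    if (*state == value)
        return 0;
    *state = value;
    return 1;
}

static const struct set_ops demo_ops = { demo_validate, demo_set };
```

Setting a new value through demo_ops reports success with a notification, repeating the same value reports success without one, and an invalid value returns the error before any state is touched.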
2023-01-27 | net: dsa: qca8k: convert to regmap read/write API | Christian Marangi
Convert qca8k to regmap read/write bulk API. The mgmt eth can write up to 32 bytes of data at a time. Currently we use a custom function to do it but regmap now supports declaration of read/write bulk even without a bus. Drop the custom function and rework the regmap function to this new implementation. Rework the qca8k_fdb_read/write function to use the new regmap_bulk_read/write as the old qca8k_bulk_read/write are now dropped. Cc: Mark Brown <broonie@kernel.org> Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | net: dsa: qca8k: add QCA8K_ATU_TABLE_SIZE define for fdb access | Christian Marangi
Add and use QCA8K_ATU_TABLE_SIZE instead of hardcoding the ATU size with a pure number and using sizeof on the array. Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | Merge branch 'net-skbuff-includes' | David S. Miller
Jakub Kicinski says: ==================== net: skbuff: clean up unnecessary includes skbuff.h is included in a significant portion of the tree. Clean up unused dependencies to speed up builds. This set only takes care of the most obvious cases. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | net: remove unnecessary includes from net/flow.h | Jakub Kicinski
This file is included by a lot of other commonly included headers, it doesn't need socket.h or flow_dissector.h. This reduces the size of this file after pre-processing from 28165 to 4663. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | net: skbuff: drop the linux/hrtimer.h include | Jakub Kicinski
linux/hrtimer.h include was added because apparently it used to contain ktime related code. This is no longer the case and we include linux/time.h explicitly. Sadly this change is currently a noop because linux/dma-mapping.h and net/page_pool.h pull in half of the universe. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | net: skbuff: drop the linux/splice.h include | Jakub Kicinski
splice.h is included since commit a60e3cc7c929 ("net: make skb_splice_bits more configureable") but really even then all we needed is some forward declarations. Most of that code is now gone, and remaining has fwd declarations. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | net: add missing includes of linux/splice.h | Jakub Kicinski
A number of files depend on linux/splice.h getting included by linux/skbuff.h, which soon will no longer be the case. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | net: skbuff: drop the linux/sched.h include | Jakub Kicinski
linux/sched.h was added for skb_mstamp_* (all the way back before linux/sched.h got split and linux/sched/clock.h created). We don't need it in skbuff.h any more. Sadly this change is currently a noop because linux/dma-mapping.h and net/page_pool.h pull in half of the universe. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | net: skbuff: drop the linux/sched/clock.h include | Jakub Kicinski
It used to be necessary for skb_mstamp_* static inlines, but those are gone since we moved to usec timestamps in TCP, in 2017. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | net: add missing includes of linux/sched/clock.h | Jakub Kicinski
A number of files depend on linux/sched/clock.h getting included by linux/skbuff.h, which soon will no longer be the case. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | net: skbuff: drop the linux/textsearch.h include | Jakub Kicinski
This include was added for skb_find_text() but all we need there is a forward declaration of struct ts_config. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | net: checksum: drop the linux/uaccess.h include | Jakub Kicinski
net/checksum.h pulls in linux/uaccess.h which is large. In the x86 header the include seems to not be needed at all. ARM on the other hand does not include uaccess.h, even though it calls access_ok(). In the generic implementation guard the include of linux/uaccess.h with the same condition as the code that needs it. With this change pre-processed net/checksum.h shrinks on x86 from 30616 lines to just 1193. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | net: skbuff: drop the linux/net.h include | Jakub Kicinski
It appears nothing needs it. The kernel builds fine with this include removed, building an otherwise empty source file with:
    #include <linux/skbuff.h>
    #ifdef _LINUX_NET_H
    #error linux/net.h is back
    #endif
works too (meaning net.h is not just pulled in indirectly). This gives us a slight 0.5% reduction in the pre-processed size of skbuff.h. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | net: add missing includes of linux/net.h | Jakub Kicinski
linux/net.h will soon not be included by linux/skbuff.h. Fix the cases where source files were depending on the implicit include. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | vdpa: ifcvf: Do proper cleanup if IFCVF init fails | Tanmay Bhushan
ifcvf_mgmt_dev leaks memory if it is not freed before returning. Jump to the correct return statement so the memory does not leak. ifcvf_init_hw does not take care of this, so it needs to be done here. Signed-off-by: Tanmay Bhushan <007047221b@gmail.com> Message-Id: <772e9fe133f21fa78fb98a2ebe8969efbbd58e3c.camel@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Zhu Lingshan <lingshan.zhu@intel.com>
2023-01-27 | vhost-scsi: unbreak any layout for response | Jason Wang
Al Viro said: """ Since "vhost/scsi: fix reuse of &vq->iov[out] in response" we have this:
    cmd->tvc_resp_iov = vq->iov[vc.out];
    cmd->tvc_in_iovs = vc.in;
combined with
    iov_iter_init(&iov_iter, ITER_DEST, &cmd->tvc_resp_iov, cmd->tvc_in_iovs, sizeof(v_rsp));
in vhost_scsi_complete_cmd_work(). We used to have ->tvc_resp_iov _pointing_ to vq->iov[vc.out]; back then iov_iter_init() asked to set an iovec-backed iov_iter over the tail of vq->iov[], with length being the amount of iovecs in the tail. Now we have a copy of one element of that array. Fortunately, the members following it in the containing structure are two non-NULL kernel pointers, so copy_to_iter() will not copy anything beyond the first iovec - kernel pointer is not (on the majority of architectures) going to be accepted by access_ok() in copyout() and it won't be skipped since the "length" (in reality - another non-NULL kernel pointer) won't be zero. So it's not going to give a guest-to-qemu escalation, but it's definitely a bug. Frankly, my preference would be to verify that the very first iovec is long enough to hold rsp_size. Due to the above, any users that try to give us vq->iov[vc.out].iov_len < sizeof(struct virtio_scsi_cmd_resp) would currently get a failure in vhost_scsi_complete_cmd_work() anyway. """
However, the spec doesn't say anything about the legacy descriptor layout for the response. So this patch tries not to assume that the response resides in a single separate descriptor, which is what commit 79c14141a487 ("vhost/scsi: Convert completion path to use") tries to achieve towards ANY_LAYOUT. This is done by allocating and using a dedicated resp iov in the command. To be safe, start with UIO_MAXIOV to be consistent with the limit that we advertise to vhost_get_vq_desc(). Tested with a hacked virtio-scsi driver that uses 1 descriptor for 1 byte in the response.
Reported-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Benjamin Coddington <bcodding@redhat.com> Cc: Nicholas Bellinger <nab@linux-iscsi.org> Fixes: a77ec83a5789 ("vhost/scsi: fix reuse of &vq->iov[out] in response") Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20230119073647.76467-1-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
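To illustrate what "any layout" means for the response, here is a minimal userspace sketch that spreads a response over however many iovecs the caller supplies, including the one-byte-per-descriptor case mentioned in the testing note. It is not the vhost-scsi code: scatter_response() is a hypothetical helper, and it uses the POSIX struct iovec from <sys/uio.h> rather than the kernel's iov_iter API.

```c
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Copy 'len' bytes of a response into an array of iovecs, filling each
 * buffer in turn. Returns the number of bytes actually copied, which is
 * less than 'len' when the iovecs are too small in total.
 */
static size_t scatter_response(const struct iovec *iov, int iovcnt,
                               const void *resp, size_t len)
{
    const char *src = resp;
    size_t done = 0;

    for (int i = 0; i < iovcnt && done < len; i++) {
        size_t n = iov[i].iov_len;

        if (n > len - done)
            n = len - done;       /* last chunk may be partial */
        memcpy(iov[i].iov_base, src + done, n);
        done += n;
    }
    return done;
}
```

A caller that passes only the first iovec gets a short copy back, which is the situation the quoted analysis describes when the response is assumed to fit in a single descriptor.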
2023-01-27 | tools/virtio: fix the vringh test for virtio ring changes | Shunsuke Mie
Fix the build failure caused by the missing kmsan_handle_dma() and is_power_of_2() that are used in drivers/virtio/virtio_ring.c. Signed-off-by: Shunsuke Mie <mie@igel.co.jp> Message-Id: <20230110034310.779744-1-mie@igel.co.jp> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-01-27 | vhost/net: Clear the pending messages when the backend is removed | Eric Auger
When the vhost iotlb is used along with a guest virtual iommu and the guest gets rebooted, some MISS messages may have been recorded just before the reboot and spuriously executed by the virtual iommu after the reboot. As vhost does not have any explicit reset user API, VHOST_NET_SET_BACKEND looks like a reasonable point at which to clear the pending messages, in case the backend is removed. Export vhost_clear_msg() and call it in vhost_net_set_backend() when fd == -1. Signed-off-by: Eric Auger <eric.auger@redhat.com> Suggested-by: Jason Wang <jasowang@redhat.com> Fixes: 6b1e6cc7855b0 ("vhost: new device IOTLB API") Message-Id: <20230117151518.44725-3-eric.auger@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-01-27 | Merge branch 'ipa-abstract-status' | David S. Miller
Alex Elder says: ==================== net: ipa: abstract status parsing Under some circumstances, IPA generates a "packet status" structure that describes information about a packet. This is used, for example, when offload hardware detects an error in a packet, or otherwise discovers a packet needs special handling. In this case, the status is delivered (along with the packet it describes) to a "default" endpoint so that it can be handled by the AP. Until now, the structure of this status information hasn't changed. However, to support more than 32 endpoints, this structure required some changes, such that some fields are rearranged in ways that are tricky to represent using C code. This series updates code related to the IPA status structure. The first patch uses a local variable to avoid recomputing a packet length more than once. The second stops using sizeof() to determine the size of an IPA packet status structure. Patches 3-5 extend the definitions for values held in packet status fields. Patch 6 does a little general cleanup to make patch 7 simpler. Patch 7 stops using a C structure to represent packet status; instead, a new function fetches values "by name" from a buffer containing such a structure. The last patch updates this function so it also supports IPA v5.0+. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | net: ipa: add IPA v5.0 packet status support | Alex Elder
Update ipa_status_extract() to support IPA v5.0 and beyond. Because the format of the IPA packet status depends on the version, pass an IPA pointer to the function. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | net: ipa: introduce generalized status decoder | Alex Elder
Stop assuming the IPA packet status has a fixed format (defined by a C structure). Instead, use a function to extract each field from a block of data interpreted as an IPA packet status. Define an enumerated type that identifies the fields that can be extracted. The current function extracts fields based on the existing ipa_status structure format (which is no longer used). Define IPA_STATUS_RULE_MISS, to replace the calls to field_max() to represent that condition; those depended on knowing the width of a filter or router rule in the IPA packet status structure. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
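The idea of fetching fields "by name" from a raw buffer, instead of overlaying a C structure, can be sketched as follows. Everything here is hypothetical: the field IDs, bit positions and the version cut-over are invented for illustration and do not reflect the real IPA packet status layout or the signature of ipa_status_extract().

```c
#include <stdint.h>

/* Hypothetical field identifiers; the real driver defines its own set. */
enum status_field { STATUS_OPCODE, STATUS_LENGTH, STATUS_ENDPOINT };

/* Read the little-endian 32-bit word at the given word index. */
static uint32_t le32_at(const uint8_t *buf, unsigned int word)
{
    const uint8_t *p = buf + 4 * word;

    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/*
 * Extract one field from a status buffer. The layout is made up: the
 * endpoint field is 5 bits wide before "version 5" and an 8-bit field at
 * a different position afterwards, to show why a decoder function copes
 * with format changes better than a fixed structure.
 */
static uint32_t status_extract(unsigned int version, const void *data,
                               enum status_field field)
{
    const uint8_t *buf = data;
    uint32_t w0 = le32_at(buf, 0);
    uint32_t w1 = le32_at(buf, 1);

    switch (field) {
    case STATUS_OPCODE:
        return w0 & 0xff;
    case STATUS_LENGTH:
        return (w0 >> 16) & 0xffff;
    case STATUS_ENDPOINT:
        if (version < 5)
            return w1 & 0x1f;
        return (w1 >> 8) & 0xff;
    }
    return 0;
}
```

Callers ask for a field by its enumerator and never depend on where, or how wide, that field is in a given hardware version.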
2023-01-27 | net: ipa: IPA status preparatory cleanups | Alex Elder
The next patch reworks how the IPA packet status structure is interpreted. This patch does some preparatory work, to make it easier to see the effect of that change:
 - Change a few functions that access fields in an IPA packet status structure to store field values in local variables with names related to the field.
 - Pass a void pointer rather than an (equivalent) status pointer to two functions called by ipa_endpoint_status_parse().
 - Use "rule" rather than "val" as the name of a variable that holds a routing rule ID.
 - Consistently use "IPA packet status" rather than "status element" when referring to this data structure.
Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | net: ipa: define remaining IPA status field values | Alex Elder
Define the remaining values for opcode and exception fields in the IPA packet status structure. Most of these values are powers-of-2, suggesting they are meant to be used as bitmasks, but that is not the case. Add comments to be clear about this, and express the values in decimal format. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | net: ipa: rename the NAT enumerated type | Alex Elder
Rename the ipa_nat_en enumerated type to be ipa_nat_type, and rename its symbols accordingly. Add a comment indicating those values are also used in the IPA status nat_type field. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | net: ipa: define all IPA status mask bits | Alex Elder
There is a 16 bit status mask defined in the IPA packet status structure, of which only one (TAG_VALID) is currently used. Define all other IPA status mask values in an enumerated type whose numeric values are bit mask values (in CPU byte order) in the status mask. Use the TAG_VALID value from that type rather than defining a separate field mask. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | net: ipa: stop using sizeof(status) | Alex Elder
The IPA packet status structure changes in IPA v5.0 in ways that are difficult to represent cleanly. As a small step toward redefining it as a parsed block of data, use a constant to define its size, rather than the size of the IPA status structure type. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | net: ipa: refactor status buffer parsing | Alex Elder
The packet length encoded in an IPA packet status buffer is computed more than once in ipa_endpoint_status_parse(). It is also checked again in ipa_endpoint_status_skip(), which that function calls. Compute the length once, and use that computed value later rather than recomputing it. Check for it being zero in the parse function rather than in ipa_endpoint_status_skip(). Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 | ALSA: memalloc: Workaround for Xen PV | Takashi Iwai
We recently changed the memalloc helper to use dma_alloc_noncontiguous() and the fallback to get_pages(). Although lots of issues with IOMMU (or non-IOMMU) have been addressed, there still seems to be a regression on Xen PV. Interestingly, the only way that works properly is to use dma_alloc_coherent(). The use of dma_alloc_coherent() for SG buffer was dropped as it's problematic on IOMMU systems. OTOH, Xen PV has a different way, and it's fine to use dma_alloc_coherent() there. This patch is a workaround for Xen PV. It consists of the following changes:
 - For Xen PV, use only the fallback allocation without dma_alloc_noncontiguous()
 - In the fallback allocation, use dma_alloc_coherent(); the DMA address from dma_alloc_coherent() is returned in get_addr ops
 - The DMA addresses are stored in an array; the first entry stores the number of allocated pages in lower bits, which are referred to when releasing the pages again
Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Fixes: a8d302a0b770 ("ALSA: memalloc: Revive x86-specific WC page allocations again") Fixes: 9736a325137b ("ALSA: memalloc: Don't fall back for SG-buffer with IOMMU") Link: https://lore.kernel.org/r/87tu256lqs.wl-tiwai@suse.de Link: https://lore.kernel.org/r/20230125153104.5527-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
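The last bullet relies on page-aligned addresses having unused low bits. The following is a minimal sketch of that packing trick under stated assumptions: a 4 KiB page size, 64-bit addresses, and hypothetical helper names (addrs_set_count() and friends). It is not the ALSA memalloc code, and the count it can hold is limited to what fits below the page size.

```c
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12                                    /* assumed 4 KiB pages */
#define DEMO_PAGE_MASK  (((uint64_t)1 << DEMO_PAGE_SHIFT) - 1)

/* Store the page count in the low bits of the first (page-aligned) address. */
static void addrs_set_count(uint64_t *addrs, unsigned int npages)
{
    addrs[0] = (addrs[0] & ~DEMO_PAGE_MASK) |
               ((uint64_t)npages & DEMO_PAGE_MASK);
}

/* Recover the page count, for example when releasing the pages. */
static unsigned int addrs_get_count(const uint64_t *addrs)
{
    return (unsigned int)(addrs[0] & DEMO_PAGE_MASK);
}

/* Return the address at 'idx' with any packed low bits masked off. */
static uint64_t addrs_get_addr(const uint64_t *addrs, unsigned int idx)
{
    return addrs[idx] & ~DEMO_PAGE_MASK;
}
```

Because every reader masks the low bits, the count travels with the array at no extra storage cost, at the price of requiring an accessor for the first entry.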
2023-01-26 | net: dsa: ocelot: build felix.c into a dedicated kernel module | Vladimir Oltean
The build system currently complains: scripts/Makefile.build:252: drivers/net/dsa/ocelot/Makefile: felix.o is added to multiple modules: mscc_felix mscc_seville Since felix.c holds the DSA glue layer, create a mscc_felix_dsa_lib.ko. This is similar to how mscc_ocelot_switch_lib.ko holds a library for configuring the hardware. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Colin Foster <colin.foster@in-advantage.com> Link: https://lore.kernel.org/r/20230125145716.271355-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-26 | Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue | Jakub Kicinski
Tony Nguyen says: ==================== virtchnl: update and refactor Jesse Brandeburg says: The virtchnl.h file is used by i40e/ice physical function (PF) drivers and irdma when talking to the iavf driver. This series cleans up the header file by removing unused elements, adding/cleaning some comments, fixing the data structures so they are explicitly defined, including padding, and finally does a long overdue rename of the IWARP members in the structures to RDMA, since the ice driver and its associated Intel Ethernet E800 series adapters support both RDMA and IWARP. The whole series should result in no functional change, but hopefully clearer code. * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: virtchnl: i40e/iavf: rename iwarp to rdma virtchnl: do structure hardening virtchnl: update header and increase header clarity virtchnl: remove unused structure declaration ==================== Link: https://lore.kernel.org/r/20230125212441.4030014-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-26 | bpf: Fix the kernel crash caused by bpf_setsockopt(). | Kui-Feng Lee
The kernel crash was caused by a BPF program attached to the "lsm_cgroup/socket_sock_rcv_skb" hook, which performed a call to `bpf_setsockopt()` in order to set the TCP_NODELAY flag as an example. Flags like TCP_NODELAY can prompt the kernel to flush a socket's outgoing queue, and this hook "lsm_cgroup/socket_sock_rcv_skb" is frequently triggered by softirqs. The issue was that in certain circumstances, when `tcp_write_xmit()` was called to flush the queue, it would also allow BH (bottom-half) to run. This could lead to our program attempting to flush the same socket recursively, which caused a `skbuff` to be unlinked twice. `security_sock_rcv_skb()` is triggered by `tcp_filter()`. This occurs before the sock ownership is checked in `tcp_v4_rcv()`. Consequently, if a bpf program runs on `security_sock_rcv_skb()` while under softirq conditions, it may not possess the lock needed for `bpf_setsockopt()`, thus presenting an issue. The patch fixes this issue by ensuring that a BPF program attached to the "lsm_cgroup/socket_sock_rcv_skb" hook is not allowed to call `bpf_setsockopt()`. The differences from v1 are - changing commit log to explain holding the lock of the sock, - emphasizing that TCP_NODELAY is not the only flag, and - adding the fixes tag. v1: https://lore.kernel.org/bpf/20230125000244.1109228-1-kuifeng@meta.com/ Signed-off-by: Kui-Feng Lee <kuifeng@meta.com> Fixes: 9113d7e48e91 ("bpf: expose bpf_{g,s}etsockopt to lsm cgroup") Link: https://lore.kernel.org/r/20230127001732.4162630-1-kuifeng@meta.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>