2025-04-30s390/string: Remove strcpy() implementationHeiko Carstens
Remove the optimized strcpy() library implementation. This doesn't make any difference since gcc recognizes all strcpy() usages anyway and uses the builtin variant. There is not a single branch to strcpy() within the generated kernel image, which also seems to be the reason why most other architectures don't have a strcpy() implementation anymore. Reviewed-by: Mikhail Zaslonko <zaslonko@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-04-30s390/con3270: Use strscpy() instead of strcpy()Heiko Carstens
Use strscpy() instead of strcpy() so that bounds checking is performed on the destination buffer. This requires keeping track of the size of the dynamically allocated prompt memory area, which is done with a new prompt_sz member within struct tty3270. Reviewed-by: Mikhail Zaslonko <zaslonko@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-04-30s390/boot: Use strspcy() instead of strcpy()Heiko Carstens
Convert all strcpy() usages to strscpy(). strcpy() is deprecated since it performs no bounds checking on the destination buffer. Reviewed-by: Mikhail Zaslonko <zaslonko@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-04-30s390: Simple strcpy() to strscpy() conversionsHeiko Carstens
Convert all strcpy() usages to strscpy() where the conversion means just replacing strcpy() with strscpy(). strcpy() is deprecated since it performs no bounds checking on the destination buffer. Reviewed-by: Mikhail Zaslonko <zaslonko@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
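The bounds checking these conversions rely on can be illustrated with a minimal userspace sketch. The function name sketch_strscpy() is hypothetical and this is not the kernel implementation; it only mirrors the documented contract of strscpy(): copy at most size - 1 characters, always NUL terminate, and return the number of characters copied or -E2BIG on truncation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical userspace stand-in for strscpy(); an illustration of the
 * contract only, not the kernel code.
 */
static ssize_t sketch_strscpy(char *dst, const char *src, size_t size)
{
	size_t i;

	assert(dst && src);
	if (size == 0)
		return -E2BIG;	/* no room even for the terminating NUL */

	/* Copy until the source ends or only the NUL slot is left. */
	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';		/* destination is always terminated */

	/* If the source has not ended here, the copy was truncated. */
	return src[i] == '\0' ? (ssize_t)i : -E2BIG;
}
```

Unlike strcpy(), a too-long source cannot write past the destination, and the caller can detect the truncation from the return value.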
2025-04-30Merge branch 'zcrypt-no-alloc'Heiko Carstens
Harald Freudenberger says: ==================== This series of patches has the goal to open up a do-not-allocate memory path from the callers of the pkey in-kernel api down to the crypto cards and back. The asynchronous in-kernel cipher implementations (and the s390 PAES cipher implementations are one of them) may be called in a context where memory allocations which trigger IO are not acceptable. So this patch series reworks the AP bus code, the zcrypt layer, the pkey layer and the pkey handlers to respect this situation by processing a new parameter xflags (execution hints flags). There is a flag PKEY_XFLAG_NOMEMALLOC which tells the code to not allocate memory which may lead to IO operations. To reach this goal, the actual code changes vary. The zcrypt misc functions which need memory for cprb build use a pre-allocated memory pool for this purpose. The findcard() functions have one temp memory area preallocated and protected with a mutex. Some smaller data is no longer allocated but placed on the stack instead. The AP bus also uses a pre-allocated memory pool for building AP message requests. Note that the PAES implementation still needs to get reworked to run the protected key derivation in a real asynchronous way. However, this rework of AP bus, zcrypt and pkey is the base work required before reconsidering the PAES implementation. The patch series starts at the bottom (AP bus) and goes up the call chain (PKEY). At any time in the patch stack it should compile. For easier review I tried to have one logical code change per patch and thus keep the patches "small". ==================== Link: https://lore.kernel.org/r/20250424133619.16495-1-freude@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-04-30s390/pkey/crypto: Introduce xflags param for pkey in-kernel APIHarald Freudenberger
Add a new parameter xflags to the in-kernel API function pkey_key2protkey(). Currently there is only one flag supported: * PKEY_XFLAG_NOMEMALLOC: If this flag is given in the xflags parameter, the pkey implementation is not allowed to allocate memory but instead should fall back to use preallocated memory or simply fail with -ENOMEM. This flag is for protected key derive within a cipher or similar which must not allocate memory which would cause io operations - see also the CRYPTO_ALG_ALLOCATES_MEMORY flag in crypto.h. The one and only user of this in-kernel API, the PAES skcipher implementation in paes_s390.c, sets this flag upon a request to derive a protected key from the given raw key material. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Link: https://lore.kernel.org/r/20250424133619.16495-26-freude@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-04-30s390/pkey: Provide and pass xflags within pkey and zcrypt layersHarald Freudenberger
Provide and pass the xflag parameter from pkey ioctls through the pkey handler and further down to the implementations (CCA, EP11, PCKMO and UV). So all the code is now prepared and ready to support xflags ("execution flag"). The pkey layer supports the xflag PKEY_XFLAG_NOMEMALLOC: If this flag is given in the xflags parameter, the pkey implementation is not allowed to allocate memory but instead should fall back to use preallocated memory or simply fail with -ENOMEM. This flag is for protected key derive within a cipher or similar which must not allocate memory which would cause io operations - see also the CRYPTO_ALG_ALLOCATES_MEMORY flag in crypto.h. Within the pkey handlers this flag is then to be translated to appropriate zcrypt xflags before any zcrypt related functions are called. So the PKEY_XFLAG_NOMEMALLOC translates to ZCRYPT_XFLAG_NOMEMALLOC - If this flag is set, no memory allocations which may trigger any IO operations are done. The in-kernel pkey API still does not provide this xflag param. That's intended to come with a separate patch which enables this functionality. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Link: https://lore.kernel.org/r/20250424133619.16495-25-freude@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-04-30s390/uv: Remove uv_get_secret_metadata functionHarald Freudenberger
The uv_get_secret_metadata() in-kernel function was only offered and used by the pkey uv handler. Remove it as there is no user any more. Suggested-by: Steffen Eiden <seiden@linux.ibm.com> Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Steffen Eiden <seiden@linux.ibm.com> Acked-by: Holger Dengler <dengler@linux.ibm.com> Link: https://lore.kernel.org/r/20250424133619.16495-24-freude@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-04-30s390/pkey: Use preallocated memory for retrieve of UV secret metadataHarald Freudenberger
The pkey uv functions may be called in a situation where memory allocations which trigger IO operations are not allowed. An example: decryption of the swap partition with protected key (PAES). The pkey uv code takes care of this by holding one preallocated struct uv_secret_list to be used with the new UV function uv_find_secret(). The older function uv_get_secret_metadata() used before always allocates/frees an ephemeral memory buffer. The preallocated struct is concurrency protected by a mutex. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Reviewed-by: Steffen Eiden <seiden@linux.ibm.com> Link: https://lore.kernel.org/r/20250424133619.16495-23-freude@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-04-30s390/uv: Rename find_secret() to uv_find_secret() and publishHarald Freudenberger
Rename the internal UV function find_secret() to uv_find_secret() and publish it as new UV API in-kernel function. The pkey uv handler may be called in a do-not-allocate memory situation where sleeping is allowed but allocating memory which may cause IO operations is not. For example when an encrypted swap file is used and the encryption is done via UV retrievable secrets with protected keys. The UV API function uv_get_secret_metadata() allocates memory and then calls the find_secret() function. By exposing the find_secret() function as a new UV API function uv_find_secret() it is possible to retrieve UV secret meta data without any memory allocations from the UV when the caller offers space for one struct uv_secret_list. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Steffen Eiden <seiden@linux.ibm.com> Acked-by: Holger Dengler <dengler@linux.ibm.com> Link: https://lore.kernel.org/r/20250424133619.16495-22-freude@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-04-30s390/pkey: Rework EP11 pkey handler to use stack for small memory allocsHarald Freudenberger
There have been some places in the EP11 handler code where relatively small amounts of memory have been allocated and freed at the end of the function. This code has been reworked to use the stack instead. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Link: https://lore.kernel.org/r/20250424133619.16495-21-freude@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-04-30s390/pkey: Rework CCA pkey handler to use stack for small memory allocsHarald Freudenberger
There have been some places in the CCA handler code where relatively small amounts of memory have been allocated and freed at the end of the function. This code has been reworked to use the stack instead. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Link: https://lore.kernel.org/r/20250424133619.16495-20-freude@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-04-30s390/zcrypt: Rework ep11 misc functions to use cprb mempoolHarald Freudenberger
There are two places in the ep11 misc code where a short term memory buffer is needed. Rework this code to use the cprb mempool to satisfy these ephemeral memory requirements. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Link: https://lore.kernel.org/r/20250424133619.16495-19-freude@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-04-30s390/zcrypt: Locate ep11_domain_query_info onto the stack instead of kmallocHarald Freudenberger
Place the relatively small struct ep11_domain_query_info variable on the stack instead of using kmalloc()/kfree(). Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Link: https://lore.kernel.org/r/20250424133619.16495-18-freude@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-04-30s390/zcrypt: Propagate xflags argument with cca_get_info()Harald Freudenberger
Propagate the xflags argument from the cca_get_info() caller down to the lower level functions for proper memory allocation hints. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Link: https://lore.kernel.org/r/20250424133619.16495-17-freude@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-04-30s390/zcrypt: Rework cca misc functions kmallocs to use the cprb mempoolHarald Freudenberger
Rework two places in the zcrypt cca misc code using kmalloc() for ephemeral memory allocation. As there is anyway now a cprb mempool let's use this pool instead to satisfy these short term memory allocations. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Link: https://lore.kernel.org/r/20250424133619.16495-16-freude@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-04-30s390/zcrypt: Rework ep11 findcard() implementation and callersHarald Freudenberger
Rework the memory usage of the ep11 findcard() implementation: - findcard does not allocate memory for the list of apqns any more. - the callers are now responsible to provide an array of apqns to store the matching apqns into. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Link: https://lore.kernel.org/r/20250424133619.16495-15-freude@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-04-30s390/zcrypt: Rework cca findcard() implementation and callersHarald Freudenberger
Rework the memory usage of the cca findcard() implementation: - findcard does not allocate memory for the list of apqns any more. - the callers are now responsible to provide an array of apqns to store the matching apqns into. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Link: https://lore.kernel.org/r/20250424133619.16495-14-freude@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-04-30s390/zcrypt: Remove CCA and EP11 card and domain info cachesHarald Freudenberger
Remove the caching of the CCA and EP11 card and domain info. In nearly all places where the card or domain info is fetched the verify param was enabled and thus the cache was bypassed. The only real place where info from the cache was used was in the sysfs pseudo files in cases where the card/queue was switched to "offline". All other callers insisted on getting fresh info and thus a communication to the card was enforced. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Link: https://lore.kernel.org/r/20250424133619.16495-13-freude@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-04-30s390/zcrypt: Remove unused functions from cca miscHarald Freudenberger
The static function findcard() and the zcrypt cca_findcard() function are both not used any more. Remove this outdated code and an internal function only called by these. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Link: https://lore.kernel.org/r/20250424133619.16495-12-freude@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-04-30s390/zcrypt: Introduce pre-allocated device status array for ep11 miscHarald Freudenberger
Introduce a pre-allocated device status array memory together with a mutex controlling the occupation to be used by the findcard() function. Limit the device status array to max 128 cards and max 128 domains to reduce the size of this pre-allocated memory to 64 KB. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Link: https://lore.kernel.org/r/20250424133619.16495-11-freude@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-04-30s390/zcrypt: Introduce pre-allocated device status array for cca miscHarald Freudenberger
Introduce a pre-allocated device status array memory together with a mutex controlling the occupation to be used by the findcard2() function. Limit the device status array to max 128 cards and max 128 domains to reduce the size of this pre-allocated memory to 64 KB. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Link: https://lore.kernel.org/r/20250424133619.16495-10-freude@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-04-30s390/zcrypt: Rework zcrypt function zcrypt_device_status_mask_extHarald Freudenberger
Rework the existing function zcrypt_device_status_mask_ext(): Add two new parameters to provide upper limits for cards and queues. The existing implementation needed an array of 256 * 256 * 4 = 256 KB which is really huge. The reworked function is more flexible in the sense that the caller can decide the upper limit for cards and domains to be stored into the status array. So for example a caller may decide to only query for cards 0...127 and queues 0...127 and thus only an array of size 128 * 128 * 4 = 64 KB is needed. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Link: https://lore.kernel.org/r/20250424133619.16495-9-freude@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
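The array sizes quoted in this commit message can be checked with a small calculation. The sketch below assumes 4 bytes per status entry, which is what the arithmetic in the message implies; the helper name is hypothetical and not part of the zcrypt code.

```c
#include <stddef.h>

/* Assumption: one device status entry occupies 4 bytes. */
#define SKETCH_ENTRY_SIZE 4u

/*
 * Bytes needed for a status array covering cards 0..max_cards-1 and
 * queues 0..max_queues-1 (hypothetical helper for illustration).
 */
static size_t sketch_status_array_bytes(size_t max_cards, size_t max_queues)
{
	return max_cards * max_queues * SKETCH_ENTRY_SIZE;
}
```

With 256 cards and 256 queues this yields 262144 bytes (256 KB); limiting both to 128 yields 65536 bytes (64 KB), a quarter of the original size.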
2025-04-30s390/zcrypt: Introduce cprb mempool for ep11 misc functionsHarald Freudenberger
Introduce a cprb mempool for the zcrypt ep11 misc functions (zcrypt_ep11misc.*) and do some preparation rework to support a do-not-allocate path through some zcrypt ep11 misc functions. The mempool is controlled by the zcrypt module parameter "mempool_threshold" which shall control the minimal amount of memory items for CCA and EP11. The mempool shall support "mempool_threshold" requests/replies in parallel which means for EP11 to hold a send and receive buffer memory per request. Each of these cprb space items is limited to 8 KB. So by default the mempool consumes 5 * 2 * 8KB = 80KB. If the mempool is depleted when one of the ep11 misc functions is called with the ZCRYPT_XFLAG_NOMEMALLOC xflag set, the function will fail with -ENOMEM and the caller is responsible for taking further actions. This is only part of a rework to support a new xflag ZCRYPT_XFLAG_NOMEMALLOC but not yet complete. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Link: https://lore.kernel.org/r/20250424133619.16495-8-freude@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-04-30s390/zcrypt: Introduce cprb mempool for cca misc functionsHarald Freudenberger
Introduce a new module parameter "zcrypt_mempool_threshold" for the zcrypt module. This parameter controls the minimal amount of mempool items which are pre-allocated for urgent requests/replies and will be used with the support for the new xflag ZCRYPT_XFLAG_NOMEMALLOC. The default value of 5 shall provide enough memory items to support up to 5 requests (and their associated reply) in parallel. The minimum value is 1 and is checked in zcrypt module init(). If the mempool is depleted when one of the cca misc functions is called with the named xflag set, the function will fail with -ENOMEM and the caller is responsible for taking further actions. For CCA each mempool item is 16KB, as a CCA CPRB needs to hold the request and the reply. The pool items only support requests/replies with a limit of about 8KB. So by default the CCA mempool consumes 5 * 16KB = 80KB. This is only part of a rework to support a new xflag ZCRYPT_XFLAG_NOMEMALLOC but not yet complete. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Link: https://lore.kernel.org/r/20250424133619.16495-7-freude@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
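The pool footprint stated for CCA, and the equivalent figure given for EP11 in the companion commit, follows directly from the threshold and the item sizes. A rough sketch of that arithmetic, with hypothetical helper names that do not exist in the zcrypt code:

```c
#include <stddef.h>

#define SKETCH_KB 1024u

/* CCA: one 16 KB item per request holds both request and reply. */
static size_t sketch_cca_pool_bytes(size_t threshold)
{
	return threshold * 16 * SKETCH_KB;
}

/* EP11: a separate 8 KB send and 8 KB receive buffer per request. */
static size_t sketch_ep11_pool_bytes(size_t threshold)
{
	return threshold * 2 * 8 * SKETCH_KB;
}
```

For the default threshold of 5, both helpers evaluate to 80 KB, matching the numbers in the commit messages.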
2025-04-30s390/ap/zcrypt: New xflag parameterHarald Freudenberger
Introduce a new flag parameter for both cprb send functions zcrypt_send_cprb() and zcrypt_send_ep11_cprb(). This new xflags parameter ("execution flags") shall be used to provide execution hints and flags for this crypto request.

There are two flags implemented to be used with these functions:
* ZCRYPT_XFLAG_USERSPACE - indicates to the lower layers that all the ptrs address userspace. So when constructing the ap msg copy_from_user() is to be used. If this flag is NOT set, the ptrs address kernel memory and thus memcpy() is to be used.
* ZCRYPT_XFLAG_NOMEMALLOC - indicates that this task must not allocate memory in a way which may trigger io operations.

For the AP bus and zcrypt message layer this means:
* The ZCRYPT_XFLAG_USERSPACE is mapped to the already existing bool variable "userspace" which is propagated to the zcrypt proto implementations.
* The ZCRYPT_XFLAG_NOMEMALLOC results in setting the AP flag AP_MSG_FLAG_MEMPOOL when the AP msg buffer is initialized.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Link: https://lore.kernel.org/r/20250424133619.16495-6-freude@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-04-30s390/zcrypt: Avoid alloc and copy of ep11 targets if kernelspace cprbHarald Freudenberger
If there is a target list of APQNs given when a CPRB is to be sent via zcrypt_send_ep11_cprb() there is always a kmalloc() done and the targets are copied via z_copy_from_user. As there are callers from kernel space (zcrypt_ep11misc.c) which signal this via the userspace parameter, improve this code to directly use the given target list in case of kernelspace, thus removing the unnecessary memory alloc and mem copy. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Link: https://lore.kernel.org/r/20250424133619.16495-5-freude@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-04-30s390/ap: Introduce ap message buffer poolHarald Freudenberger
There is a need for a do-not-allocate-memory path through the AP bus layer. The pkey layer may be triggered via the in-kernel interface from a protected key crypto algorithm (namely PAES) to convert a secure key into a protected key. This happens in a workqueue context, so sleeping is allowed but memory allocations causing IO operations are not permitted.

To accomplish this, an AP message memory pool with pre-allocated space is established. When ap_init_apmsg() with use_mempool set to true is called, instead of kmalloc() the ap message buffer is allocated from the ap_msg_pool. This pool only holds a limited amount of buffers: ap_msg_pool_min_items with the item size AP_DEFAULT_MAX_MSG_SIZE and exactly one of these items (if available) is returned if ap_init_apmsg() with the use_mempool arg set to true is called. When this pool is exhausted and use_mempool is set true, ap_init_apmsg() returns -ENOMEM without any attempt to allocate memory and the caller has to deal with that.

Default values for this mempool of ap messages are:
* Each buffer is 12KB (that is the default AP bus size and all the urgent messages should fit into this space).
* Minimum items held in the pool is 8. This value is adjustable via module parameter ap.msgpool_min_items.

The zcrypt layer may use this flag to indicate to the ap bus that the processing path for this message should not allocate memory but should use a pre-allocated memory buffer instead. This is to prevent deadlocks with crypto and io for example with encrypted swap volumes.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Link: https://lore.kernel.org/r/20250424133619.16495-4-freude@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
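The behaviour described here, take a buffer from a fixed pool and fail with -ENOMEM when it is exhausted instead of falling back to the allocator, can be sketched in userspace C. All names below are hypothetical and the code is not the AP bus implementation: it uses a static array rather than the kernel mempool API, has no locking, and takes the 8 items of 12 KB from the defaults quoted in the commit message.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_POOL_ITEMS 8		/* default minimum pool items */
#define SKETCH_ITEM_SIZE  (12 * 1024)	/* default AP message size */

/* Pre-allocated buffers plus a busy marker per buffer (single-threaded). */
static unsigned char sketch_pool[SKETCH_POOL_ITEMS][SKETCH_ITEM_SIZE];
static bool sketch_in_use[SKETCH_POOL_ITEMS];

struct sketch_msg {
	void *buf;
	bool from_pool;
};

/* Returns 0 on success or -ENOMEM; never allocates when use_mempool is set. */
static int sketch_init_msg(struct sketch_msg *msg, bool use_mempool)
{
	size_t i;

	msg->buf = NULL;
	msg->from_pool = use_mempool;
	if (!use_mempool) {
		msg->buf = malloc(SKETCH_ITEM_SIZE);
		return msg->buf ? 0 : -ENOMEM;
	}
	for (i = 0; i < SKETCH_POOL_ITEMS; i++) {
		if (!sketch_in_use[i]) {
			sketch_in_use[i] = true;
			msg->buf = sketch_pool[i];
			return 0;
		}
	}
	return -ENOMEM;	/* pool exhausted: the caller has to deal with it */
}

/* Hands a pool buffer back to the pool, or frees a heap buffer. */
static void sketch_release_msg(struct sketch_msg *msg)
{
	if (!msg->buf)
		return;
	if (msg->from_pool) {
		size_t idx = (size_t)((unsigned char *)msg->buf -
				      &sketch_pool[0][0]) / SKETCH_ITEM_SIZE;
		sketch_in_use[idx] = false;
	} else {
		free(msg->buf);
	}
	msg->buf = NULL;
}
```

The point of the design is that the pool path has a bounded, predictable failure mode: a ninth concurrent request fails immediately rather than entering an allocation that could itself wait on IO.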
2025-04-30s390/ap/zcrypt: Rework AP message buffer allocationHarald Freudenberger
Slight rework of the way AP message buffers are allocated. Instead of having multiple places with kmalloc() calls all the AP message buffers are now allocated and freed in exactly one place: ap_init_apmsg() allocates the current AP bus max limit of ap_max_msg_size (defaults to 12KB). The AP message buffer is then freed in ap_release_apmsg(). Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Link: https://lore.kernel.org/r/20250424133619.16495-3-freude@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-04-30s390/ap: Move response_type struct into ap_msg structHarald Freudenberger
Move the very small response_type struct into struct ap_msg. So there is no need to kmalloc this tiny struct with each ap message preparation. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Link: https://lore.kernel.org/r/20250424133619.16495-2-freude@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-04-30s390/cpumf: Adjust number of leading zeroes for z15 attributesThomas Richter
In CPUMF attribute definitions for z15 all CPUMF attributes have configuration values of the form 0x0[0-9a-f]{3} . However 2 defines do not match this scheme, they have two leading zeroes instead of one. Adjust this. No functional change. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-04-17s390: Remove optional third argument of strscpy() if possibleHeiko Carstens
The third argument of strscpy() is optional and can be omitted iff the destination is an array and the maximum size of the copy is the size of destination. Remove the third argument for those cases where this is possible. Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
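How an optional size argument can default to the size of a destination array is sketched below with a variadic macro. This is an illustration only: the names are hypothetical, the dispatch trick is a common C idiom rather than a copy of the kernel macro, and unlike the kernel this sketch does not reject a pointer destination at compile time.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Bounded copy with the strscpy() contract (hypothetical helper). */
static ssize_t sketch_copy(char *dst, const char *src, size_t size)
{
	size_t i;

	assert(dst && src);
	if (size == 0)
		return -E2BIG;
	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';
	return src[i] == '\0' ? (ssize_t)i : -E2BIG;
}

/* Two-argument form: dst must be a real array so sizeof() is its size. */
#define SKETCH_COPY2(dst, src)       sketch_copy((dst), (src), sizeof(dst))
#define SKETCH_COPY3(dst, src, size) sketch_copy((dst), (src), (size))

/* Select the 2- or 3-argument variant by counting the arguments. */
#define SKETCH_PICK(_1, _2, _3, name, ...) name
#define sketch_strscpy(...) \
	SKETCH_PICK(__VA_ARGS__, SKETCH_COPY3, SKETCH_COPY2, unused)(__VA_ARGS__)
```

With `char name[4];`, the call `sketch_strscpy(name, src)` behaves like `sketch_strscpy(name, src, sizeof(name))`, which is the redundancy the commit removes.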
2025-04-17s390/ipl: Rename and change strncpy_skip_quote()Heiko Carstens
Rename strncpy_skip_quote() to strscpy_skip_quote() and change its implementation so that the destination string is always NUL terminated. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
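A minimal sketch of what such a helper might look like, assuming it copies the source while dropping double-quote characters and returns the number of characters written. The function below is hypothetical and not the code from this commit; the property it demonstrates is the one the message names, that the destination is always NUL terminated.

```c
#include <stddef.h>

/*
 * Hypothetical sketch: copy src to dst, skipping '"' characters.
 * At most n - 1 characters are written, followed by a NUL terminator.
 * Returns the number of characters copied (assumed return convention).
 */
static size_t sketch_strscpy_skip_quote(char *dst, const char *src, size_t n)
{
	size_t sx, dx = 0;

	if (n == 0)
		return 0;	/* nothing can be written, not even the NUL */

	for (sx = 0; src[sx] != '\0' && dx < n - 1; sx++) {
		if (src[sx] == '"')
			continue;	/* quotes are dropped from the copy */
		dst[dx++] = src[sx];
	}
	dst[dx] = '\0';	/* terminate even when the copy was truncated */
	return dx;
}
```

A strncpy()-style variant would leave the destination unterminated when the source fills the buffer, which is the hazard the rename signals is gone.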
2025-04-17s390/string: Remove optimized strncpy()Heiko Carstens
There are hardly any strncpy() users left, therefore drop the optimized s390 variant. Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-04-16watchdog: diag288_wdt: Implement module autoloadHeiko Carstens
The s390 specific diag288_wdt watchdog driver makes use of the virtual watchdog timer, which is available in most machine configurations. If executing the diagnose instruction with subcode 0x288 results in an exception the watchdog timer is not available, otherwise it is available. In order to allow module autoload of the diag288_wdt module, move the detection of the virtual watchdog timer to early boot code, and provide its availability as a cpu feature. This makes it possible to use module_cpu_feature_match() to automatically load the module iff the virtual watchdog timer is available. Suggested-by: Marc Hartmayer <mhartmay@linux.ibm.com> Tested-by: Mete Durlu <meted@linux.ibm.com> Acked-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20250410095036.1525057-1-hca@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-04-14s390/boot: Replace strncpy() with strscpy()Vasily Gorbik
Replace the last 2 usages of strncpy() in s390 code with strscpy(). Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-04-14s390/boot: Add sized_strscpy() to enable strscpy() usageVasily Gorbik
Add a simple sized_strscpy() implementation to allow the use of strscpy() in the decompressor. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-04-14s390/mm: Select ARCH_WANT_IRQS_OFF_ACTIVATE_MMHeiko Carstens
Select ARCH_WANT_IRQS_OFF_ACTIVATE_MM so that activate_mm() is called with irqs disabled. This makes it possible to call switch_mm_irqs_off() instead of switch_mm() and saves two local_irq_save() / local_irq_restore() pairs. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-04-14s390/mm: Reimplement lazy ASCE handlingHeiko Carstens
Reduce system call overhead time (round trip time for invoking a non-existent system call) by 25%.

With the removal of set_fs() [1] lazy control register handling was removed in order to keep kernel entry and exit simple. However this made system calls slower.

With the conversion to generic entry [2] and numerous follow up changes which simplified the entry code significantly, adding support for lazy asce handling doesn't add much complexity to the entry code anymore.

In particular this means:

- On kernel entry the primary asce is not modified and contains the user asce
- Kernel accesses which require secondary-space mode (for example futex operations) are surrounded by enable_sacf_uaccess() and disable_sacf_uaccess() calls. enable_sacf_uaccess() sets the primary asce to kernel asce so that the sacf instruction can be used to switch to secondary-space mode. The primary asce is changed back to user asce with disable_sacf_uaccess().

The state of the control register which contains the primary asce is reflected with a new TIF_ASCE_PRIMARY bit. This is required on context switch so that the correct asce is restored for the scheduled in process.

As a result address spaces are now set up like this:

CPU running in               | %cr1 ASCE | %cr7 ASCE | %cr13 ASCE
-----------------------------|-----------|-----------|-----------
user space                   | user      | user      | kernel
kernel (no sacf)             | user      | user      | kernel
kernel (during sacf uaccess) | kernel    | user      | kernel
kernel (kvm guest execution) | guest     | user      | kernel

As a result the cr1 control register content is not changed except for:
- futex system calls
- legacy s390 PCI system calls
- the kvm specific cmpxchg_user_key() uaccess helper

This leads to faster system call execution.

[1] 87d598634521 ("s390/mm: remove set_fs / rework address space handling")
[2] 56e62a737028 ("s390: convert to generic entry")

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2025-04-13Linux 6.15-rc2v6.15-rc2Linus Torvalds
2025-04-13Merge tag 'erofs-for-6.15-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofsLinus Torvalds
Pull erofs fixes from Gao Xiang:
- Properly handle errors when file-backed I/O fails
- Fix compilation issues on ARM platform (arm-linux-gnueabi)
- Fix parsing of encoded extents
- Minor cleanup

* tag 'erofs-for-6.15-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: remove duplicate code
  erofs: fix encoded extents handling
  erofs: add __packed annotation to union(__le16..)
  erofs: set error to bio if file-backed IO fails
2025-04-13Merge tag 'ext4_for_linus-6.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4Linus Torvalds
Pull ext4 fixes from Ted Ts'o: "A few more miscellaneous ext4 bug fixes and cleanups including some syzbot failures and fixing a stale file handle referencing an inode previously used as a regular file, but which has been deleted and reused as an ea_inode would result in ext4 erroneously considering this a case of fs corruption"

* tag 'ext4_for_linus-6.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: fix off-by-one error in do_split
  ext4: make block validity check resistent to sb bh corruption
  ext4: avoid -Wflex-array-member-not-at-end warning
  Documentation: ext4: Add fields to ext4_super_block documentation
  ext4: don't treat fhandle lookup of ea_inode as FS corruption
2025-04-13Merge tag 'fixes-2025-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblockLinus Torvalds
Pull memblock fix from Mike Rapoport: "Fix build of memblock test. Add missing stubs for mutex and free_reserved_area() to memblock tests"

* tag 'fixes-2025-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
  memblock tests: Fix mutex related build error
2025-04-12ext4: fix off-by-one error in do_splitArtem Sadovnikov
Syzkaller detected a use-after-free issue in ext4_insert_dentry that was
caused by out-of-bounds access due to incorrect splitting in do_split.

BUG: KASAN: use-after-free in ext4_insert_dentry+0x36a/0x6d0 fs/ext4/namei.c:2109
Write of size 251 at addr ffff888074572f14 by task syz-executor335/5847

CPU: 0 UID: 0 PID: 5847 Comm: syz-executor335 Not tainted 6.12.0-rc6-syzkaller-00318-ga9cda7c0ffed #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/30/2024
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
 print_address_description mm/kasan/report.c:377 [inline]
 print_report+0x169/0x550 mm/kasan/report.c:488
 kasan_report+0x143/0x180 mm/kasan/report.c:601
 kasan_check_range+0x282/0x290 mm/kasan/generic.c:189
 __asan_memcpy+0x40/0x70 mm/kasan/shadow.c:106
 ext4_insert_dentry+0x36a/0x6d0 fs/ext4/namei.c:2109
 add_dirent_to_buf+0x3d9/0x750 fs/ext4/namei.c:2154
 make_indexed_dir+0xf98/0x1600 fs/ext4/namei.c:2351
 ext4_add_entry+0x222a/0x25d0 fs/ext4/namei.c:2455
 ext4_add_nondir+0x8d/0x290 fs/ext4/namei.c:2796
 ext4_symlink+0x920/0xb50 fs/ext4/namei.c:3431
 vfs_symlink+0x137/0x2e0 fs/namei.c:4615
 do_symlinkat+0x222/0x3a0 fs/namei.c:4641
 __do_sys_symlink fs/namei.c:4662 [inline]
 __se_sys_symlink fs/namei.c:4660 [inline]
 __x64_sys_symlink+0x7a/0x90 fs/namei.c:4660
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
 </TASK>

The following loop is located right above the 'if' statement.

	for (i = count-1; i >= 0; i--) {
		/* is more than half of this entry in 2nd half of the block? */
		if (size + map[i].size/2 > blocksize/2)
			break;
		size += map[i].size;
		move++;
	}

'i' in this case could go down to -1, in which case the sum of active
entries wouldn't exceed half the block size. However, the previous
behaviour would also split in half if the sum was exceeded only at the
very last block, which, with too many long-name files in a single block,
could lead to an out-of-bounds access and the subsequent use-after-free.

Found by Linux Verification Center (linuxtesting.org) with Syzkaller.

Cc: stable@vger.kernel.org
Fixes: 5872331b3d91 ("ext4: fix potential negative array index in do_split()")
Signed-off-by: Artem Sadovnikov <a.sadovnikov@ispras.ru>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20250404082804.2567-3-a.sadovnikov@ispras.ru
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
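The do_split commit above distinguishes between the scan loop running to completion (i == -1) and stopping at the very last step (i == 0). The log does not show the changed line, so the following is only a sketch under stated assumptions: struct map_entry, scan_from_end(), split_old() and split_new() are hypothetical names, and the "i > 0" versus "i >= 0" comparison is an inference from the commit title and message, not a quote from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a directory map entry; only the size matters. */
struct map_entry {
	unsigned int size;
};

/*
 * Run the scan loop quoted in the commit message. Returns the final value
 * of i (-1 if the loop ran to completion) and stores the number of entries
 * selected to move in *move_out.
 */
static int scan_from_end(const struct map_entry *map, int count,
			 unsigned int blocksize, int *move_out)
{
	unsigned int size = 0;
	int move = 0;
	int i;

	assert(map != NULL && count > 0 && move_out != NULL);
	for (i = count - 1; i >= 0; i--) {
		/* is more than half of this entry in 2nd half of the block? */
		if (size + map[i].size / 2 > blocksize / 2)
			break;
		size += map[i].size;
		move++;
	}
	*move_out = move;
	return i;
}

/* ASSUMED old test: a break at i == 0 falls back to splitting by count. */
static int split_old(int i, int count, int move)
{
	return (i > 0) ? count - move : count / 2;
}

/* ASSUMED new test: only a completed loop (i == -1) splits by count. */
static int split_new(int i, int count, int move)
{
	return (i >= 0) ? count - move : count / 2;
}
```

With four entries of sizes 260, 260, 260 and 20 and a block size of 1024, the loop breaks at i == 0 with move == 3, so the two helpers disagree (split index 2 versus 1). With four entries of size 100 the loop completes and both fall back to count/2.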
2025-04-12ext4: make block validity check resistent to sb bh corruptionOjaswin Mujoo
Block validity checks need to be skipped when they are called for journal
blocks, since those blocks are part of the system's protected zone.

Currently, this is done by checking inode->ino against
sbi->s_es->s_journal_inum, which is a direct read from the ext4 sb buffer
head. If someone modifies this underneath us then the s_journal_inum field
might get corrupted. To protect against this, change the check to directly
compare the inode with journal->j_inode.

**Slight change in behavior**: During the journal init path,
check_block_validity etc. might be called for the journal inode when
sbi->s_journal is not set yet. In this case we now proceed with
ext4_inode_block_valid() instead of returning early. Since system zones
have not been set up yet, it is okay to proceed so we can perform basic
checks on the blocks.

Suggested-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Link: https://patch.msgid.link/0c06bc9ebfcd6ccfed84a36e79147bf45ff5adc1.1743142920.git.ojaswin@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-04-12ext4: avoid -Wflex-array-member-not-at-end warningGustavo A. R. Silva
-Wflex-array-member-not-at-end was introduced in GCC-14, and we are
getting ready to enable it, globally.

Use the `DEFINE_RAW_FLEX()` helper for an on-stack definition of a
flexible structure where the size of the flexible-array member is known
at compile-time, and refactor the rest of the code, accordingly.

So, with these changes, fix the following warning:

fs/ext4/mballoc.c:3041:40: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Link: https://patch.msgid.link/Z-SF97N3AxcIMlSi@kspp
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
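The idea behind an on-stack flexible structure can be illustrated in plain userspace C. This is a simplified sketch, not the kernel helper: STACK_FLEX, struct packet and sum_on_stack() are hypothetical names introduced here, and the real DEFINE_RAW_FLEX() definition is not reproduced. The sketch reserves storage for the structure plus a compile-time number of trailing elements, so the flexible structure never has to sit in the middle of a wrapper structure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical structure ending in a flexible array member. */
struct packet {
	size_t len;
	unsigned char data[];	/* flexible array member: must be last */
};

/*
 * Hypothetical userspace analogue of an on-stack flexible-structure helper:
 * a zero-initialized union provides sizeof(type) bytes plus room for
 * `count` trailing elements, and `name` is a typed pointer into it.
 */
#define STACK_FLEX(type, name, member, count)				\
	union {								\
		unsigned char bytes[sizeof(type) +			\
			(count) * sizeof(((type *)0)->member[0])];	\
		type obj;						\
	} name##_storage = { { 0 } };					\
	type *name = &name##_storage.obj

/* Copy up to 8 bytes into an on-stack packet and return their sum. */
static unsigned int sum_on_stack(const unsigned char *src, size_t n)
{
	STACK_FLEX(struct packet, pkt, data, 8);
	unsigned int sum = 0;
	size_t i;

	assert(src != NULL && n <= 8);
	pkt->len = n;
	memcpy(pkt->data, src, n);
	for (i = 0; i < pkt->len; i++)
		sum += pkt->data[i];
	return sum;
}
```

The element count has to be a compile-time constant here, which matches the commit's condition that the size of the flexible-array member is known at compile time.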
2025-04-12Documentation: ext4: Add fields to ext4_super_block documentationTom Vierjahn
Documentation and implementation of the ext4 super block have slightly
diverged: Padding has been removed in order to make room for new fields
that are still missing in the documentation.

Add the new fields s_encryption_level, s_first_error_errorcode,
s_last_error_errorcode to the documentation of the ext4 super block.

Fixes: f542fbe8d5e8 ("ext4 crypto: reserve codepoints used by the ext4 encryption feature")
Fixes: 878520ac45f9 ("ext4: save the error code which triggered an ext4_error() in the superblock")
Signed-off-by: Tom Vierjahn <tom.vierjahn@acm.org>
Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Link: https://patch.msgid.link/20250324221004.5268-1-tom.vierjahn@acm.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-04-12Merge tag 'trace-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-traceLinus Torvalds
Pull tracing fixes from Steven Rostedt:

 - Hide get_vm_area() from MMUless builds

   The function get_vm_area() is not defined when CONFIG_MMU is not
   defined. Hide that function within #ifdef CONFIG_MMU.

 - Fix output of synthetic events when they have dynamic strings

   The print fmt of the synthetic event's format file used to have "%.*s"
   for dynamic size strings even though the user space exported arguments
   had only the __get_str() macro that provided just a nul terminated
   string. This was fixed so that user space could parse this properly.

   But the reason that it had "%.*s" was because internally it provided
   the maximum size of the string as one of the arguments. The fix that
   replaced "%.*s" with "%s" caused the trace output (when the kernel
   reads the event) to write "(efault)" as it would now read the length
   of the string as "%s".

   As the string provided is always nul terminated, there's no reason for
   the internal code to use "%.*s" anyway. Just remove the length
   argument to match the "%s" that is now in the format.

 - Fix the ftrace subops hash logic of the manager ops hash

   The function_graph uses the ftrace subops code. The subops code is a
   way to have a single ftrace_ops registered with ftrace to determine
   what functions will call the ftrace_ops callback. More than one user
   of function graph can register a ftrace_ops with it. The function
   graph infrastructure will then add this ftrace_ops as a subops with
   the main ftrace_ops it registers with ftrace. This is because the
   functions will always call the function graph callback which in turn
   calls the subops ftrace_ops callbacks.

   The main ftrace_ops must add a callback to all the functions that the
   subops want a callback from. When a subops is registered, it will
   update the main ftrace_ops hash to include the functions it wants.
   This is the logic that was broken.

   The ftrace_ops hash has a "filter_hash" and a "notrace_hash" where all
   the functions in the filter_hash but not in the notrace_hash are
   attached by ftrace. The original logic would have the main ftrace_ops
   filter_hash be a union of all the subops filter_hashes and the main
   notrace_hash would be an intersection of all the subops filter hashes.
   But this was incorrect because the notrace hash depends on the
   filter_hash it is associated to and not the union of all
   filter_hashes.

   Instead, when a subops is added, just include all the functions of the
   subops hash that are in its filter_hash but not in its notrace_hash.
   The main subops hash should not use its notrace hash, unless all of
   its subops hashes have an empty filter_hash (which means to attach to
   all functions), and then, and only then, the main ftrace_ops notrace
   hash can be the intersection of all the subops hashes.

   This not only fixes the bug, but also simplifies the code.

 - Add a selftest to better test the subops filtering

   Add a selftest that would catch the bug fixed by the above change.

 - Fix extra newline printed in function tracing with retval

   The function parameter code changed the output logic slightly and
   called print_graph_retval() and also printed a newline. The
   print_graph_retval() also prints a newline which caused blank lines to
   be printed in the function graph tracer when retval was added. This
   caused one of the selftests to fail if retvals were enabled. Instead
   remove the newline output from print_graph_retval() and have the
   callers always print the newline so that it doesn't have to do special
   logic if it calls print_graph_retval() or not.

 - Fix out-of-bound memory access in the runtime verifier

   When rv_is_container_monitor() is called on the last entry on the
   linked list it references the next entry, which is the list head and
   causes an out-of-bound memory access.
* tag 'trace-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  rv: Fix out-of-bound memory access in rv_is_container_monitor()
  ftrace: Do not have print_graph_retval() add a newline
  tracing/selftest: Add test to better test subops filtering of function graph
  ftrace: Fix accounting of subop hashes
  ftrace: Properly merge notrace hashes
  tracing: Do not add length to print format in synthetic events
  tracing: Hide get_vm_area() from MMUless builds
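The synthetic-event item in the tracing pull above turns on how many arguments each conversion consumes: "%s" takes one string pointer, while "%.*s" takes an int precision followed by the string pointer. A minimal userspace sketch with standard snprintf() shows the matched forms; fmt_plain() and fmt_bounded() are hypothetical helper names, and the kernel's own formatting code is not reproduced here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* "%s" consumes ONE argument: a pointer to a NUL-terminated string. */
static int fmt_plain(char *out, size_t outsz, const char *str)
{
	assert(out != NULL && outsz > 0 && str != NULL);
	return snprintf(out, outsz, "%s", str);
}

/*
 * "%.*s" consumes TWO arguments: an int giving the maximum number of
 * characters to print, then the string pointer.
 */
static int fmt_bounded(char *out, size_t outsz, int maxlen, const char *str)
{
	assert(out != NULL && outsz > 0 && str != NULL);
	return snprintf(out, outsz, "%.*s", maxlen, str);
}
```

If the format is changed from "%.*s" to "%s" while the caller still passes the length first, the length lands in the slot where the string pointer is expected. That mismatch is what the pull message describes, and dropping the length argument restores the pairing.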
2025-04-12Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfLinus Torvalds
Pull bpf fixes from Alexei Starovoitov:

 - Followup fixes for resilient spinlock (Kumar Kartikeya Dwivedi):
    - Make res_spin_lock test less verbose, since it was spamming BPF CI
      on failure, and make the check for AA deadlock stronger
    - Fix rebasing mistake and use architecture provided
      res_smp_cond_load_acquire
    - Convert BPF maps (queue_stack and ringbuf) to resilient spinlock to
      address long standing syzbot reports

 - Make sure that classic BPF load instruction from SKF_[NET|LL]_OFF
   offsets works when skb is fragmented (Willem de Bruijn)

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf: Convert ringbuf map to rqspinlock
  bpf: Convert queue_stack map to rqspinlock
  bpf: Use architecture provided res_smp_cond_load_acquire
  selftests/bpf: Make res_spin_lock AA test condition stronger
  selftests/net: test sk_filter support for SKF_NET_OFF on frags
  bpf: support SKF_NET_OFF and SKF_LL_OFF on skb frags
  selftests/bpf: Make res_spin_lock test less verbose
2025-04-12rv: Fix out-of-bound memory access in rv_is_container_monitor()Nam Cao
When rv_is_container_monitor() is called on the last monitor in
rv_monitors_list, KASAN yells:

BUG: KASAN: global-out-of-bounds in rv_is_container_monitor+0x101/0x110
Read of size 8 at addr ffffffff97c7c798 by task setup/221

The buggy address belongs to the variable:
 rv_monitors_list+0x18/0x40

This is because list_next_entry() is called on the last entry in the
list. It wraps around to the first list_head, and the first list_head is
not embedded in struct rv_monitor_def.

Fix it by checking if the monitor is last in the list.

Cc: stable@vger.kernel.org
Cc: Gabriele Monaco <gmonaco@redhat.com>
Fixes: cb85c660fcd4 ("rv: Add option for nested monitors and include sched")
Link: https://lore.kernel.org/e85b5eeb7228bfc23b8d7d4ab5411472c54ae91b.1744355018.git.namcao@linutronix.de
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
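The wrap-around described in the rv commit above is a general property of circular intrusive lists: the node after the last entry is the bare list head, which is not embedded in any entry structure. A minimal userspace sketch, assuming hypothetical names throughout (struct list_node, struct monitor, next_monitor() and the other helpers are illustrations, not the kernel's list API or the actual patch):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list node (hypothetical). */
struct list_node {
	struct list_node *next, *prev;
};

/* Hypothetical entry type with the list node embedded in it. */
struct monitor {
	int id;
	struct list_node list;
};

/* Recover the enclosing monitor from a pointer to its embedded node. */
#define node_to_monitor(ptr) \
	((struct monitor *)((char *)(ptr) - offsetof(struct monitor, list)))

static void list_init(struct list_node *head)
{
	head->next = head->prev = head;
}

static void list_add_tail(struct list_node *head, struct list_node *node)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* True when m is the final entry, i.e. its successor is the list head. */
static bool is_last(const struct list_node *head, const struct monitor *m)
{
	return m->list.next == head;
}

/*
 * Return the monitor after m, or NULL for the last entry. Without the
 * is_last() check, node_to_monitor() would be applied to the head itself
 * and yield a pointer to memory that is not a struct monitor.
 */
static struct monitor *next_monitor(const struct list_node *head,
				    struct monitor *m)
{
	assert(head != NULL && m != NULL);
	if (is_last(head, m))
		return NULL;
	return node_to_monitor(m->list.next);
}
```

Checking for the last entry before converting the next node back to its container is the same shape of guard the commit message describes.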