summaryrefslogtreecommitdiff
path: root/arch/powerpc/kernel/vdso/getrandom.S
AgeCommit message (Collapse)Author
2024-10-16powerpc/vdso: Implement __arch_get_vdso_rng_data()Christophe Leroy
VDSO time functions do not call any other function, so they don't need to save/restore LR. However, retrieving the address of VDSO data page requires using LR hence saving then restoring it, which can be heavy on some CPUs. On the other hand, VDSO functions on powerpc are not standard functions and require a wrapper function to call C VDSO functions. And that wrapper has to save and restore LR in order to call the C VDSO function, so retrieving VDSO data page address in that wrapper doesn't require additional save/restore of LR. For random VDSO functions it is a bit different. Because the function calls __arch_chacha20_blocks_nostack(), it saves and restores LR. Retrieving VDSO data page address can then be done there without additional save/restore of LR. So lets implement __arch_get_vdso_rng_data() and simplify the wrapper. It starts paving the way for the day powerpc will implement a more standard ABI for VDSO functions. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://patch.msgid.link/a1a9bd0df508f1b5c04684b7366940577dfc6262.1727858295.git.christophe.leroy@csgroup.eu
2024-10-16powerpc/vdso: Add a page for non-time dataChristophe Leroy
The page containing VDSO time data is swapped with the one containing TIME namespace data when a process uses a non-root time namespace. For other data like powerpc specific data and RNG data, it means tracking whether time namespace is the root one or not to know which page to use. Simplify the logic behind by moving time data out of first data page so that the first data page which contains everything else always remains the first page. Time data is in the second or third page depending on selected time namespace. While we are playing with get_datapage macro, directly take into account the data offset inside the macro instead of adding that offset afterwards. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://patch.msgid.link/0557d3ec898c1d0ea2fc59fa8757618e524c5d94.1727858295.git.christophe.leroy@csgroup.eu
2024-09-13powerpc/vdso: Wire up getrandom() vDSO implementation on VDSO64Christophe Leroy
Extend getrandom() vDSO implementation to VDSO64. Tested on QEMU on both ppc64_defconfig and ppc64le_defconfig. Results from a Power9 (PowerNV): ~ # ./vdso_test_getrandom bench-single    vdso: 25000000 times in 0.787943615 seconds    libc: 25000000 times in 14.101887252 seconds    syscall: 25000000 times in 14.047475082 seconds Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Madhavan Srinivasan <maddy@linux.ibm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-09-13powerpc/vdso: Wire up getrandom() vDSO implementation on VDSO32Christophe Leroy
To be consistent with other VDSO functions, the function is called __kernel_getrandom() __arch_chacha20_blocks_nostack() fonction is implemented basically with 32 bits operations. It performs 4 QUARTERROUND operations in parallele. There are enough registers to avoid using the stack: On input: r3: output bytes r4: 32-byte key input r5: 8-byte counter input/output r6: number of 64-byte blocks to write to output During operation: stack: pointer to counter (r5) and non-volatile registers (r14-131) r0: counter of blocks (initialised with r6) r4: Value '4' after key has been read, used for indexing r5-r12: key r14-r15: block counter r16-r31: chacha state At the end: r0, r6-r12: Zeroised r5, r14-r31: Restored Performance on powerpc 885 (using kernel selftest): ~# ./vdso_test_getrandom bench-single vdso: 25000000 times in 62.938002291 seconds libc: 25000000 times in 535.581916866 seconds syscall: 25000000 times in 531.525042806 seconds Performance on powerpc 8321 (using kernel selftest): ~# ./vdso_test_getrandom bench-single vdso: 25000000 times in 16.899318858 seconds libc: 25000000 times in 131.050596522 seconds syscall: 25000000 times in 129.794790389 seconds This first patch adds support for VDSO32. As selftests cannot easily be generated only for VDSO32, and because the following patch brings support for VDSO64 anyway, this patch opts out all code in __arch_chacha20_blocks_nostack() so that vdso_test_chacha will not fail to compile and will not crash on PPC64/PPC64LE, allthough the selftest itself will fail. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>