From c3d466cba1692708a19c6ff829d0386c83a0c6e5 Mon Sep 17 00:00:00 2001 From: Wilco Dijkstra Date: Mon, 12 Feb 2018 10:42:42 +0000 Subject: Remove slow paths from pow Remove the slow paths from pow. Like several other double precision math functions, pow is exactly rounded. This is not required from math functions and causes major overheads as it requires multiple fallbacks using higher precision arithmetic if a result is close to 0.5ULP. Ridiculous slowdowns of up to 100000x have been reported when the highest precision path triggers. All GLIBC math tests pass on AArch64 and x64 (with ULP of pow set to 1). The worst case error is ~0.506ULP. A simple test over a few hundred million values shows pow is 10% faster on average. This fixes BZ #13932. [BZ #13932] * sysdeps/ieee754/dbl-64/uexp.h (err_1): Remove. * benchtests/pow-inputs: Update comment for slow path cases. * manual/probes.texi (slowpow_p10): Delete removed probe. (slowpow_p10): Likewise. * math/Makefile: Remove halfulp.c and slowpow.c. * sysdeps/aarch64/libm-test-ulps: Set ULP of pow to 1. * sysdeps/generic/math_private.h (__exp1): Remove error argument. (__halfulp): Remove. (__slowpow): Remove. * sysdeps/i386/fpu/halfulp.c: Delete file. * sysdeps/i386/fpu/slowpow.c: Likewise. * sysdeps/ia64/fpu/halfulp.c: Likewise. * sysdeps/ia64/fpu/slowpow.c: Likewise. * sysdeps/ieee754/dbl-64/e_exp.c (__exp1): Remove error argument, improve comments and add error analysis. * sysdeps/ieee754/dbl-64/e_pow.c (__ieee754_pow): Add error analysis. (power1): Remove function: (log1): Remove error argument, add error analysis. (my_log2): Remove function. * sysdeps/ieee754/dbl-64/halfulp.c: Delete file. * sysdeps/ieee754/dbl-64/slowpow.c: Likewise. * sysdeps/m68k/m680x0/fpu/halfulp.c: Likewise. * sysdeps/m68k/m680x0/fpu/slowpow.c: Likewise. * sysdeps/powerpc/power4/fpu/Makefile: Remove CPPFLAGS-slowpow.c. * sysdeps/x86_64/fpu/libm-test-ulps: Set ULP of pow to 1. * sysdeps/x86_64/fpu/multiarch/Makefile: Remove slowpow-fma.c, slowpow-fma4.c, halfulp-fma.c, halfulp-fma4.c. * sysdeps/x86_64/fpu/multiarch/e_pow-fma.c (__slowpow): Remove define. * sysdeps/x86_64/fpu/multiarch/e_pow-fma4.c (__slowpow): Likewise. * sysdeps/x86_64/fpu/multiarch/halfulp-fma.c: Delete file. * sysdeps/x86_64/fpu/multiarch/halfulp-fma4.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowpow-fma.c: Likewise. * sysdeps/x86_64/fpu/multiarch/slowpow-fma4.c: Likewise. --- benchtests/pow-inputs | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) (limited to 'benchtests/pow-inputs') diff --git a/benchtests/pow-inputs b/benchtests/pow-inputs index 78f8ac73d5..4a51aacb86 100644 --- a/benchtests/pow-inputs +++ b/benchtests/pow-inputs @@ -302,8 +302,7 @@ 0x1.c004d2256a5b8p402, -0x1.a01df480fdcb7p98 0x1.52b9d41aaa1e9p-589, -0x1.292cb15f1459dp46 -0x1.ea9ca6fa0919ep-279, -0x1.601e44b6a588cp40 -# pow slow path at 240 bits -# Implemented in sysdeps/ieee754/dbl-64/slowpow.c +# old pow slow path at 240 bits ## name: 240bits 0x1.01fcd33493ea3p596, -0x1.724bd4e887783p-14 0x1.032ff59ab34fdp-540, -0x1.61e3632080b87p-24 @@ -405,8 +404,7 @@ 0x1.fae913d4f952ep-809, -0x1.4b649402fce63p-6 0x1.fe6d725408f24p484, -0x1.25f4f6441d2e4p-12 0x1.ff6393f9150ccp-718, 0x1.a0cb50a9bf2f3p-31 -# pow slowest path at 768 bits -# Implemented in sysdeps/ieee754/dbl-64/slowpow.c +# old pow slowest path at 768 bits ## name: 768bits 1.0000000000000020, 1.5 0x1.006777b4b61dep843, -0x1.67e3145491872p-1 -- cgit v1.2.3