author DJ Delorie <dj@delorie.com> 2017-07-06 13:37:30 -0400
committer DJ Delorie <dj@delorie.com> 2017-07-06 13:37:30 -0400
commit d5c3fafc4307c9b7a4c7d5cb381fcdbfad340bcc
tree 380cfbc329860434d6b29825bd02ba5f0c7d4b30 /manual
parent 3cefdd7310a5d1fad45648d9346e47df9c185fdc
Add per-thread cache to malloc
* config.make.in: Enable experimental malloc option.
* configure.ac: Likewise.
* configure: Regenerate.
* manual/install.texi: Document it.
* INSTALL: Regenerate.
* malloc/Makefile: Likewise.
* malloc/malloc.c: Add per-thread cache (tcache).
(tcache_put): New.
(tcache_get): New.
(tcache_thread_freeres): New.
(tcache_init): New.
(__libc_malloc): Use cached chunks if available.
(__libc_free): Initialize tcache if needed.
(__libc_realloc): Likewise.
(__libc_calloc): Likewise.
(_int_malloc): Prefill tcache when appropriate.
(_int_free): Likewise.
(do_set_tcache_max): New.
(do_set_tcache_count): New.
(do_set_tcache_unsorted_limit): New.
* manual/probes.texi: Document new probes.
* malloc/arena.c: Add new tcache tunables.
* elf/dl-tunables.list: Likewise.
* manual/tunables.texi: Document them.
* NEWS: Mention the per-thread cache.
Diffstat (limited to 'manual')
-rw-r--r-- manual/install.texi 6
-rw-r--r-- manual/probes.texi 19
-rw-r--r-- manual/tunables.texi 32
3 files changed, 57 insertions, 0 deletions
diff --git a/manual/install.texi b/manual/install.texi
index 03eb2dd93b..b8deb9ceba 100644
--- a/manual/install.texi
+++ b/manual/install.texi
@@ -232,6 +232,12 @@ libnss_nisplus are not built at all.
Use this option to enable libnsl with all depending NSS modules and
header files.
+@item --disable-experimental-malloc
+By default, a per-thread cache is enabled in @code{malloc}. While
+this cache can be disabled on a per-application basis using tunables
+(set glibc.malloc.tcache_count to zero), this option can be used to
+remove it from the build completely.
+
@item --build=@var{build-system}
@itemx --host=@var{host-system}
These options are for cross-compiling. If you specify both options and
diff --git a/manual/probes.texi b/manual/probes.texi
index eb91c62703..96acaed206 100644
--- a/manual/probes.texi
+++ b/manual/probes.texi
@@ -231,6 +231,25 @@ dynamic brk/mmap thresholds. Argument @var{$arg1} and @var{$arg2} are
the adjusted mmap and trim thresholds, respectively.
@end deftp
+@deftp Probe memory_tunable_tcache_max_bytes (int @var{$arg1}, int @var{$arg2})
+This probe is triggered when the @code{glibc.malloc.tcache_max}
+tunable is set. Argument @var{$arg1} is the requested value, and
+@var{$arg2} is the previous value of this tunable.
+@end deftp
+
+@deftp Probe memory_tunable_tcache_count (int @var{$arg1}, int @var{$arg2})
+This probe is triggered when the @code{glibc.malloc.tcache_count}
+tunable is set. Argument @var{$arg1} is the requested value, and
+@var{$arg2} is the previous value of this tunable.
+@end deftp
+
+@deftp Probe memory_tunable_tcache_unsorted_limit (int @var{$arg1}, int @var{$arg2})
+This probe is triggered when the
+@code{glibc.malloc.tcache_unsorted_limit} tunable is set. Argument
+@var{$arg1} is the requested value, and @var{$arg2} is the previous
+value of this tunable.
+@end deftp
+
@node Mathematical Function Probes
@section Mathematical Function Probes
diff --git a/manual/tunables.texi b/manual/tunables.texi
index 9331b03702..b16d591b90 100644
--- a/manual/tunables.texi
+++ b/manual/tunables.texi
@@ -193,6 +193,38 @@ systems the limit is twice the number of cores online and on 64-bit systems, it
is 8 times the number of cores online.
@end deftp
+@deftp Tunable glibc.malloc.tcache_max
+The maximum size of a request (in bytes) which may be met via the
+per-thread cache. The default (and maximum) value is 1032 bytes on
+64-bit systems and 516 bytes on 32-bit systems.
+@end deftp
+
+@deftp Tunable glibc.malloc.tcache_count
+The maximum number of chunks of each size to cache. The default is 7.
+There is no upper limit, other than available system memory. If set
+to zero, the per-thread cache is effectively disabled.
+
+The approximate maximum overhead of the per-thread cache is thus equal
+to the number of bins times the chunk count in each bin times the size
+of each chunk. With defaults, the approximate maximum overhead of the
+per-thread cache is approximately 236 KB on 64-bit systems and 118 KB
+on 32-bit systems.
+@end deftp
+
+@deftp Tunable glibc.malloc.tcache_unsorted_limit
+When the user requests memory and the request cannot be met via the
+per-thread cache, the arenas are used to meet the request. At this
+time, additional chunks will be moved from existing arena lists to
+pre-fill the corresponding cache. While copies from the fastbins,
+smallbins, and regular bins are bounded and predictable due to the bin
+sizes, copies from the unsorted bin are not bounded, and incur
+additional time penalties as they need to be sorted as they're
+scanned. To make scanning the unsorted list more predictable and
+bounded, the user may set this tunable to limit the number of chunks
+that are scanned from the unsorted list while searching for chunks to
+pre-fill the per-thread cache with. The default, or when set to zero,
+is no limit.
+
@node Hardware Capability Tunables
@section Hardware Capability Tunables
@cindex hardware capability tunables