This avoids unnecessarily copying the canary when doing a realloc from a
small size to a large size. It also avoids trying to copy a non-existent
canary out of a zero-size allocation, which are memory protected.
This avoids making a huge number of getrandom system calls during
initialization. The init CSPRNG is unmapped before initialization
finishes and these are still reseeded from the OS. The purpose of the
independent CSPRNGs is simply to avoid the massive performance hit of
synchronization and there's no harm in doing it this way.
Keeping around the init CSPRNG and reseeding from it would defeat the
purpose of reseeding, and it isn't a measurable performance issue since
it can just be tuned to reseed less often.
The tiny performance cost might as well be accepted now because this
will be needed for Android Q. It's also quite possible that some apps
make use of the features based on this including malloc_info.
The initial implementation was a temporary hack rather than a serious
implementation of random arena selection. It may still make sense to
offer it but it should be implemented via the CSPRNG instead of this
silly hack. It would also make sense to offer dynamic load balancing,
particularly with sched_getcpu().
This results in a much more predictable spread across arenas. This is
one place where randomization probably isn't a great idea because it
makes the benefits of arenas unpredictable in programs not creating a
massive number of threads. The security benefits of randomization for
this are also quite small. It's not certain that randomization is even a
net win for security since it's not random enough and can result in a
more interesting mix of threads in the same arena for an attacker if
they're able to attempt multiple attacks.
This wouldn't be worth using even if it had a whole bunch of heuristics
like ignoring expressions in static_assert, ignoring repeated patterns
like assigning different things to sequential array indexes, etc.