which CPU is used as primary, and as a result, secondary CPUs can
require larger number of colors.
In order to solve this problem, update uvmexp.ncolors via
uvm_page_recolor(9) when secondary CPUs are attached, as done for
other ports like x86.
Pointed out by jmcneill@, and discussed on port-arm@:
http://mail-index.netbsd.org/port-arm/2020/07/03/msg006837.html
Tested and OK'd by ryo@.
- My previous change in this file was not correct, kremove does not free
the underlying PA, which caused a very slow leak under memory pressure.
Rework to correctly free the PA.
- Add a second redzone, this time after the stack, to catch several stack
overflows. The main concern is read overflows which leak the heap that
follows the stack.
- UVM_KMF_WAITVA doesn't fail, so remove error check.
- Add KASSERTs.
models with VIPT icache.
Otherwise, alias in virtual address results in inconsistent results,
at least for applications that rewrite text of other process, e.g.,
GDB for arm32.
Also, this hopefully fixes other unexpected failures due to alias.
Confirmed that there's no observable regression in performance;
difference in ``time make -j8'' for GENERIC64 kernel on BCM2837
with and without setting uvmexp.ncolors is within 0.1%.
Thanks to ryo@ for discussion.
the irqack in this case
seems cmdide doesn't honour WDCTL_IDS when executing polled commands,
and there is a race where the irqack call interferes with polled
command and causes a lockup
reported by Martin Husemann
to return the minimum cache line size of the system to userland.
- add CLIDR_EL1 and CTR_EL0 to struct aarch64_sysctl_cpu_id.
On most systems, cache line size is the same for all CPUs, so this mechanism won't be required.
Rather, this is primarily for errata support, which will be committed later.
If CTR_EL0.DIC=1, Icache invalidation is not required.
If CTR_EL0.IDC=1, Dcache clean before Icache invalidation is not required.
CLIDR_EL1.LoC is 0, or CLIDR_EL1.LoUIS and CLIDR_EL1.LoUU are 0, Dcache clean is not required as well.
SEE ALSO ARMARM, "CTR_EL0 Cache Type Register", and "CLIDR_EL1 Cache Level ID Register"
Forgot to consult the AAPCS before committing this before -- oops!
While here, take advantage of the 32 aarch64 simd registers to avoid
all stack spills.
Unlike `.arch_extension crypto', this works with clang; both work
with gas, so we'll go with this.
Clang still can't handle aes_armv8_64.S yet -- it gets confused by
dup and mov on lanes, but this makes progress.
Apparently on arm, .align is actually an alias for .p2align, taking a
power of two rather than a number of bytes, so aes_armv8_64.o was
bloated to 32KB with obscene alignment when it only needed to be
barely past 4KB.
Do the same for the x86 aes_ni_64.S -- even though .align takes a
number of bytes rather than a power of two on x86, let's just stay
away from the temptations of the evil .align directive.
We won't use it on any other systems, and it doesn't build without
NEON anyway. Verified earmv7hf GENERIC, aarch64 GENERIC64, and
earmv6 RPI2 all build with this.