virtual memory, and only the latter matters. I'm not exactly sure why, but
it appears that the kernel modules must be placed above the kernel image.
Just make this comment more ambiguous, in case the next passer-by gets
inspired.
relocating them. The text is allocated as RWX, and then mprotected to RW.
There is a bug that prevents us from doing RW->RX on amd64 and perhaps
sparc64. On x86, the pmap waits for the page to fault before granting it
the X permission. But in the trap handler, such a page is considered as
belonging to kernel_map, while it actually belongs to module_map. The
kernel then finds out the page is not present in kernel_map, and panics.
In all cases, module_map is non pageable, so even if the trap were handled
properly, it still wouldn't work.
Therefore, there is a small window in which the segment is RWX. But that's
fine enough, for now.
When mprotecting a page, the kernel updates the uvm protection associated
with the page, and then gives control to the x86 pmap which splits the
procedure in two: if we are restricting the permissions it updates the page
tree right away, and if we are increasing the permissions it just waits for
the page to fault.
In the first case, it forgets to take care of the X permission. Which means
that if we allocate an executable page, it is impossible to remove the X
permission on it, this being true regardless of whether the mprotect call
comes from the kernel or from userland. It is not possible to make sure the
page is non executable either, since the only holder of the permission
information is uvm, and no track is kept at the pmap level of the actual
permissions enforced. In short, the kernel believes the page is non
executable, while the cpu knows it is.
Fix this by properly taking care of the !VM_PROT_EXECUTE case. Since the
bit manipulation is a little tricky we use two vars: bit_rem (remove) and
bit_put.