more work but is enough to load simple LKMs. amd64 is untested.
Locking is provided by the caller. The loader is decoupled from the LKM
framework because kernel modules need not be loaded from the file system: they
could be built into the kernel and referenced via a link set.
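
As a rough illustration of what caller-provided locking means here, the
following userland sketch has the loader routine take no locks of its own and
rely on the caller to serialize access. All names (mod_obj, module_lock,
loader_load, framework_load) are hypothetical and are not taken from this
change; a pthread mutex stands in for whatever kernel lock the caller would
actually use.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a loaded module object; not the real structure. */
struct mod_obj {
	const char *name;
	int loaded;
};

/* Hypothetical lock owned by the caller (the framework), not by the loader. */
static pthread_mutex_t module_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Hypothetical loader entry point.  It takes no locks itself and assumes
 * the caller has already serialized access to *mo.
 */
static int
loader_load(struct mod_obj *mo, const char *name)
{
	assert(mo != NULL && name != NULL);

	if (mo->loaded)
		return -1;	/* already loaded */
	mo->name = name;
	mo->loaded = 1;
	return 0;
}

/* Caller side: take the lock, call the loader, drop the lock. */
static int
framework_load(struct mod_obj *mo, const char *name)
{
	int error;

	pthread_mutex_lock(&module_lock);
	error = loader_load(mo, name);
	pthread_mutex_unlock(&module_lock);
	return error;
}
```

Because the loader does no locking, a different caller (for example, code
walking modules built into the kernel) could use the same routine under its
own locking scheme.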