(1) load/store of format D (base + disp)
(2) load/store of format X (base + index)
(3) lmw and stmw
For clang-compiled userland (*), their frequencies are roughly,
(1) > (2) >> (3) ~ 0.
Improvement should be minor; we are already trapped in the alignment
fault handler.
(*) clang unconditionally emits unaligned memory access for powerpc.
Undocumented -disable-ppc-unaligned option does not work...
when nvkm_mem_new_host() is called via the in-kernel ioctl method,
we copy the supplied dmamap, use its dm_nsegs value for allocation
of "mem->dma", and assume it remains valid until we're done.
when this path is taken "mem->mem" remains NULL so all the code in
nvkm_mem_dtor() is ignored, and the "mem->dma" is leaked. this is
one leak seen in PR#56826. as "dmamap->dm_nsegs" can become invalid
before the dtor call, store the value in "mem->nseg" for use in the
dtor, and convert the dtor to free "mem->dma" if "mem->dma" is set.
additionally, "mem->pages" should end up being the same value as
"nseg" here, ASSERT() this.
while here properly mark NetBSD specific code in nvkm_mem_new_host().
additionally, destroy the dmamap created in the non-ioctl path of
nvkm_mem_new_host(). this is another leak seen in PR#56826.
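a minimal userland sketch of the first fix, not the driver code: the segment count is copied at creation time, and the destructor frees the "dma" array whenever it is set, even though "mem" stays NULL. "struct mem_model", "mem_model_new()" and "mem_model_dtor()" are hypothetical names used only for illustration, and calloc()/free() stand in for the kernel allocator.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the driver's memory object. */
struct mem_model {
	void	*mem;	/* stays NULL on the in-kernel ioctl path */
	void	**dma;	/* one slot per DMA segment */
	int	nseg;	/* segment count saved at creation time */
	int	pages;	/* expected to equal nseg on this path */
};

static int
mem_model_new(struct mem_model *m, int dm_nsegs, int pages)
{
	m->mem = NULL;
	m->nseg = dm_nsegs;	/* copy now; the map may be invalid by dtor time */
	m->pages = pages;
	assert(m->pages == m->nseg);
	m->dma = calloc((size_t)m->nseg, sizeof(*m->dma));
	return m->dma == NULL ? -1 : 0;
}

static void
mem_model_dtor(struct mem_model *m)
{
	/* Free the array whenever it is set, regardless of m->mem. */
	if (m->dma != NULL) {
		free(m->dma);
		m->dma = NULL;
	}
	m->nseg = 0;
}
```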
with both of these fixes my "kmem-04096" pool does not grow rapidly
while using "mpv -vo gpu". in fact, once i loaded the relevant file
into memory, this pool remains stable after at least one minute of
video playback.
ok riastradh@
the shared structure. On i386, this causes hypervisor_callback to be
entered before cpu_info_primary is fully initialised; in particular, on i386
ci_intrstack is still NULL, which causes a crash when we try to use it.
Work around by recycling the boot's tmp stack for this until cpu_attach()
is called.
Patch from chs@. Comment explaining the story by me. This patch may
not be optimal -- maybe it would be better in pthread__init, or
better for rtld to call _lwp_unpark after _lwp_park in the contended
case -- but we've tested this version and it's annoying to reproduce,
so let's take this version and worry about testing improvements
later.
Since the changes this year to eliminate a host of races and
deadlocks in open, close, revoke, attach, and detach, closing the
last instance of a device special node has the side effect of waiting
for all concurrent I/O operations (read, write, ioctl, strategy, &c.)
on the device to complete.
Unfortunately, while this works for physical devices which revoke
open device nodes in their autoconf detach functions, as invoked by
some hardware interrupt indicating that the device is no longer
present, pseudo-devices like vnd(4) work differently -- or, work by
luck, or don't work any more.
VNDIOCCLR acts kind of like an autoconf detach function in that it
revokes open device nodes, which closes the last instance. But
VNDIOCCLR is itself called via ioctl, which is an I/O operation that
close waits for. So we end up with a deadlock, spec_io_drain waiting
for spec_close lower down in the call stack:
> spec_io_drain() at netbsd:spec_io_drain+0x84
> spec_close() at netbsd:spec_close+0x1c6
> VOP_CLOSE() at netbsd:VOP_CLOSE+0x38
> spec_node_revoke() at netbsd:spec_node_revoke+0x14d
> vcache_reclaim() at netbsd:vcache_reclaim+0x4e7
> vgone() at netbsd:vgone+0xcd
> vrevoke() at netbsd:vrevoke+0xfa
> genfs_revoke() at netbsd:genfs_revoke+0x13
> VOP_REVOKE() at netbsd:VOP_REVOKE+0x35
> vdevgone() at netbsd:vdevgone+0x64
> vnddoclear.part.0() at netbsd:vnddoclear.part.0+0xaa
> vndioctl() at netbsd:vndioctl+0x78c
> bdev_ioctl() at netbsd:bdev_ioctl+0x91
> spec_ioctl() at netbsd:spec_ioctl+0xa5
> VOP_IOCTL() at netbsd:VOP_IOCTL+0x41
> vn_ioctl() at netbsd:vn_ioctl+0xb3
> sys_ioctl() at netbsd:sys_ioctl+0x555
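The cycle in the trace above can be reduced to a small single-threaded model. This is only a sketch under simplifying assumptions: "struct node_model", "drain_would_finish()" and "clear_ioctl_model()" are hypothetical names, not specfs or vnd(4) interfaces, and the model reports the would-be deadlock as an error instead of blocking.

```c
#include <stdbool.h>

/* Hypothetical model of a device node's in-flight I/O accounting. */
struct node_model {
	unsigned iocnt;	/* I/O operations currently in progress */
};

/*
 * Close waits for iocnt to reach zero.  In this single-threaded model
 * any remaining count is held by the caller itself, so a nonzero count
 * means the wait could never finish.
 */
static bool
drain_would_finish(const struct node_model *n)
{
	return n->iocnt == 0;
}

/*
 * Model of a "clear" ioctl.  Returns 0 on success, or -1 where the
 * real code would deadlock because close waits on this very ioctl.
 */
static int
clear_ioctl_model(struct node_model *n, bool revoke_own_node)
{
	int error = 0;

	n->iocnt++;	/* the ioctl is itself an in-flight I/O operation */
	if (revoke_own_node && !drain_would_finish(n))
		error = -1;
	n->iocnt--;
	return error;
}
```

Calling clear_ioctl_model() with revoke_own_node set to false models skipping the revoke on the node the ioctl arrived on.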
In the past, there was a workaround for what was presumably a crash
instead of a deadlock here: don't issue revoke (vdevgone) on the open
character devices for the minor number in use by the ioctl. If you
use, e.g., `vnconfig -u vnd0', and vnconfig(8) picks /dev/rvnd0c or
/dev/rvnd0d, that special case kicks in. But if you use `vnconfig -u
/dev/vnd0d', the ioctl will be issued on the block device instead, so
the special case doesn't kick in, so the operation deadlocks.
It is actually probably safe not to revoke the block device if what
the ioctl caller holds open is that, because specfs(9) forbids more
than one open of a block device, so nothing else can have it open
anyway.
Unclear what the consequences of failing to revoke the character
device are -- but this is what vnd(4) has done all along. cgd(4) and
ccd(4) also don't bother to revoke. We don't have a notion of
`revoke every file descriptor _except_ this one'; only a vnode as a
whole can be revoked, including all references to it.
This is a stop-gap measure to avoid a deadlock we are definitely
hitting for some users. A slightly better measure would be to revoke
the block or character device according to which one is being used,
but that requires a little more work with two different d_ioctl
functions -- and wouldn't address issues with the character device. A
proper solution requires identifying the appropriate protocol for all
of these pseudo-device disk drivers and using it uniformly for them.
Reported on current-users:
https://mail-index.netbsd.org/current-users/2022/05/27/msg042437.html
This was introduced two years ago when the getrandom/getentropy API
question was still open, and removed because the discussion was
ongoing. Now getentropy is more widely adopted and soon to be in
POSIX. So reintroduce the symbol into libc since we'll be keeping it
anyway. Discussion of details of the semantics, as interpreted by
NetBSD, is ongoing, but the symbol needs to get in before the
netbsd-10 branch. The draft POSIX text is
(https://www.opengroup.org/austin/docs/austin_1110.pdf):
SYNOPSIS
#include <unistd.h>
int getentropy(void *buffer, size_t length);
DESCRIPTION
The getentropy() function shall write length bytes of data
starting at the location pointed to by buffer. The output
shall be unpredictable high quality random data, generated by
a cryptographically secure pseudo-random number
generator. The maximum permitted value for the length
argument is given by the {GETENTROPY_MAX} symbolic constant
defined in <limits.h>.
RETURN VALUES
Upon successful completion, getentropy() shall return 0;
otherwise, -1 shall be returned and errno set to indicate the
error.
ERRORS
The getentropy() function shall fail if:
[EINVAL] The value of length is greater than
{GETENTROPY_MAX}.
The getentropy() function may fail if:
[ENOSYS] The system does not provide the necessary
source of entropy.
RATIONALE
The getentropy() function is not a cancellation point.
Minor changes from the previous introduction of getentropy into libc:
- Return EINVAL, not EIO, on buflen > 256.
- Define GETENTROPY_MAX in limits.h.
The declaration of getentropy in unistd.h and definition of
GETENTROPY_MAX in limits.h are currently conditional on
_NETBSD_SOURCE. When the next revision of POSIX is finalized, we can
expose them also under _POSIX_C_SOURCE > 20yymmL as usual -- and this
can be done as a pullup without breaking existing compiled programs.
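Because a single call is limited to GETENTROPY_MAX bytes, callers that need more have to loop. A minimal sketch follows; "fill_entropy()" is a hypothetical helper, not part of libc. The fallback definition of GETENTROPY_MAX and the _DEFAULT_SOURCE define are assumptions, added only for systems whose headers do not expose the constant or the declaration by default.

```c
#define _DEFAULT_SOURCE	/* assumption: exposes getentropy() on some libcs */
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <unistd.h>

#ifndef GETENTROPY_MAX
#define GETENTROPY_MAX 256	/* assumption: fallback if limits.h lacks it */
#endif

/*
 * Hypothetical helper: fill a buffer of any length by issuing requests
 * of at most GETENTROPY_MAX bytes each.  Returns 0 on success, or -1
 * with errno as set by getentropy().
 */
static int
fill_entropy(void *buffer, size_t length)
{
	unsigned char *p = buffer;

	while (length > 0) {
		size_t n = length < GETENTROPY_MAX ? length : GETENTROPY_MAX;

		if (getentropy(p, n) == -1)
			return -1;
		p += n;
		length -= n;
	}
	return 0;
}
```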
Hopefully fixes random hang seen in i386 Xen PV.
The bug has been there ~forever but was masked by the fact that spllower()
used to call event handlers much more often.
Found by afl, starting with the malformed input '/**/f=({;/**/};}' that
no longer crashes. This input led to 'f=({L:;}', which is at least a
syntactically valid prefix of a translation unit, containing a GCC
statement expression with an unused label. The error message for this
unused label assumed that it would always be inside a function
definition.
While here, document incomplete recovery after syntax errors, in
msg_249.c.
Instead of running into an assertion failure, the malformed input
'f=({;};}' now generates:
malformed.c(1): error: syntax error ';' [249]
malformed.c(1): warning: ({ }) is a GCC extension [320]
malformed.c(1): warning: ({ }) is a GCC extension [320]
malformed.c(1): error: cannot recover from previous errors [224]
The two synopsis forms differed in the spelling of 'file ...'.
The options string for getopt does not start with ':', which led to a
duplicate message 'unknown option -- ?' followed by 'Unknown flag ?'.
Be more specific when calling 'lint file.c -u'; the message 'Unknown
argument' was not helpful as it didn't pinpoint that there are two
different phases for parsing options. In the second phase, only the
options '-L' and '-l' are recognized.
In the manual page, mention the difference between the two synopsis
forms as early as possible. The two synopsis forms are very similar and
both have far too many options to see the difference at a glance.
The names of the probes correspond to the names shown in vmstat -m.
This should make it much easier to track down who's allocating memory
when there's a leak, e.g. by getting a histogram of stack traces for
the matching kmem cache pool:
# vmstat -m
Memory resource pool statistics
Name       Size Requests Fail Releases Pgreq Pgrel Npage Hiwat Minpg Maxpg Idle
...
kmem-00128  256    62242    0        0  3891     0  3891  3891     0   inf    0
...
# dtrace -n 'sdt:kmem:*:kmem-00128 { @[probefunc, stack()] = count() }'
^C
When there's no leak, the allocs and frees (probefunc) will be roughly
matched; when there's a leak, the allocs will far outnumber the frees.