lock, because we are going to trigger a KASSERT. Also hold the lock
longer and take the proc lock for kpsignal(). Maybe we should add
mutex_steal() and mutex_return() for the debugger? Lock correction
suggestion from jmcneill.
sys/stdarg.h and expect compiler to provide proper builtins, defaulting
to the GCC interface. lint still has a special fallback.
Reduce abuse of _BSD_VA_LIST_ by defining __va_list by default and
derive va_list as required by standards.
- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.
- Simplify locking in some pmap(9) modules by removing P->V locking.
- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).
- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.
- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.
Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
rename "UVMHIST" option to enable the uvm histories.
TODO:
- make UVMHIST properly depend upon KERNHIST
- enable dynamic registration of histories. this is mostly just
allocating something in a bitmap, and is only for viewing multiple
histories in a merged form.
tested on amd64 and sparc64.
in db_sym.[ch] as it is used by the elf version of crash(8).
i will be cleaning up the db_sym.c code in a follow up commit to avoid
having dead code compiled.
usage: show proc [/a] [/p] address|pid
/a == argument is an address of any lwp
/p == argument is a pid [default]
From: Vladimir Kirillov proger at wilab dot org dot ua
address of the head, not to the head itself. Not sure if the cast of the
arg to db_value_name() is right, but works on i386 and compiles on sparc64.
Stack traces from ddb work again on i386.
DDB is flakey. The command history wanders past the bounds. Way
past. When it hits some boolean that indicates a.out format symbol
tables are to be used, and here is the pointer to the function, the
call thru the NULL function pointer renders the debug session entirely
unsatisfactory, outcome wise.
1.20).
The problem was reported Ty Sarna and Geoff Wing. Thanks.
XXX: I made #if too messy. split db_access.c into two files for
db_{get,put}_values and db_read_{int,ptr}?
- Avoid atomics in more places.
- Remove the per-descriptor mutex, and just use filedesc_t::fd_lock.
It was only being used to synchronize close, and in any case we needed
to take fd_lock to free the descriptor slot.
- Optimize certain paths for the <NDFDFILE case.
- Sprinkle more comments and assertions.
- Cache more stuff in filedesc_t.
- Fix numerous minor bugs spotted along the way.
- Restructure how the open files array is maintained, for clarity and so
that we can eliminate the membar_consumer() call in fd_getfile(). This is
mostly syntactic sugar; the main functional change is that fd_nfiles now
lives alongside the open file array.
Some measurements with libmicro:
- simple file syscalls are like close() are between 1 to 10% faster.
- some nice improvements, e.g. poll(1000) which is ~50% faster.
There are still about 1600 left, but they have ',' or /* ... */
in the actual variable definitions - which my awk script doesn't handle.
There are also many that need () -> (void).
(The script does handle misordered arguments.)
ddb.onpanic to 1, change it back to 0 in sysctl.conf and make sure
postinstall installs this setting.
This avoids us trying to dump while booting from install CD, but keeps
the default the same once we are far enough through /etc/rc.d. Failing
earlier is unlikely to be recovered by an automatic reboot.
OK: core.
types of changes:
- Add a few new methods to replace stuff like p_find(), CPU_INFO_FOREACH.
- Use db_read_bytes() instead of accessing kernel structures directly,
and similar changes.
- Add ifdef _KERNEL where the above hasn't been done, and an XXX comment.
- reimplement vmem sanity checks with less code duplication.
- reimplement ddb vmem-related commands in a more consistent ways.
remove automatic whatis.
consumption of users external to the project, users who are unlikely to
be kernel hackers with the motivation to debug crashes. In this situation
rebooting and creating a crash dump is more appropriate than interrupting
normal service for an unbounded amount of time, while also leaving the
machine at cryptic db> prompt.
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.
- When registering command tables, make sure the builtin commands are
already registered
- Make the command table entry structure private
- Do not bother to store the number of commands in a table, we can quickly
calc that if needed.
from doc/BRANCHES:
idle lwp, and some changes depending on it.
1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.
Seems to be quite stable. Some work still left to do.
Please note, that syscalls are not yet MP-safe, because
of the file and vnode subsystems.
Reviewed by: <tech-kern>, <ad>
- rather than embedding bufq_state in driver softc,
have a pointer to the former.
- move bufq related functions from kern/subr_disk.c to kern/subr_bufq.c.
- rename method to strategy for consistency.
- move some definitions which don't need to be exposed to the rest of kernel
from sys/bufq.h to sys/bufq_impl.h.
(is it better to move it to kern/ or somewhere?)
- fix some obvious breakage in dev/qbus/ts.c. (not tested)
broken. Inside the kernel, we always have to use the real values of the
st_name fields, and only do the math when the request comes from userland.
No need for ksyms_getval_from{kernel,userland} hack anymore. However, a
different version will be asked for pull-up in -2{,-0}, one that doesn't
break the API, that is.
Fixes PR#29133 from Jens Kessmeier.
kernel message buffer/log. Its off by default and can be switched on in the
kernel configuration on build time, be set as a variable in ddb and be set
using sysctl.
This adds the sysctl value
ddb.tee_msgbuf = 0
by default.
The functionality is especially added and aimed for developers who are not
blessed with a serial console and wish to keep all their ddb output in the
log. Specifying /l as a modifier to some selected commands will also put
the output in the log but not all commands provide one nor has the same
meaning for all commands.
This feature could in the future also be implemented as an ddb command but
that could lead to more bloat allthough maybe easier for non developpers to
use when mailing their backtraces from kernel crashes.
Gone are the old kern_sysctl(), cpu_sysctl(), hw_sysctl(),
vfs_sysctl(), etc, routines, along with sysctl_int() et al. Now all
nodes are registered with the tree, and nodes can be added (or
removed) easily, and I/O to and from the tree is handled generically.
Since the nodes are registered with the tree, the mapping from name to
number (and back again) can now be discovered, instead of having to be
hard coded. Adding new nodes to the tree is likewise much simpler --
the new infrastructure handles almost all the work for simple types,
and just about anything else can be done with a small helper function.
All existing nodes are where they were before (numerically speaking),
so all existing consumers of sysctl information should notice no
difference.
PS - I'm sorry, but there's a distinct lack of documentation at the
moment. I'm working on sysctl(3/8/9) right now, and I promise to
watch out for buses.
symbols, and made it impossible for the kernel to use that value, and
correctly find symbols from LKMs.
o Allow LKM users to use DDB to debug the entry function of a LKM by
loading the symbol table with the temporary name /lkmtemp/ before calling
it, and then renaming it once we know the module name.
Approved by ragge@.
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.
Bump the kernel rev up to 1.6V
to select the right format string (at compile time) when displaying
variables of type db_expr_t.
This fixes a problem where ddb(4) would only display the low 32-bits
of registers for an ILP32 kernel on SH5, even though registers
(and db_expr_t) are always 64-bits wide.
message buffer has not yet been set up, mimicking code from the top of
the sysctl routine for retrieving the message buffer.
(2) Add a /l modifier to the trace command. This makes it print the
backtrace using printf() instead of db_printf(), which has the nice
side-effect of also putting it into the message buffer. A kernel with
ddb in it but disabled (ie, ddb.onpanic set to zero) will print a
backtrace (which ends up in the message buffer) before dumping (or
not, depending on the value of kern.dump_on_panic) and rebooting, but
if ddb is not disabled, the backtrace is not printed, and there's no
way to get it to display a backtrace such that you can retrieve it
after the dump. The backtrace printed by gdb is sometimes a little
different.
(3) Documentation for the above.
breakpoint address before it's used. Currently a no-op on all but sh5.
This is useful on sh5, for example, to mask off the instruction
type encoding in the bottom two address bits, and makes it possible
to do "db> break $rXX" instead of manually munging the address.
done by Artur Grabowski and Thomas Nordin for OpenBSD, which is more
efficient in several ways than the callwheel implementation that it is
replacing. It has been adapted to our pre-existing callout API, and
also provides the slightly more efficient (and much more intuitive)
API (adapted to the callout_*() naming scheme) that the OpenBSD version
provides.
Among other things, this shaves a bunch of cycles off rescheduling-in-
the-future a callout which is already scheduled, which the common case
for TCP timers (notably REXMT and KEEP).
The API has been simplified a bit, as well. The (very confusing to
a good many people) "ACTIVE" state for callouts has gone away. There
is now only "PENDING" (scheduled to fire in the future) and "EXPIRED"
(has fired, and the function called).
Kernel version bump not done; we'll ride the 1.6N bump that happened
with the malloc(9) change.
Add support for cpus where sizeof(register_t) is not necessarily
the same as sizeof(void *). This is the case on SH5 using the
ILP32 ABI. On this cpu db_expr_t is, necessarily, 64-bits.
Unfortunately, in ILP32 mode, ddb will only display the low 32-bits
of any expression, including registers...
Make some variables and functions static when not used outside of a module.
Make variables in headers extern.
Delete the unused db_find_watchpoint() function.
writes to a string rather than outputs to the supplied printer.
This is convenient for disassemblers that are structured to
build a long string and print it later.
Perhaps db_printsym() should be changed to use this interface...
- replace opt_kgdb_machdep.h with opt_kgdb.h
- defparam opt_kgdb.h:
KGDB_DEV KGDB_DEVNAME KGDB_DEVADDR KGDB_DEVRATE KGDB_DEVMODE
- move from opt_ddbparam.h to opt_ddb.h:
DDB_FROMCONSOLE DDB_ONPANIC DDB_HISTORY_SIZE DDB_BREAK_CHAR SYMTAB_SPACE
- replace KGDBDEV with KGDB_DEV
- replace KGDBADDR with KGDB_DEVADDR
- replace KGDBMODE with KGDB_DEVMODE
- replace KGDBRATE with KGDB_DEVRATE
- use `9600' instead of `0x2580' for 9600 baud rate
- use correct quotes for options KGDB_DEVNAME="\"com\""
- use correct quotes for options KGDB_DEV="17*256+0"
- remove unnecessary dependancy on Makefile for kgdb_stub.o
- minor whitespace cleanup
"earliest" firing callout in a bucket. This allows us to skip
the scan up the bucket if no callouts are due in the bucket.
A cheap O(1) hint update is done at callout insertion (if new callout
is earlier than hint) and removal (is bucket empty). A thorough
refresh of the hint is done when the bucket is traversed.
This doesn't matter much on machines with small values of hz
(e.g. i386), but on systems with large values of hz (e.g. Alpha),
it has a definite positive effect.
Also, keep the callwheel stats in evcnts, so that you can view them
with "vmstat -e".
This will allow improvements to the pmaps so that they can more easily defer expensive operations, eg tlb/cache flush, til the last possible moment.
Currently this is a no-op on most platforms, so they should see no difference.
Reviewed by Jason.
in order to load the symbol table. Instead of using the sections
called ".symtab" and ".strtab", use the first SYMTAB section (the
ELF spec says there should currently only be one) and the STRTAB
section that's linked to it. I believe this is more robust, and it
certainly makes life easier for the bootloader.