1137 Commits

Author SHA1 Message Date
cegger
ef058ada1e merge. (forgot in previous) 2011-12-07 15:40:15 +00:00
cegger
4d7ec3b06b merge. 2011-12-07 15:04:18 +00:00
cegger
230bc32186 Welcome to Xen 4.1.2 headers.
New interfaces for PV drivers:
- Xen transcendent memory
- USB IO
- SCSI IO

PCI IO improvements:
- PCI MSI support
- PCI Express AER support

New features:
- xen honors flags to be placed into guest kernel available pte bits
  if enabled (for grant table)
- support for 128 vcpus
  (old interface is still present and supports up to 32 vcpus)
- PCI passthrough: new hypercalls to support SR-IOV
- new hypercall for physical cpu hotplugging
- new hypercall for physical page offlining
- fixes to compile with clang
- machine check recovery mechanism
2011-12-07 14:41:15 +00:00
cegger
bae54e0656 build fix: add back <sys/malloc.h>. malloc(9) is still in use. 2011-12-07 13:49:04 +00:00
cegger
e3626448aa merge. The 'conflicts' happened because xen-public was once used for xen2 headers. 2011-12-07 13:24:04 +00:00
cegger
25dd519694 re-import xen3-public to rename this to xen-public 2011-12-07 13:15:44 +00:00
cherry
6b3b571c64 Move to kmem_zalloc() instead of malloc(). 2011-12-07 12:31:51 +00:00
cherry
848746d6bc [merging from cherry-xenmp]
Make MP aware: use mutex(9) instead of spl(9)
2011-12-04 15:15:41 +00:00
bouyer
ad7affb170 hypervisor_unmask_event(): don't check/update evtchn_pending_sel for the
current CPU, but for any CPU which may accept this event.
xen/xenevt.c: more use of atomic ops and locks where appropriate, and some
  other SMP fixes. Handle all events on the primary CPU (may be revisited
  later). Set/clear ci_evtmask[] for watched events.

This should fix the problems on dom0 kernels reported by jym@
2011-12-03 22:41:40 +00:00
bouyer
4d61ee8d61 xbdback_disconnect() can be called twice, from XenbusStateClosing then from
xbdback_xenbus_destroy(). The second call will wait forever as the first
already caused the xbd thread to exit.
Have xbdback_disconnect() check if we're already disconnected and if so,
do nothing.
2011-12-03 22:36:28 +00:00
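The "already disconnected" guard described in the commit above can be sketched as follows. This is a minimal userland illustration, not the driver code: `struct backend`, its `connected` flag, `teardown_count`, and `backend_disconnect()` are hypothetical names introduced here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a backend instance; not the real xbdback state. */
struct backend {
	bool connected;		/* true while the worker thread is alive */
	int teardown_count;	/* how many times the teardown actually ran */
};

/*
 * Idempotent disconnect: the first call tears the instance down. Any
 * later call sees that the instance is already disconnected and returns
 * at once instead of waiting for a thread that has already exited.
 */
void
backend_disconnect(struct backend *b)
{
	if (!b->connected)
		return;		/* already disconnected: do nothing */
	b->connected = false;
	b->teardown_count++;
}
```

Calling `backend_disconnect()` twice on the same instance runs the teardown only once.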
joerg
f5c3f346ee Don't use variables as format string. 2011-11-24 18:34:56 +00:00
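A short sketch of the pattern behind "Don't use variables as format string", assuming a helper that copies a device name into a buffer. `format_name()` is a hypothetical function, not one taken from the tree.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy an externally supplied name into buf. The name is passed as an
 * argument to a constant "%s" format, so any '%' characters it contains
 * are copied literally instead of being interpreted as conversions.
 * Returns the value of snprintf(): the length the full string needs.
 */
int
format_name(char *buf, size_t len, const char *name)
{
	/* Unsafe alternative (do not use): snprintf(buf, len, name); */
	return snprintf(buf, len, "%s", name);
}
```

With the constant format, a name such as `wd0%s%n` ends up in the buffer unchanged.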
jym
54f95b1441 Deep rework of the xbdback(4) driver; it now uses a thread per instance
instead of continuations directly from shm callbacks or interrupt
handlers. The whole CPS design remains but is adapted to cope with
a thread model.

This patch allows scheduling away I/O requests of domains that behave
abnormally, or even destroy them if there is a need to (without thrashing
dom0 with lots of error messages at IPL_BIO).

I took this opportunity to make the driver MPSAFE, so multiple instances
can run concurrently. Moved from home-grown pool(9) queues to
pool_cache(9), and rework the callback mechanism so that it delegates
I/O processing to thread instead of handling it itself through the
continuation trampoline.

This one fixes the potential DoS many have seen in a dom0 when trying to
suspend a NetBSD domU with a corrupted I/O ring.

Benchmarks (build.sh release runs and bonnie++) do not show any
performance regression, the "new" driver is on-par with the "old" one.

ok bouyer@.
2011-11-24 01:47:18 +00:00
jym
1eaed4e6e6 Move Xen-specific functions to Xen pmap. Requested by cherry@.
Un'ifdef XEN in xen_pmap.c, it is always defined there.
2011-11-23 00:56:56 +00:00
jym
6bfeabc65a Expose pmap_pdp_cache publicly to x86/xen pmap. Provide suspend/resume
callbacks for Xen pmap.

Turn static internal callbacks of pmap_pdp_cache.

XXX the implementation of pool_cache_invalidate(9) is still wrong, and
IMHO this needs fixing before -6. See
http://mail-index.netbsd.org/tech-kern/2011/11/18/msg011924.html
2011-11-20 19:41:27 +00:00
tls
3afd44cf08 First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>.  This change includes
the following:

	An initial cleanup and minor reorganization of the entropy pool
	code in sys/dev/rnd.c and sys/dev/rndpool.c.  Several bugs are
	fixed.  Some effort is made to accumulate entropy more quickly at
	boot time.

	A generic interface, "rndsink", is added, for stream generators to
	request that they be re-keyed with good quality entropy from the pool
	as soon as it is available.

	The arc4random()/arc4randbytes() implementation in libkern is
	adjusted to use the rndsink interface for rekeying, which helps
	address the problem of low-quality keys at boot time.

	An implementation of the FIPS 140-2 statistical tests for random
	number generator quality is provided (libkern/rngtest.c).  This
	is based on Greg Rose's implementation from Qualcomm.

	A new random stream generator, nist_ctr_drbg, is provided.  It is
	based on an implementation of the NIST SP800-90 CTR_DRBG by
	Henric Jungheim.  This generator uses AES in a modified counter
	mode to generate a backtracking-resistant random stream.

	An abstraction layer, "cprng", is provided for in-kernel consumers
	of randomness.  The arc4random/arc4randbytes API is deprecated for
	in-kernel use.  It is replaced by "cprng_strong".  The current
	cprng_fast implementation wraps the existing arc4random
	implementation.  The current cprng_strong implementation wraps the
	new CTR_DRBG implementation.  Both interfaces are rekeyed from
	the entropy pool automatically at intervals justifiable from best
	current cryptographic practice.

	In some quick tests, cprng_fast() is about the same speed as
	the old arc4randbytes(), and cprng_strong() is about 20% faster
	than rnd_extract_data().  Performance is expected to improve.

	The AES code in src/crypto/rijndael is no longer an optional
	kernel component, as it is required by cprng_strong, which is
	not an optional kernel component.

	The entropy pool output is subjected to the rngtest tests at
	startup time; if it fails, the system will reboot.  There is
	approximately a 3/10000 chance of a false positive from these
	tests.  Entropy pool _input_ from hardware random numbers is
	subjected to the rngtest tests at attach time, as well as the
	FIPS continuous-output test, to detect bad or stuck hardware
	RNGs; if any are detected, they are detached, but the system
	continues to run.

	A problem with rndctl(8) is fixed -- datastructures with
	pointers in arrays are no longer passed to userspace (this
	was not a security problem, but rather a major issue for
	compat32).  A new kernel will require a new rndctl.

	The sysctl kern.arandom() and kern.urandom() nodes are hooked
	up to the new generators, but the /dev/*random pseudodevices
	are not, yet.

	Manual pages for the new kernel interfaces are forthcoming.
2011-11-19 22:51:18 +00:00
cherry
de4e5fae37 [merging from cherry-xenmp] bring in bouyer@'s changes via:
http://mail-index.netbsd.org/source-changes/2011/10/22/msg028271.html
From the Log:
Log Message:
Various interrupt fixes, mainly:
keep a per-cpu mask of enabled events, and use it to get pending events.
A cpu-specific event (all of them at this time) should not be ever masked
by another CPU, because it may prevent the target CPU from seeing it
(the clock events all fire at once, for example).
2011-11-19 17:13:39 +00:00
jmcneill
37ffe0c4a8 remove Xbox support 2011-11-18 22:18:07 +00:00
cherry
92f0f13b6c [merging from cherry-xenmp]
- Make clock MP aware.
 - Bring in fixes that bouyer@ brought in via:
   cvs rdiff -u -r1.54.6.4 -r1.54.6.5 src/sys/arch/xen/xen/clock.c

Thanks to riz@ for testing on dom0
2011-11-18 06:01:50 +00:00
christos
ac2d876c25 Use getdiskinfo() to print the name of the device; the previous code
constructed the wrong name if it was a wedge.
2011-11-14 21:34:50 +00:00
hannken
14767c9d30 Bring back sys/disklabel.h for DISKUNIT and DISKPART. 2011-11-14 16:04:29 +00:00
christos
96b6da2490 use getdiskinfo() 2011-11-13 23:02:06 +00:00
cherry
3520926365 Expose the PG_k #define pt/pd bit to both xen and "baremetal" x86. This is required, since kernel pages are mapped with user permissions in XEN/amd64 since the VM kernel runs in ring3. Since XEN/i386 (including PAE) runs in ring1, supervisor mode is appropriate for these ports. We need to share this since the pmap implementation is still shared. Once the xen implementation is sufficiently independent of the x86 one, this can be made private to xen/include/xenpmap.h 2011-11-08 17:16:52 +00:00
cherry
926a93384f Add an ipi callback to force hypervisor callback. this is useful to "re-route" interrupts to a given vcpu 2011-11-07 15:51:31 +00:00
cherry
c9745c1f66 [merging from cherry-xenmp] make pmap_kernel() shadow PMD per-cpu and MP aware. 2011-11-06 15:18:18 +00:00
cherry
396b8b4abf [merging from cherry-xenmp] Make the xen MMU op queue locking api private. Implement per-cpu queues. 2011-11-06 11:40:46 +00:00
bouyer
54c647c0b1 Fix bogus KASSERT: if there is a xbdi_io, xbdi_pendingreqs must *NOT* be 0.
Not sure why it has stayed unnoticed for so long ...
2011-10-25 17:25:47 +00:00
jym
cf4b804efe Move disconnection code to a separate function, similar to what is done
with xbdback_connect.
2011-10-24 18:13:50 +00:00
jruoho
e23dd3f620 Remove code that is commented out and out-of-sync with x86. If Xen needs to
use cpu_resume(), cpu_suspend(), or cpu_shutdown() in the future, it is
better to expose these from x86 rather than duplicate code.
2011-10-20 13:21:11 +00:00
jym
2c4b0fd95e Move Xen specific functions out of x86 native pmap to xen_pmap.c.
Provide a wrapper to trigger pmap pool_cache(9) invalidations without
exposing the caches to outside world.
2011-10-18 23:43:06 +00:00
mrg
8f93e1bd21 remove a check against uvmexp.ncolors that is done inside uvm_page_recolor()
already anyway.
2011-10-06 06:56:29 +00:00
jruoho
7feffa2641 Call cpufreq_suspend(9) and cpufreq_resume(9) during suspend/resume. 2011-09-28 15:38:21 +00:00
jym
325494fe33 Modify *ASSERTMSG() so they are now used as variadic macros. The main goal
is to provide routines that do as KASSERT(9) says: append a message
to the panic format string when the assertion triggers, with optional
arguments.

Fix call sites to reflect the new definition.

Discussed on tech-kern@. See
http://mail-index.netbsd.org/tech-kern/2011/09/07/msg011427.html
2011-09-27 01:02:33 +00:00
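A variadic assertion macro of the kind described in the commit above might look like the sketch below. It is a userland illustration only: `MY_ASSERTMSG`, `my_assert_fail()`, `assert_msg`, and `assert_failed` are hypothetical names, and the message is stored in a buffer where a kernel would call panic(9).

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Receives the formatted message; a real kernel would panic instead. */
char assert_msg[128];
int assert_failed;

/* Record the failed expression, then append the caller's message. */
void
my_assert_fail(const char *expr, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert_failed = 1;
	n = snprintf(assert_msg, sizeof(assert_msg),
	    "assertion \"%s\" failed: ", expr);
	if (n < 0 || (size_t)n >= sizeof(assert_msg))
		return;		/* no room left for the message */
	va_start(ap, fmt);
	vsnprintf(assert_msg + n, sizeof(assert_msg) - (size_t)n, fmt, ap);
	va_end(ap);
}

/*
 * C99 variadic macro: everything after cond (a format string plus
 * optional arguments) is forwarded unchanged to my_assert_fail().
 */
#define MY_ASSERTMSG(cond, ...)						\
	do {								\
		if (!(cond))						\
			my_assert_fail(#cond, __VA_ARGS__);		\
	} while (0)
```

A call such as `MY_ASSERTMSG(refcnt > 0, "refcnt=%d", refcnt)` then reports both the expression that failed and the offending value.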
jym
bfc65ee0bf Fix a fallout with my xensuspend merge: talk_to_backend() returns a
boolean, so checking for "true" with "== 0" is... wrong.

Now xennet(4) should work as expected, and not stay in the InitWait state
(which blocks network communication with the backend).

Thanks to riz@ and sborrill@ for reporting breakage with -current
xennet(4) after my merge.
2011-09-26 21:44:09 +00:00
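The bug class fixed above (testing a boolean result with "== 0") can be illustrated with a small sketch. `talk_ok()`, `connected_buggy()`, and `connected_fixed()` are hypothetical names; the real talk_to_backend() takes different arguments.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: returns true when the backend handshake succeeded. */
bool
talk_ok(int backend_ready)
{
	return backend_ready != 0;
}

/* Buggy pattern: "== 0" is true exactly when the boolean call FAILED. */
bool
connected_buggy(int backend_ready)
{
	return talk_ok(backend_ready) == 0;
}

/* Fixed pattern: test the boolean result directly. */
bool
connected_fixed(int backend_ready)
{
	return talk_ok(backend_ready);
}
```

The "== 0" idiom is right for functions that return 0 on success and an errno value on failure, which is likely how it slipped in here.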
jym
afeabb041e Expose Xen kernfs entries inside a domU. Patch originally from sborrill@,
slightly modified by me to profit from runtime checks for dom0 privileges
instead of using compile time macros (DOM0OPS).

It should now be possible to use pkgsrc's sysutils/xentools inside
a domU to query XenStore entries (or even modify part of it if the domain
has enough rights).
2011-09-22 23:02:34 +00:00
cegger
790b04f998 Initialize mutex before use. Lets me boot a dom0 kernel again
without a lockdebug panic.
2011-09-21 15:26:47 +00:00
jym
eba16022d3 Merge jym-xensuspend branch in -current. ok bouyer@.
Goal: save/restore support in NetBSD domUs, for i386, i386 PAE and amd64.

Executive summary:
- split all Xen drivers (xenbus(4), grant tables, xbd(4), xennet(4))
in two parts: suspend and resume, and hook them to pmf(9).
- modify pmap so that Xen hypervisor does not cry out loud in case
it finds "unexpected" recursive memory mappings
- provide a sysctl(7), machdep.xen.suspend, to command suspend from
userland via powerd(8). Note: a suspend can only be handled correctly
when dom0 requested it, so provide a mechanism that prevents the
kernel from blindly validating user commands

The code is still in experimental state, use at your own risk: restore
can corrupt backend communications rings; this can completely thrash
dom0 as it will loop at a high interrupt level trying to honor
all domU requests.

XXX PAE suspend does not work in amd64 currently, due to (yet again!)
page validation issues with hypervisor. Will fix.

XXX secondary CPUs are not suspended, I will write the handlers
in sync with cherry's Xen MP work.

Tested under i386 and amd64, bear in mind ring corruption though.

No build break expected, GENERICs and XEN* kernels should be fine.
./build.sh distribution still running. In any case: sorry if it does
break for you, contact me directly for reports.
2011-09-20 00:12:23 +00:00
dyoung
78b0e18345 Report vmem(9) errors out-of-band so that we can use vmem(9) to manage
ranges that include the least and the greatest vmem_addr_t.  Update
vmem(9) uses throughout the kernel.  Slightly expand on the tests in
subr_vmem.c, which still pass.  I've been running a kernel with this
patch without any trouble.
2011-09-02 22:25:08 +00:00
christos
05ec717ee7 Add bus_dma overrides. From dyoung 2011-09-01 15:10:31 +00:00
jym
cb1f14140c VIRQ_TIMER virqs are allocated and tracked in an array
(virq_timer_to_evtch, indexed by cpuid) different from the
VIRQ <> event channel one (virq_to_evtch, indexed by event channel ID).

This is fine: fix a "harmless" bug that resulted in the event
channel of VIRQ_TIMER getting lost during bind as it was not stored
in the proper array.

"Harmless" because it is not critical for -current, however in the Xen
save/restore branch this completely cripples restore. Xen clock gets
suspended, but never comes back (fetched channel ID being invalid). Oops.

Add a small comment so we can better see the "get => allocate? => set"
chain of actions when binding/unbinding event channels.
2011-08-28 22:55:52 +00:00
jym
4128291e47 KNF, white spaces and comment typo fixes. 2011-08-28 22:36:17 +00:00
christos
93e326680f use c99 struct initializers 2011-08-27 09:32:11 +00:00
jym
b1c4de01e1 Protect xbdback(4) ring indexes from overflowing; leave the continuation
prematurely in case they do, to avoid looping "endlessly" (or at least
a very long time) at IPL_BIO while trying to handle requests.

This should not happen in a nominal scenario, but the ring can get
corrupted for whatever reason (memory errors, domU failures or
exploitation).
2011-08-24 20:49:34 +00:00
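One way to guard ring indexes against the corruption described above is to validate the distance between producer and consumer before looping. The sketch below assumes free-running 32-bit indexes and a hypothetical `RING_SIZE`; `ring_pending()` is an illustrative name, and the actual check in the driver may differ.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 32u	/* hypothetical number of slots in the ring */

/*
 * Return how many requests may safely be consumed, or -1 if the
 * producer index is implausible. The indexes are free-running unsigned
 * counters, so the subtraction stays correct across wraparound.
 */
int
ring_pending(uint32_t req_prod, uint32_t req_cons)
{
	uint32_t pending = req_prod - req_cons;

	if (pending > RING_SIZE)
		return -1;	/* corrupted ring: bail out, do not loop */
	return (int)pending;
}
```

A producer index that claims more outstanding requests than the ring has slots is rejected, so the consumer never spins over billions of bogus entries.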
jym
e298d4d6a8 Merge err printf with the panic(9) message.
Also fix the if () {...} statement with braces, to avoid calling panic()
every time. Hi cherry!
2011-08-21 10:00:13 +00:00
joerg
a99d375170 Works with clang's integrated assembler now. 2011-08-17 21:42:16 +00:00
dholland
595e2ecd73 Fix broken build. 2011-08-16 02:59:16 +00:00
cherry
ce14cd73f0 invert buggy ci_flag test 2011-08-15 20:17:12 +00:00
cherry
c04a001592 Do not panic() on xen_send_ipi() sent to a cpu not yet running.
x86 MP boot depends on this strange behaviour.
2011-08-15 20:14:52 +00:00
cherry
a6365cd724 Call the right function
(fix for an egregious error)
2011-08-13 20:24:19 +00:00
cherry
ac71905311 Use spin mutices correctly.
- Prune redundant splxx()/splx() pairs.
 - Do not "leak" a mutex_spin_enter() via conditional return.

Thanks rmind@
2011-08-13 17:23:42 +00:00
cherry
e37867bdb5 Remove spurious header.
Thanks rmind@
2011-08-13 16:22:15 +00:00
cherry
7bd1f7e3fe MP probing and startup code 2011-08-13 12:37:30 +00:00
cherry
92ccc4ea78 Add locking around ops to the hypervisor MMU "queue". 2011-08-13 12:09:38 +00:00
cherry
787e5fb097 remove unnecessary locking overhead for UP 2011-08-13 11:41:57 +00:00
cherry
3c3a6a3a8e Hide the MD details of specific IPIs behind semantically pleasing functions. This cleans up a couple of #ifdef XEN/#endif pairs 2011-08-11 18:11:17 +00:00
cherry
3ccb0add41 Make event/interrupt handling MP aware 2011-08-11 17:58:59 +00:00
cherry
155048478e refactor the bitstring/mask operations to be behind an API. Make pending interrupt marking cpu aware. 2011-08-10 21:46:02 +00:00
cherry
941e03e900 KNF police (rmind@ :-) 2011-08-10 20:38:45 +00:00
cherry
d7b11fa417 xen ipi infrastructure 2011-08-10 11:39:44 +00:00
cherry
1f0a8a809d Introduce locking primitives for Xen pte operations, and xen helper calls for MP related MMU ops 2011-08-10 09:50:37 +00:00
cherry
8d4cb7a73d Add Xen specific ipi bitmasks 2011-08-10 06:29:23 +00:00
bouyer
0eab8d73aa Guard against spurious xbdback_backend_changed() calls which would result
in the block device being opened twice. Fixes port-xen/45158,
although the underlying cause (multiple open of the same device not
properly handled any more) is not fixed.
2011-08-07 17:39:34 +00:00
bouyer
d6cda51db2 Add a comment explaining why a flush workqueue is handled differently from
read/write workqueue requests.
2011-08-07 17:15:40 +00:00
bouyer
48ed379bb7 Several fixes to the continuation engine:
- make sure to enter the continuation loop at splbio(), and add some
  KASSERT() for this.
- When a flush operation is enqueued to the workqueue, make sure the
  continuation loop can't be restarted by a previous workqueue
  completion or an event. We can't restart it at this point because
  the flush event is still recorded as the current I/O.
  For this add a xbdback_co_cache_doflush_wait() which acts as a noop;
  the workqueue callback will restart the loop once the flush is complete.
Should fix "kernel diagnostic assertion xbd_io->xio_mapped == 0" panics
reported by Jeff Rizzo on port-xen@.
2011-08-07 17:10:35 +00:00
bouyer
0ef67b2899 Make sure to call xbdback_trampoline() at splbio() 2011-08-04 18:01:49 +00:00
jym
5b7c93647d Fix typo in comment. 2011-07-31 18:00:54 +00:00
jym
a1508a4756 Move xen.balloon to machdep in the sysctl(7) tree. It does not really
belong to either kern or hw.

Rename machdep.xen_timepush_ticks to xen.timepush_ticks, so it can live
under the same tree as the balloon node, machdep.xen.

ok bouyer@.
2011-07-29 22:16:05 +00:00
matt
6ed2595d7e Change a cast to appease gcc4.5 2011-07-27 23:11:23 +00:00
matt
1df400bdea Make this use offsetof and __typeof__ to appease gcc4.5 2011-07-27 23:10:40 +00:00
jym
7e80d41a91 And... explain xbd(4). 2011-07-25 00:06:49 +00:00
jym
6214043369 KNF. No functional change. 2011-07-25 00:02:38 +00:00
jym
77822551fa Add more comments to xbdback(4) code. These make the continuations a bit
easier to follow (and understand). Helped tracking down a regression
between save/restore xbdback(4) states.

A few minor fixes, which are merely cosmetic:
- call graph is (somewhat) more readable
- rework the xbdback_do_io routine with a switch statement, so as to
trigger a panic() in case an invalid operation passed through the sanity
checks. panic might be overkill here, but I am sure to catch errors in
case it happens.
2011-07-24 23:56:34 +00:00
joerg
3eb244d801 Retire varargs.h support. Move machine/stdarg.h logic into MI
sys/stdarg.h and expect compiler to provide proper builtins, defaulting
to the GCC interface. lint still has a special fallback.
Reduce abuse of _BSD_VA_LIST_ by defining __va_list by default and
derive va_list as required by standards.
2011-07-17 20:54:30 +00:00
rmind
3127d2afb9 Initialise cpus_running to 1 on Xen, as it was done on x86.
Problem analysed by hannken@.  Fixes PR/45062.
2011-07-16 14:46:18 +00:00
jym
6d90e19e1d Remove all return error checks for event_set_handler(...). It either
succeeds or ends in a panic.
2011-07-02 19:07:56 +00:00
dyoung
e265f67bc1 #include <sys/bus.h> instead of <machine/bus.h>. 2011-07-01 18:31:32 +00:00
wiz
4cbd24b23f dependant -> dependent 2011-06-30 20:09:15 +00:00
rmind
06b5aba5f8 Few XEN fixes:
- cpu_load_pmap: perform tlbflush() after xen_set_user_pgd().
- xen_pmap_bootstrap: perform xpq_queue_tlb_flush() in the end.
- pmap_tlb_shootdown: do not check PG_G for Xen.
2011-06-15 20:50:02 +00:00
rmind
57f2d9bddc - cpu_hatch: call tlbflushg(), just to make sure that TLB is clean.
- xen_bootstrap_tables: call xpq_queue_tlb_flush() for safety.
- Initialise cpus_attached and ci_cpumask for primary CPU.
2011-06-15 19:54:16 +00:00
rmind
155f2284da - privpgop_fault: call pmap_update() before uvmfault_unlockall().
- privcmd_ioctl, xengnt_more_entries: add missing pmap_update().
2011-06-15 19:51:50 +00:00
pgoyette
813683b4ac Include required file for xen acpi 2011-06-13 00:53:15 +00:00
jruoho
4e4705978b Fix build failure for the odd child, as pointed out by pgoyette@. 2011-06-12 16:31:57 +00:00
rmind
e225b7bd09 Welcome to 5.99.53! Merge rmind-uvmplock branch:
- Reorganize locking in UVM and provide extra serialisation for pmap(9).
  New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
  the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
  Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
  kernel-lock on some ports).  Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.
2011-06-12 03:35:36 +00:00
jym
422a28e00e Be more consistent for event handler naming with block backend: it is
xbdback(4) rather than xbd(4), and use i for identifier separation
(like xvif(4)).

The name is not used outside from event counters (vmstat -i), so
should be transparent to Xen block scripts.
2011-06-07 16:41:14 +00:00
bouyer
44635a6831 Don't call psignal() without holding proc_lock. This is the cause of
the reboot of PR port-xen/45028
Now that Xen2 is gone, handle FPU context switches the same way as
amd64. This makes all tests in /usr/tests/lib/libc/ieeefp pass.
2011-06-07 14:53:03 +00:00
bouyer
1d34d759fe check that the list is empty before calling cv_wait(). Otherwise
we may sleep waiting for an event which is already in the queue.
2011-06-07 13:52:30 +00:00
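The rule in the commit above (test the condition before sleeping on the condition variable) is sketched here with POSIX threads standing in for the kernel's condvar(9). `struct evq` and the `evq_*()` functions are hypothetical names introduced for the illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define QLEN 8	/* hypothetical queue capacity */

struct evq {
	pthread_mutex_t lock;
	pthread_cond_t cv;
	int items[QLEN];
	int head;	/* index of the oldest queued event */
	int count;	/* number of queued events */
};

void
evq_init(struct evq *q)
{
	pthread_mutex_init(&q->lock, NULL);
	pthread_cond_init(&q->cv, NULL);
	q->head = 0;
	q->count = 0;
}

/* Queue one event; returns 0 on success, -1 if the queue is full. */
int
evq_put(struct evq *q, int ev)
{
	int rv = -1;

	pthread_mutex_lock(&q->lock);
	if (q->count < QLEN) {
		q->items[(q->head + q->count) % QLEN] = ev;
		q->count++;
		pthread_cond_signal(&q->cv);
		rv = 0;
	}
	pthread_mutex_unlock(&q->lock);
	return rv;
}

/* Return the oldest event, sleeping only while the queue is empty. */
int
evq_get(struct evq *q)
{
	int ev;

	pthread_mutex_lock(&q->lock);
	/* An event queued before we arrived is returned without waiting. */
	while (q->count == 0)
		pthread_cond_wait(&q->cv, &q->lock);
	ev = q->items[q->head];
	q->head = (q->head + 1) % QLEN;
	q->count--;
	pthread_mutex_unlock(&q->lock);
	return ev;
}
```

Because the emptiness test happens under the lock and before the wait, an event that was queued earlier is never slept on.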
joerg
f6ced94e3b Disable -Werror for ah_regdomain.c if building with clang as workaround
for http://llvm.org/bugs/show_bug.cgi?id=10030.
2011-05-30 15:06:32 +00:00
joerg
1b1e18e947 Use format string for the device name 2011-05-30 14:34:58 +00:00
joerg
4ef16e9c8a Use proper format string 2011-05-30 13:03:56 +00:00
jym
31756f06ef Split KASSERT(... && ...) in two, so it's easier to spot which one
fired with DIAGNOSTIC.
2011-05-26 22:18:13 +00:00
jym
b15cc24d92 Reuse the pointer to the request operation, as set above. 2011-05-26 22:16:42 +00:00
rmind
5df9d86377 - Replace uses of simple_lock and ltsleep with mutex and condvar.
- Improve some parts of the code to be more MP-friendly.

Tested by jakllsch@.
2011-05-22 04:27:15 +00:00
jym
99c9ae6dfa In xbdback(4), move the code that copies segments after the bound checks
of the ``nr_segments'' variable.

In cases where we are running domUs with an architecture different from the
dom0 one (for example: 32 bits domUs on 64 bits dom0), copying segments
with an invalid nr_segments value will lead to the corruption of the
xbdback instance structure and quickly crash the dom0 backend.

Tested under 64 bits dom0 with 32 bits domUs. No regression observed.

ok bouyer@.

Will be pulled up to -4 and -5.
2011-05-21 15:22:49 +00:00
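The ordering fix described above (validate ``nr_segments'' first, copy afterwards) can be sketched as follows. The structures are simplified and hypothetical: `MAX_SEGMENTS`, `struct segment`, `struct request`, and `request_copy()` are illustrative names, not the definitions from the Xen headers.

```c
#include <assert.h>
#include <string.h>

#define MAX_SEGMENTS 11	/* hypothetical per-request limit */

struct segment {
	unsigned long gref;
	unsigned char first, last;
};

struct request {
	unsigned int nr_segments;
	struct segment seg[MAX_SEGMENTS];
};

/*
 * Copy the segments of a request that came from an untrusted ring.
 * The claimed count is validated first and only then used as the
 * length of the copy. Returns 0 on success, or -1 with dst left
 * untouched when the count is out of range.
 */
int
request_copy(struct request *dst, const struct segment *src_seg,
    unsigned int src_n)
{
	if (src_n > MAX_SEGMENTS)
		return -1;
	dst->nr_segments = src_n;
	memcpy(dst->seg, src_seg, src_n * sizeof(*src_seg));
	return 0;
}
```

Copying before the check would let a bogus count write past the end of `dst->seg`, which is the kind of instance corruption the commit message describes.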
joerg
316b3ac0de LLVM's assembler parser doesn't support .code32 yet, so disable it as
needed.
2011-05-20 13:11:40 +00:00
dyoung
a6b2b8396b PCI_FLAGS_IO_ENABLED and PCI_FLAGS_MEM_ENABLED changed their functional
role in NetBSD (drivers are no longer supposed to write these to
pa_flags) without changing name.  Correct that.

Rename PCI_FLAGS_IO_ENABLED to PCI_FLAGS_IO_OKAY and
PCI_FLAGS_MEM_ENABLED to PCI_FLAGS_MEM_OKAY, thus making their names
consistent with the other PCI flags and poisoning 3rd-party driver
sources that use the flags in the old bad way.

This patch produces no binary changes in this set of PCI kernels when
they are compiled w/o 'options DIAGNOSTIC' and w/ -V MKREPRO=yes:

algor P4032 P5064 P6032
alpha GENERIC
amd64 GENERIC XEN3_DOM0
arc GENERIC
atari HADES MILAN-PCIIDE
bebox GENERIC
cats GENERIC
cobalt GENERIC
evbarm-el ADI_BRH ARMADILLO9 CP3100 GEMINI GEMINI_MASTER GEMINI_SLAVE
evbarm-el GUMSTIX HDL_G IMX31LITE INTEGRATOR IQ31244 IQ80310 IQ80321
evbarm-el IXDP425 IXM1200 KUROBOX_PRO
evbarm-el LUBBOCK MARVELL_NAS NAPPI NSLU2 SHEEVAPLUG SMDK2800 TEAMASA_NPWR
evbarm-el TEAMASA_NPWR_FC TS7200 TWINTAIL ZAO425
evbmips-el AP30 DBAU1500 DBAU1550 MALTA MERAKI MTX-1 OMSAL400 RB153 WGT624V3
evbmips64-el XLSATX
evbppc EV64260 MPC8536DS MPC8548CDS OPENBLOCKS200 OPENBLOCKS266
evbppc OPENBLOCKS266_OPT P2020RDB PMPPC RB800 WALNUT
hp700 GENERIC
i386 ALL XEN3_DOM0 XEN3_DOMU
ibmnws GENERIC
iyonix GENERIC
landisk GENERIC
macppc GENERIC
mvmeppc GENERIC
netwinder GENERIC
ofppc GENERIC
prep GENERIC
sandpoint GENERIC
sbmips-el GENERIC
sgimips GENERIC32_IP2x GENERIC32_IP3x
sparc GENERIC_SUN4U KRUPS
sparc64 GENERIC
2011-05-17 17:34:47 +00:00
jym
f4853d4d88 As noted by rmind@, use the _nv() to fetch the new value. A race is
possible between the decrement and the fetch of the ref counter value,
hence we might call the G/C routine twice. Not good.

Also remove the 'volatile' attribute, refcnt is only used by xbdi_put/_get
and should not be exposed anywhere else (except for initialization).
2011-05-15 20:58:54 +00:00
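The race mentioned above can be illustrated in userland with C11 atomics, used here as a stand-in for NetBSD's atomic_ops(3). `struct obj` and `obj_put()` are hypothetical names; the driver itself uses the _nv() variant of the kernel atomic decrement.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reference-counted object. */
struct obj {
	atomic_uint refcnt;
};

/*
 * Drop one reference. Returns true only for the caller that released
 * the last one. atomic_fetch_sub() returns the value held BEFORE the
 * subtraction, so "old - 1" is the new value produced by this very
 * decrement. Reading refcnt again in a separate step could observe
 * another thread's decrement and let two callers both see 0.
 */
bool
obj_put(struct obj *o)
{
	unsigned int newval = atomic_fetch_sub(&o->refcnt, 1) - 1;

	return newval == 0;
}
```

Only the caller for which `obj_put()` returns true should run the cleanup routine, so it runs exactly once.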
jym
264d3132bf Use atomic_ops(3) for ref counting. 2011-05-15 07:24:15 +00:00
jym
e1b3bebc56 Print the PGD address in the debug message. 2011-05-08 00:18:25 +00:00
jym
bf7437481f Move the connection code of xbdback(4) and xvif(4) backends in separate
functions. The frontend watch function is easier to read, and mixing
switch() with goto's error paths is rather error-prone.

While here, sprinkle some aprint_*.

Tested under amd64 dom0 with i386 PAE and amd64 domUs.
2011-04-29 22:58:46 +00:00
jym
13cf826eb8 Silence xenbus_read_target() in ENOENT case (== entry is missing from
Xenstore). The error case does not bring much here; assume that the value
is 0.

Print the error code when writing the ``target'' value fails.
2011-04-29 22:52:02 +00:00
jym
21afd04a13 Apply DRY: xpmap_{mtop,ptom}() can reuse xpmap_{mtop,ptom}_masked() for
the frame number lookup.

No functional change.
2011-04-29 22:45:41 +00:00
joerg
787e55aa29 Remove PECOFF/Win32 emulation. 2011-04-26 16:57:38 +00:00
joerg
5aca2679d7 Remove Darwin, MACH and Mach-O support. 2011-04-26 15:51:22 +00:00
jym
21dd288af2 Check status before proceeding further. Avoids spurious watch calls. 2011-04-25 17:01:54 +00:00
jym
6a2ec1520e use __KERNEL_RCSID() 2011-04-25 00:22:37 +00:00
jym
ee59eafa3f Check that xvif(4) is not already connected before proceeding in the
XenbusStateConnected mode. Under rare occasions, the xenbus watcher
can fire multiple times, overwriting the I/O ring memory mappings with
invalid values. This will lead sooner or later to dom0 panic().

Will ask for pullup. FWIW, xbdback(4) is not affected.
2011-04-25 00:14:06 +00:00
jym
f5481a4681 Separate xennet(4) backend initialization code ("resume") from the part
that talks with Xenstore to query backend's information. Resuming is now
performed just after xennet(4) attachment instead of waiting for backend
to announce its features in Xenstore and change its state.

This fixes the race observed by Urban Boquist when the domU boots with
root on NFS.

FWIW, the boot code (when root is NFS-backed) can init() the xennet(4)
interface very early: it tried to access ifnet structures that were not
yet allocated.

Will ask for a pullup. Thanks to Urban for reporting the issue and
investigating it. Confirmed fixed. No regression observed by me for
dynamic attach/detach of xvif(4) and xennet(4) interfaces.

See also http://mail-index.netbsd.org/port-xen/2011/04/18/msg006647.html
2011-04-25 00:00:50 +00:00
rmind
2626d57668 Rename ttymalloc() to tty_alloc(), and ttyfree() to tty_free() for
consistency.  Remove some unnecessary malloc.h inclusions as well.
2011-04-24 16:26:51 +00:00
jym
7bda9dcf91 Disestablish softint in the error path. 2011-04-21 13:06:20 +00:00
jym
e1b18c561e Unmap rings before freeing their associated VAs, or we will get a
non-recoverable fault in the error path.
2011-04-20 20:32:38 +00:00
rmind
403e4e6cb1 balloon_xenbus_attach: use KM_SLEEP for allocation.
Note: please do not use KM_NOSLEEP.
2011-04-18 03:04:31 +00:00
jym
6b3956d6d5 Large rewrite of the balloon driver. This one:
- turns balloon into a driver that attaches to xenbus(4). This allows to
disable the functionality either at compile time or boot time via
userconf(4). Driver can implement detach or pmf(9) hooks if deemed
necessary.

- keeps Cherry's locking model, but simplify it a bit. There is now
only one target value serialized inside balloon, we do not feedback
alternative value to Xenstore (clients are not expected to see its value
evolve behind their back, and can't do much about that either)

- implements min threshold; this is an admin-settable value that tells
driver to "not balloon below this threshold." This can be used by domain
to keep memory reservations, useful if activity is expected in the near
future.

- in addition to min threshold, the driver implements internally a
safeguard value (uvmexp.freemin + 1MiB), so that admin cannot
inadvertently set min to a very low value forcing domain into heavy
memory pressure and swapping.

- create the sysctl(8) kern.xen.balloon tree. 4 nodes are actually present
(values are in KiB):
   - min: (rw) an admin-settable value that prevents ballooning below this
          mark
   - max: (ro) the maximum size for reservation, as set by xm(1) mem-max.
   - current: (ro) the current reservation for domain.
   - target:  (rw) the targeted reservation for domain.

- fix a few limitations here and there, most notably the max_reservation
hypercall, and KiB vs pages representations at interfaces.

The driver is still turned off by default. Enabling it would need more
approval, especially from bouyer@, cherry@ and cegger@.

FWIW: tested it two days long, from amd64 dom0 (with dom0 ballooning
enabled for xend), and bunch of domUs. Did not notice anything suspicious.

XXX it still has one big limitation: it cannot hotplug memory pages in
uvm(9) if they were not present beforehand. Example: ballooning above
physmem will give more pages to domain but it won't use it to serve
allocations, unless we teach uvm(9) how to handle the extra pages.
2011-04-18 01:36:24 +00:00
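How the min threshold and the internal safeguard might combine is sketched below. This is an assumption about the arithmetic, not the driver's code: `balloon_clamp_target()` and its parameters are hypothetical, all values are in KiB, and the sketch assumes the effective floor is the larger of the admin-set min and freemin + 1 MiB.

```c
#include <assert.h>
#include <stdint.h>

#define KIB_PER_MIB 1024u

/*
 * Clamp a requested balloon target (KiB) to the allowed range.
 * The lower bound is the larger of the admin-settable minimum and an
 * internal safeguard of freemin + 1 MiB; the upper bound is max.
 */
uint64_t
balloon_clamp_target(uint64_t target, uint64_t admin_min,
    uint64_t freemin_kib, uint64_t max)
{
	uint64_t safeguard = freemin_kib + 1 * KIB_PER_MIB;
	uint64_t low = admin_min > safeguard ? admin_min : safeguard;

	if (target < low)
		target = low;
	if (target > max)
		target = max;
	return target;
}
```

With min left at 0 and a freemin of 2048 KiB, a request for 100 KiB is raised to the 3072 KiB safeguard rather than being honored.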
jym
f23b41b209 Remove remnants from the past when Xen 2 was still alive. 2011-04-17 23:54:05 +00:00
mrg
648bc71c1a apply some _KERNEL_OPT. 2011-04-17 09:50:33 +00:00
cegger
448664b81b previous fix does not work if there is exactly only one entry where continue
exits the loop.
Apply fix from Konrad Wilk on port-xen@
That makes NetBSD DomU boot on Linux Dom0 with xl.
2011-04-12 05:09:32 +00:00
cegger
ae26631de3 Continue scanning for other frontends when initialization
of one frontend failed. Bug reported by Konrad Wilk on port-xen@.
Fix this for all error paths within the loop.
2011-04-11 15:00:49 +00:00
cegger
d13fff3778 build xen kernels again after db_trace merge 2011-04-11 08:56:17 +00:00
jym
d833e65927 Alright, set xvif(4) syntax once and for all. Tested with ipf(4) under
XEN3_DOM0 amd64.

Fixes PR misc/39376.

See http://mail-index.netbsd.org/port-xen/2011/04/06/msg006603.html
2011-04-06 23:51:55 +00:00
dyoung
d3e53912d2 Neither pci_dma64_available(), pci_probe_device(), pci_mapreg_map(9),
pci_find_rom(), pci_intr_map(9), pci_enumerate_bus(), nor the match
predicate passed to pciide_compat_intr_establish() should ever modify
their pci_attach_args argument, so make their pci_attach_args arguments
const and deal with the fallout throughout the kernel.

For the most part, these changes add a 'const' where there was no
'const' before, however, some drivers and MD code used to modify
pci_attach_args.  Now those drivers either copy their pci_attach_args
and modify the copy, or refrain from modifying pci_attach_args:

Xen: according to Manuel Bouyer, writing to pci_attach_args in
    pci_intr_map() was a leftover from Xen 2.  Probably a bug.  I
    stopped writing it.  I have not tested this change.

siside(4): sis_hostbr_match() needlessly wrote to pci_attach_args.
    Probably a bug.  I use a temporary variable.  I have not tested this
    change.

slide(4): sl82c105_chip_map() overwrote the caller's pci_attach_args.
    Probably a bug.  Use a local pci_attach_args.  I have not tested
    this change.

viaide(4): via_sata_chip_map() and via_sata_chip_map_new() overwrote the
    caller's pci_attach_args.  Probably a bug.  Make a local copy of the
    caller's pci_attach_args and modify the copy.  I have not tested
    this change.

While I'm here, make pci_mapreg_submap() static.

With these changes in place, I have tested the compilation of these
kernels:

alpha GENERIC
amd64 GENERIC XEN3_DOM0
arc GENERIC
atari HADES MILAN-PCIIDE
bebox GENERIC
cats GENERIC
cobalt GENERIC
evbarm-eb NSLU2
evbarm-el ADI_BRH ARMADILLO9 CP3100 GEMINI GEMINI_MASTER GEMINI_SLAVE GUMSTIX
	HDL_G IMX31LITE INTEGRATOR IQ31244 IQ80310 IQ80321 IXDP425 IXM1200
	KUROBOX_PRO LUBBOCK MARVELL_NAS NAPPI SHEEVAPLUG SMDK2800 TEAMASA_NPWR
	TEAMASA_NPWR_FC TS7200 TWINTAIL ZAO425
evbmips-el AP30 DBAU1500 DBAU1550 MALTA MERAKI MTX-1 OMSAL400 RB153 WGT624V3
evbmips64-el XLSATX
evbppc EV64260 MPC8536DS MPC8548CDS OPENBLOCKS200 OPENBLOCKS266
	OPENBLOCKS266_OPT P2020RDB PMPPC RB800 WALNUT
hp700 GENERIC
i386 ALL XEN3_DOM0 XEN3_DOMU
ibmnws GENERIC
macppc GENERIC
mvmeppc GENERIC
netwinder GENERIC
ofppc GENERIC
prep GENERIC
sandpoint GENERIC
sgimips GENERIC32_IP2x
sparc GENERIC_SUN4U KRUPS
sparc64 GENERIC

As of Sun Apr 3 15:26:26 CDT 2011, I could not compile these kernels
with or without my patches in place:

### evbmips-el GDIUM

nbmake: nbmake: don't know how to make /home/dyoung/pristine-nbsd/src/sys/arch/mips/mips/softintr.c. Stop

### evbarm-el MPCSA_GENERIC
src/sys/arch/evbarm/conf/MPCSA_GENERIC:318: ds1672rtc*: unknown device `ds1672rtc'

### ia64 GENERIC

/tmp/genassym.28085/assym.c: In function 'f111':
/tmp/genassym.28085/assym.c:67: error: invalid application of 'sizeof' to incomplete type 'struct pcb'
/tmp/genassym.28085/assym.c:76: error: dereferencing pointer to incomplete type

### sgimips GENERIC32_IP3x

crmfb.o: In function `crmfb_attach':
crmfb.c:(.text+0x2304): undefined reference to `ddc_read_edid'
crmfb.c:(.text+0x2304): relocation truncated to fit: R_MIPS_26 against `ddc_read_edid'
crmfb.c:(.text+0x234c): undefined reference to `edid_parse'
crmfb.c:(.text+0x234c): relocation truncated to fit: R_MIPS_26 against `edid_parse'
crmfb.c:(.text+0x2354): undefined reference to `edid_print'
crmfb.c:(.text+0x2354): relocation truncated to fit: R_MIPS_26 against `edid_print'
2011-04-04 20:37:49 +00:00
jym
8bbd42f10a Now that pkgsrc-2011Q1 has arrived, and before -6 chimes in, change
ifxname for xvif(4) from xvif%d.%d to xvif%d-%d. This is needed
to avoid sysctl(9) EINVAL errors when creating interface nodes.

See http://mail-index.netbsd.org/port-xen/2011/01/11/msg006405.html
2011-04-03 23:21:37 +00:00
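
A minimal sketch of the new naming scheme, assuming the interface name is built from a domain ID and a handle. `format_vif_name()` and `name_has_dot()` are hypothetical helpers written for illustration only; the EINVAL is presumably triggered because '.' acts as the separator in sysctl names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build "xvif<domid>-<handle>" into buf.
 * Returns 0 on success, -1 if the name was truncated or formatting failed.
 */
static int
format_vif_name(char *buf, size_t len, int domid, int handle)
{
	int n;

	assert(buf != NULL && len > 0);
	n = snprintf(buf, len, "xvif%d-%d", domid, handle);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical check: does the name contain the character being avoided? */
static int
name_has_dot(const char *name)
{
	return strchr(name, '.') != NULL;
}
```

With a 16-byte buffer, domain 1 and handle 0 yield the name `xvif1-0`, which contains no dot.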
dyoung
7b673ebd9e Clean up excessive #ifdef'age of NMI trap handling for amd64/i386/xen.
Handle NMI in all Xen kernels.
2011-04-03 22:29:25 +00:00
jym
aef6dbf4fe Add the HYPERVISOR_sysctl() hypercall.
Although the hypercall arguments (like struct sysctl_readconsole) are not
compatible between different XEN_SYSCTL_INTERFACE_VERSIONs (one of the
reasons why the sysctl calls should only be used by xentools directly),
it's still practical to have when one wants to query Xen's dmesg from
ddb(4) in case of a panic.

Note: additional code is needed for readconsole() functionality, but adding
the hypercall should not cause any harm.
2011-03-30 22:57:24 +00:00
jym
2f91f085df (purely cosmetic changes)
- Use free_otherend_details() instead of calling free() on xbusd_otherend.
- rename talk_to_otherend() to watch_otherend(). We register a watch for
changes in the otherend device "state"; we are not really talking to it.
- add missing prototypes.
2011-03-30 22:34:03 +00:00
jym
fc848a1000 Fix a year old bug that was only fixed in jym-xensuspend branch, but
not in HEAD:
- use uvm_km_alloc() instead of kmem_alloc() to enforce alignment when
allocating p2m_frame pages (xentools can only deal with page-aligned
addresses)
- do not use paddr_t for p2m_frame_list_list with PAE, xentools expect
32 bits PFNs even with 64 bits PTE.

Required to make ``xm dump-core'' work as expected.
2011-03-30 21:53:58 +00:00
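
The two constraints above (page-aligned frames, 32-bit PFNs) can be illustrated with a userland analogy. This is only a sketch: it uses C11 `aligned_alloc()` rather than uvm_km_alloc(), and `SKETCH_PAGE_SIZE`, `alloc_pages()`, `is_page_aligned()`, and `pa_to_pfn32()` are hypothetical names, with a 4 KiB page size assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE  4096u	/* assumed page size */
#define SKETCH_PAGE_SHIFT 12

/* Allocate npages pages whose start address is page-aligned. */
static void *
alloc_pages(size_t npages)
{
	assert(npages > 0);
	return aligned_alloc(SKETCH_PAGE_SIZE, npages * SKETCH_PAGE_SIZE);
}

/* True if p sits on a page boundary. */
static int
is_page_aligned(const void *p)
{
	return ((uintptr_t)p & (SKETCH_PAGE_SIZE - 1)) == 0;
}

/*
 * A 64-bit physical address still yields a frame number that fits in
 * 32 bits as long as the address is below 2^44.
 */
static uint32_t
pa_to_pfn32(uint64_t pa)
{
	return (uint32_t)(pa >> SKETCH_PAGE_SHIFT);
}
```

A general-purpose allocator gives no page-alignment guarantee for small requests, which is why an allocator with explicit alignment is used in the sketch.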
jym
61a25d36c6 Do not clobber autoconf messages (and variables, for error reporting)
in xennet(4) handler. Keep if_xname though.

Reported by dyoung@, thanks.
2011-03-30 18:33:05 +00:00
jym
fad92a48f9 printf("xennet: ...") => aprint_error_ifnet() 2011-03-30 00:17:04 +00:00
jym
3b2b362830 Remove spurious spaces. 2011-03-30 00:13:28 +00:00
jym
26f424e994 Typo fix. 2011-03-29 23:51:32 +00:00
bouyer
0ccdc9a448 Test and set xbdi->xbdi_cont at splbio(). Otherwise we could overwrite
xbdi->xbdi_cont and process the same request twice.
2011-03-05 15:12:16 +00:00
jruoho
4939599599 Use config_defer(9) for cpu_rescan() in cpu_attach().
Also mark a few local functions as static.
2011-02-26 14:43:18 +00:00
jruoho
39c7a68be3 Catch up with x86 on cpufeaturebus. 2011-02-24 19:00:58 +00:00
jruoho
ad932f2c35 Move PowerNow! to the cpufeaturebus. 2011-02-24 10:56:00 +00:00
jruoho
e54ba5b4d0 Add cpufeaturebus and est(4) for Xen. 2011-02-24 04:42:54 +00:00
jruoho
acdf26369f Move ENHANCED_SPEEDSTEP, or henceforth est(4), to the cpufeaturebus. 2011-02-23 11:43:21 +00:00
jym
71d70847f6 Use only one function to pin pages with Xen, and provide macros to
call it for different levels (L1 => L4).

Replace all calls to xpq_queue_pin_table(...) in MD code with these new
functions, with proper #ifdef'ing depending on $MACHINE.

Rationale:
- only one function to modify for logging
- pushes responsibility to caller for choosing the proper pin level, rather
than Xen internal functions; this makes the pin level explicit rather than
implicit.

Boot tested for dom0 i386/amd64, PAE included. No functional change intended.
2011-02-10 00:23:14 +00:00
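
The "one function plus per-level macros" design can be sketched as below. The names (`pin_table()`, `pin_l1_table()` through `pin_l4_table()`, `enum pin_level`) are hypothetical, and the function only records its arguments instead of issuing a hypercall.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pin levels, L1 through L4. */
enum pin_level { PIN_L1 = 1, PIN_L2, PIN_L3, PIN_L4 };

struct pin_request {
	uint64_t pa;
	enum pin_level level;
};

static struct pin_request last_pin;	/* last request, for inspection */
static unsigned int pin_count;		/* number of requests so far */

/* Single entry point: the only place that needs changing to add logging. */
static void
pin_table(uint64_t pa, enum pin_level level)
{
	assert(level >= PIN_L1 && level <= PIN_L4);
	last_pin.pa = pa;
	last_pin.level = level;
	pin_count++;
}

/* Thin wrappers that make the pin level explicit at the call site. */
#define pin_l1_table(pa)	pin_table((pa), PIN_L1)
#define pin_l2_table(pa)	pin_table((pa), PIN_L2)
#define pin_l3_table(pa)	pin_table((pa), PIN_L3)
#define pin_l4_table(pa)	pin_table((pa), PIN_L4)
```

A caller writes `pin_l3_table(pa)` and the level is visible where the call is made, rather than being inferred inside the pinning function.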
chuck
2ae83d14e9 update license clauses on my code to match the new-style BSD licenses.
based on diff that rmind@ sent me.

no functional change with this commit.
2011-02-01 21:18:56 +00:00
jym
b767622b62 Fix bad pasto cooking: xennet backend is not xbdback: use
aprint_error_ifnet() with the proper ifnet device for error messages.
2011-01-18 21:34:31 +00:00
joerg
d3a052c472 Allow use of traditional CPP to be set on a per-platform basis in sys.mk.
Honour this for dependency processing in bsd.dep.mk.  Switch i386 and
amd64 assembly to use ISO C90 preprocessor concat and drop the
-traditional-cpp on this platform.
2011-01-12 23:12:10 +00:00
jym
c7b98903de Introduce "vifname" keys for Xen domains. Its value is the interface
name for the vif, e.g. xvif(4) for dom0, and xennet(4) for domU.

ok bouyer@.

See http://mail-index.netbsd.org/port-xen/2011/01/11/msg006405.html
2011-01-11 23:22:19 +00:00
jym
2d351d332f Typo fix. 2011-01-11 01:21:32 +00:00
cegger
e1f9b2b091 fix typo in ioctl definition 2011-01-10 11:13:03 +00:00
jym
4c59a64111 Move if_xname setting earlier for xvif creation, so we can grab domid
and handle values sooner for error cases.
2011-01-08 05:23:19 +00:00
jym
6643cbe696 Now, get the return error too, in case that could help with EC2
troubleshooting...
2010-12-20 21:18:45 +00:00
matt
6a66466f0c Move counting of faults, traps, intrs, soft[intr]s, syscalls, and nswtch
from uvmexp to per-cpu cpu_data and move them to 64bits.  Remove unneeded
includes of <uvm/uvm_extern.h> and/or <uvm/uvm.h>.
2010-12-20 00:25:23 +00:00
jym
71c111927d Need the successful count (for AMI debugging) 2010-12-19 23:23:35 +00:00
cegger
5932e64165 add privcmd ioctl that got introduced with Xen 4 2010-12-15 14:45:47 +00:00
cegger
48198f2ade add gnttab ioctl definitions to implement 2010-12-15 14:28:22 +00:00
bouyer
fbd333145d Make maxpartitions 16 on !i386. Fixes hardwiring root on device autoconf
index > 0 on amd64. Problem reported and patch tested by Tobias Nygren.
2010-12-02 23:12:30 +00:00
bouyer
4621c0ed04 Boot vs AP processors don't make sense for physical CPUs, these are
handled by the hypervisor and all CPUs are running when the dom0 is started.
In addition, we don't have a reliable way to determine the boot CPU as
- we may not be running on the boot CPU
- we don't have access to the lapic id
So simplify by ignoring the information and assign phycpu_info_primary to the
first attached CPU.
2010-11-14 13:43:04 +00:00
bouyer
fd46720a12 Explain why we hardwire lapic_cpu_number() to 0 on Xen. 2010-11-14 13:40:31 +00:00
uebayasi
52232a9d0d Pull in uvm/uvm.h where UVM's page level interface is used. 2010-11-12 13:18:56 +00:00