Commit Graph

527 Commits

Author SHA1 Message Date
Umesh Deshpande
b2a8658ef5 protect the ramlist with a separate mutex
Add the new mutex that protects shared state between ram_save_live
and the iothread.  If the iothread mutex has to be taken together
with the ramlist mutex, the iothread shall always be _outside_.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Umesh Deshpande <udeshpan@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>

Reviewed-by: Orit Wasserman <owasserm@redhat.com>
2012-12-20 23:08:47 +01:00
Umesh Deshpande
f798b07f51 add a version number to ram_list
This will be used to detect if last_block might have become invalid
across different calls to ram_save_live.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Umesh Deshpande <udeshpan@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>

Reviewed-by: Orit Wasserman <owasserm@redhat.com>
2012-12-20 23:08:47 +01:00
Paolo Bonzini
abb26d63e7 exec: sort the memory from biggest to smallest
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2012-12-20 23:08:47 +01:00
Paolo Bonzini
a3161038a1 exec: change RAM list to a TAILQ
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2012-12-20 23:08:47 +01:00
Paolo Bonzini
0d6d3c87a2 exec: change ramlist from MRU order to a 1-item cache
Most of the time, only 2 items will be active (from/to for a string operation,
or code/data).  But TCG guests likely won't have gigabytes of memory, so
this actually goes down to 1 item.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2012-12-20 23:08:40 +01:00
Paolo Bonzini
9c17d615a6 softmmu: move include files to include/sysemu/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:32:45 +01:00
Paolo Bonzini
1de7afc984 misc: move include files to include/qemu/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:32:39 +01:00
Paolo Bonzini
022c62cbbc exec: move include files to include/exec/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:31:31 +01:00
Paolo Bonzini
077805fa92 janitor: do not rely on indirect inclusions of or from qemu-char.h
Various header files rely on qemu-char.h including qemu-config.h or
main-loop.h, but they really do not need qemu-char.h at all (particularly
interesting is the case of the block layer!).  Clean this up, and also
add missing inclusions of qemu-char.h itself.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:29:52 +01:00
Blue Swirl
5b6dd8683d exec: move TB handling to translate-all.c
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-12-16 08:28:41 +00:00
Blue Swirl
5a3165263a exec: extract TB watchpoint check
Will be moved by the next patch.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-12-16 08:28:29 +00:00
Blue Swirl
44209fc4ed exec: fix coding style
Fix coding style in areas to be moved by later patches.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-12-16 08:28:16 +00:00
Richard Henderson
0be4835b49 exec: Advise huge pages for the TCG code gen buffer
After allocating 32MB or more contiguous memory, huge pages
would seem to be ideal.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-12-08 14:18:37 +00:00
Peter Maydell
9e11908f12 dma: Define dma_context_memory and use in sysbus-ohci
Define a new global dma_context_memory which is a DMAContext corresponding
to the global address_space_memory AddressSpace. This can be used by
sysbus peripherals like sysbus-ohci which need to do DMA.

In particular, use it in the sysbus-ohci device, which fixes a
segfault when attempting to use that device.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
2012-11-12 16:44:57 +01:00
Blue Swirl
ef84755ebb Merge branch 'trivial-patches' of git://github.com/stefanha/qemu
* 'trivial-patches' of git://github.com/stefanha/qemu:
  pc: Drop redundant test for ROM memory region
  exec: make some functions static
  target-ppc: make some functions static
  ppc: add missing static
  vnc: add missing static
  vl.c: add missing static
  target-sparc: make do_unaligned_access static
  m68k: Return semihosting errno values correctly
  cadence_uart: More debug information

Conflicts:
	target-m68k/m68k-semi.c
2012-11-03 12:55:05 +00:00
Yeongkyoon Lee
fdbb84d133 tcg: Add extended GETPC mechanism for MMU helpers with ldst optimization
Add GETPC_EXT which is used by MMU helpers to selectively calculate the code
address of accessing guest memory when called from a qemu_ld/st optimized code
or a C function. Currently, it supports only i386 and x86-64 hosts.

Signed-off-by: Yeongkyoon Lee <yeongkyoon.lee@samsung.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-11-03 09:44:20 +00:00
Blue Swirl
8b9c99d9dc exec: make some functions static
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2012-11-01 19:49:45 +01:00
Andreas Färber
9f09e18a6d cpu: Move thread_id to CPUState
Signed-off-by: Andreas Färber <afaerber@suse.de>
2012-10-31 04:12:23 +01:00
Andreas Färber
c08d7424d6 cpus: Pass CPUState to qemu_cpu_kick()
CPUArchState is no longer needed there.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2012-10-31 01:02:45 +01:00
Andreas Färber
60e82579c7 cpus: Pass CPUState to qemu_cpu_is_self()
Change return type to bool, move to include/qemu/cpu.h and
add documentation.

Signed-off-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
[AF: Updated new caller qemu_in_vcpu_thread()]
2012-10-31 01:02:39 +01:00
Avi Kivity
a8170e5e97 Rename target_phys_addr_t to hwaddr
target_phys_addr_t is unwieldly, violates the C standard (_t suffixes are
reserved) and its purpose doesn't match the name (most target_phys_addr_t
addresses are not target specific).  Replace it with a finger-friendly,
standards conformant hwaddr.

Outstanding patchsets can be fixed up with the command

  git rebase -i --exec 'find -name "*.[ch]"
                        | xargs s/target_phys_addr_t/hwaddr/g' origin

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-10-23 08:58:25 -05:00
Luiz Capitulino
ad0b5321f1 Call MADV_HUGEPAGE for guest RAM allocations
This makes it possible for QEMU to use transparent huge pages (THP)
when transparent_hugepage/enabled=madvise. Otherwise THP is only
used when it's enabled system wide.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-10-22 13:26:34 -05:00
Anthony Liguori
f526f3c315 Merge remote-tracking branch 'quintela/migration-next-20121017' into staging
* quintela/migration-next-20121017: (41 commits)
  cpus: create qemu_in_vcpu_thread()
  savevm: make qemu_file_put_notify() return errors
  savevm: un-export qemu_file_set_error()
  block-migration: handle errors with the return codes correctly
  block-migration:  Switch meaning of return value
  block-migration: make flush_blks() return errors
  buffered_file: buffered_put_buffer() don't need to set last_error
  savevm: Only qemu_fflush() can generate errors
  savevm: make qemu_fill_buffer() be consistent
  savevm: unexport qemu_ftell()
  savevm: unfold qemu_fclose_internal()
  savevm: make qemu_fflush() return an error code
  savevm: Remove qemu_fseek()
  virtio-net: use qemu_get_buffer() in a temp buffer
  savevm: unexport qemu_fflush
  migration: make migrate_fd_wait_for_unfreeze() return errors
  buffered_file: make buffered_flush return the error code
  buffered_file: callers of buffered_flush() already check for errors
  buffered_file: We can access directly to bandwidth_limit
  buffered_file: unfold migrate_fd_close
  ...
2012-10-22 13:26:23 -05:00
Anthony Liguori
d3e2efc5b5 Merge remote-tracking branch 'qemu-kvm/memory/dma' into staging
* qemu-kvm/memory/dma: (23 commits)
  pci: honor PCI_COMMAND_MASTER
  pci: give each device its own address space
  memory: add address_space_destroy()
  dma: make dma access its own address space
  memory: per-AddressSpace dispatch
  s390: avoid reaching into memory core internals
  memory: use AddressSpace for MemoryListener filtering
  memory: move tcg flush into a tcg memory listener
  memory: move address_space_memory and address_space_io out of memory core
  memory: manage coalesced mmio via a MemoryListener
  xen: drop no-op MemoryListener callbacks
  kvm: drop no-op MemoryListener callbacks
  xen_pt: drop no-op MemoryListener callbacks
  vfio: drop no-op MemoryListener callbacks
  memory: drop no-op MemoryListener callbacks
  memory: provide defaults for MemoryListener operations
  memory: maintain a list of address spaces
  memory: export AddressSpace
  memory: prepare AddressSpace for exporting
  xen_pt: use separate MemoryListeners for memory and I/O
  ...
2012-10-22 13:26:07 -05:00
Avi Kivity
83f3c25142 memory: add address_space_destroy()
Since address spaces can be created dynamically by device hotplug, they
can also be destroyed dynamically.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-10-22 14:50:08 +02:00
Avi Kivity
ac1970fbe8 memory: per-AddressSpace dispatch
Currently we use a global radix tree to dispatch memory access.  This only
works with a single address space; to support multiple address spaces we
make the radix tree a member of AddressSpace (via an intermediate structure
AddressSpaceDispatch to avoid exposing too many internals).

A side effect is that address_space_io also gains a dispatch table.  When
we remove all the pre-memory-API I/O registrations, we can use that for
dispatching I/O and get rid of the original I/O dispatch.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-10-22 14:50:08 +02:00
Avi Kivity
f6790af6bc memory: use AddressSpace for MemoryListener filtering
Using the AddressSpace type reduces confusion, as you can't accidentally
supply the MemoryRegion you're interested in.

Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-10-22 14:50:07 +02:00
Avi Kivity
1d71148eac memory: move tcg flush into a tcg memory listener
We plan to make the core listener listen to all address spaces; this
will cause many more flushes than necessary.  Prepare for that by
moving the flush into a tcg-specific listener.

Later we can avoid registering the listener if tcg is disabled.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-10-22 14:50:07 +02:00
Avi Kivity
2673a5da25 memory: move address_space_memory and address_space_io out of memory core
With this change, memory.c no longer knows anything about special address
spaces, so it is prepared for AddressSpace based DMA.

Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-10-22 14:50:07 +02:00
Avi Kivity
95d2994a2f memory: manage coalesced mmio via a MemoryListener
Instead of calling a global function on coalesced mmio changes, which
routes the call to kvm if enabled, add coalesced mmio hooks to
MemoryListener and make kvm use that instead.

The motivation is support for multiple address spaces (which means we
we need to filter the call on the right address space) but the result
is cleaner as well.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-10-22 14:50:00 +02:00
Richard Henderson
74d590c8e9 exec: Make MIN_CODE_GEN_BUFFER_SIZE private to exec.c
It is used nowhere else, and the corresponding MAX_CODE_GEN_BUFFER_SIZE
also lives there.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-10-20 07:54:04 +00:00
Richard Henderson
4438c8a946 exec: Allocate code_gen_prologue from code_gen_buffer
We had a hack for arm and sparc, allocating code_gen_prologue to a
special section.  Which, honestly does no good under certain cases.
We've already got limits on code_gen_buffer_size to ensure that all
TBs can use direct branches between themselves; reuse this limit to
ensure the prologue is also reachable.

As a bonus, we get to avoid marking a page of the main executable's
data segment as executable.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-10-20 07:54:04 +00:00
Richard Henderson
405def1846 exec: Do not use absolute address hints for code_gen_buffer with -fpie
The hard-coded addresses inside alloc_code_gen_buffer only make sense
if we're building an executable that will actually run at the address
we've put into the linker scripts.

When we're building with -fpie, the executable will run at some
random location chosen by the kernel.  We get better placement for
the code_gen_buffer if we allow the kernel to place the memory,
as it will tend to to place it near the executable, based on the
PROT_EXEC bit.

Since code_gen_prologue is always inside the executable, this effect
is easily seen at the end of most TB, with the exit_tb opcode, and
with any calls to helper functions.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-10-20 07:54:04 +00:00
Richard Henderson
3d85a72fd8 exec: Don't make DEFAULT_CODE_GEN_BUFFER_SIZE too large
For ARM we cap the buffer size to 16MB.  Do not allocate 32MB in that case.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-10-20 07:54:04 +00:00
Richard Henderson
f1bc0bcc9d exec: Split up and tidy code_gen_buffer
It now consists of:

A macro definition of MAX_CODE_GEN_BUFFER_SIZE with host-specific values,

A function size_code_gen_buffer that applies most of the reasoning for
choosing a buffer size,

Three variations of a function alloc_code_gen_buffer that contain all
of the logic for allocating executable memory via a given allocation
mechanism.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-10-20 07:54:04 +00:00
Juan Quintela
652d7ec291 ram: Export last_ram_offset()
Is the only way of knowing the RAM size.

Signed-off-by: Juan Quintela <quintela@redhat.com>

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2012-10-17 18:34:58 +02:00
Avi Kivity
9a2c913b77 memory: drop no-op MemoryListener callbacks
Removes quite a bit of useless code.

Signed-off-by: Avi Kivity <avi@redhat.com>
2012-10-15 11:43:07 +02:00
Avi Kivity
7762c2c1e0 memory: rename 'exec-obsolete.h'
exec-obsolete.h used to hold pre-memory-API functions that were used from
device code prior to the transition to the memory API.  Now that the
transition is complete, the name no longer describes the file.  The
functions still need to be merged better into the memory core, but there's
no danger of anyone using them.

Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-10-15 11:43:05 +02:00
Peter Maydell
6fd2a026fb cpu_dump_state: move DUMP_FPU and DUMP_CCOP flags from x86-only to generic
Move the DUMP_FPU and DUMP_CCOP flags for cpu_dump_state() from being
x86-specific flags to being generic ones. This allows us to drop some
TARGET_I386 ifdefs in various places, and means that we can (potentially)
be more consistent across architectures about which monitor commands or
debug abort printouts include FPU register contents and info about
QEMU's condition-code optimisations.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2012-10-05 15:04:43 +01:00
Anthony PERARD
e226939de5 exec, memory: Call to xen_modified_memory.
This patch add some calls to xen_modified_memory to notify Xen about dirtybits
during migration.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Avi Kivity <avi@redhat.com>
2012-10-03 13:49:22 +00:00
Anthony PERARD
51d7a9eb2b exec: Introduce helper to set dirty flags.
This new helper/hook is used in the next patch to add an extra call in a single
place.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Avi Kivity <avi@redhat.com>
2012-10-03 13:49:05 +00:00
Richard Henderson
9b9c37c364 tcg-sparc: Assume v9 cpu always, i.e. force v8plus in 32-bit mode.
Current code doesn't actually work in 32-bit mode at all.  Since
no one really noticed, drop the complication of v7 and v8 cpus.
Eliminate the --sparc_cpu configure option and standardize macro
testing on TCG_TARGET_REG_BITS / HOST_LONG_BITS

Signed-off-by: Richard Henderson <rth@twiddle.net>
2012-09-21 22:02:16 +02:00
Richard Henderson
d5dd696fe3 tcg-sparc: Don't MAP_FIXED on top of the program
The address we pick in sparc64.ld is also 0x60000000, so doing a fixed map
on top of that is guaranteed to blow up.  Choosing 0x40000000 is exactly
right for the max of code_gen_buffer_size set below.

No need to ever use MAP_FIXED.  While getting our desired address helps
optimize the generated code, we won't fail if we don't get it.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2012-09-21 22:02:16 +02:00
David Gibson
0b57e28713 cpu_physical_memory_write_rom() needs to do TB invalidates
cpu_physical_memory_write_rom(), despite the name, can also be used to
write images into RAM - and will often be used that way if the machine
uses load_image_targphys() into RAM addresses.

However, cpu_physical_memory_write_rom(), unlike cpu_physical_memory_rw()
doesn't invalidate any cached TBs which might be affected by the region
written.

This was breaking reset (under full emu) on the pseries machine - we loaded
our firmware image into RAM, and while executing it rewrite the code at
the entry point (correctly causing a TB invalidate/refresh).  When we
reset the firmware image was reloaded, but the TB from the rewrite was
still active and caused us to get an illegal instruction trap.

This patch fixes the bug by duplicating the tb invalidate code from
cpu_physical_memory_rw() in cpu_physical_memory_write_rom().

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-09-17 10:18:48 -05:00
Luiz Capitulino
8490fc78e7 add -machine mem-merge=on|off option
It allows to disable memory merge support (KSM on Linux), which is
enabled by default otherwise.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-09-17 10:18:47 -05:00
Jason Baron
ddb97f1deb memory: add -machine dump-guest-core=on|off
Add a new '[,dump-guest-core=on|off]' option to the '-machine' option. When
'dump-guest-core=off' is specified, guest memory is omitted from the core dump.
The default behavior continues to be to include guest memory when a core dump is
triggered. In my testing, this brought the core dump size down from 384MB to 6MB
on a 2GB guest.

Is anything additional required to preserve this setting for migration or
savevm? I don't believe so.

Changelog:
v3:
    Eliminate globals as per Anthony's suggestion
    set no dump from qemu_ram_remap() as well
v2:
    move the option from -m to -machine, rename option dump -> dump-guest-core

Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-08-16 13:41:15 -05:00
Igor Mitsyanko
5fda043f9c exec.c: fix dirty bitmap reallocation
For each newly created RAM block, dirty bitmap is reallocated with g_realloc, which doesn't
make any promises on initial content of new extra data in returned buffer. In theory,
we initialize this new data with cpu_physical_memory_set_dirty_range() call. The
problem is, cpu_physical_memory_set_dirty_range() has a side effect of incrementing
ram_list.dirty_pages variable, but only for pages which are not already dirty. And
page "cleanliness" is determined using the same not yet uninitialized dirty bitmap
we've just reallocated. This results in inconsistency between real dirty page number
and value in ram_list.dirty_pages variable, which in turn could (and will) result
in errors during VM migration.
Zero initialize new dirty bitmap bytes to fix this problem.

Signed-off-by: Igor Mitsyanko <i.mitsyanko@samsung.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-08-11 12:23:46 +00:00
Peter Maydell
c308efe63a exec.c: Remove out of date comment
Remove an out of date comment: this comment used to be attached to
cpu_register_physical_memory_log(), before commit 0f0cb164 accidentally
inserted a couple of other functions between the comment and its function.
It is in any case obsolete since (a) the function arguments it refers
to have been replaced with a single MemoryRegionSection* argument and
(b) the inability to handle regions whose offset_within_address_space
and offset_within_region aren't equally aligned was fixed as part of
the rewrite of this code.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2012-08-03 14:25:22 +01:00
Tyler Hall
69b67646bc exec.c: Use subpages for large unaligned mappings
Registering a multi-page memory region that is non-page-aligned results
in a subpage from the start to the page boundary, some number of full
pages, and possibly another subpage from the last page boundary to the
end. The full pages will have a value for offset_within_region that is
not a multiple of TARGET_PAGE_SIZE. Accesses through softmmu are unable
to handle this and will segfault.

Handling full pages through subpages is not optimal, but only
non-page-aligned mappings take the penalty.

Signed-off-by: Tyler Hall <tylerwhall@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2012-08-03 14:25:22 +01:00
Tyler Hall
adb2a9b5d4 exec.c: Fix off-by-one error in register_subpage
subpage_register() expects "end" to be the last byte in the mapping.
Registering a non-page-aligned memory region that extends up to or
beyond a page boundary causes subpage_register() to silently fail
through the (end >= PAGE_SIZE) check.

This bug does not cause noticeable problems for mappings that do not
extend to a page boundary, though they do register an extra byte.

Signed-off-by: Tyler Hall <tylerwhall@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2012-08-03 14:25:22 +01:00