Set cpu->running without taking the cpu_list lock, only requiring it if
there is a concurrent exclusive section. This requires adding a new
field to CPUState, which records whether a running CPU is being counted
in pending_cpus.
When an exclusive section is started concurrently with cpu_exec_start,
cpu_exec_start can use the new field to determine if it has to wait for
the end of the exclusive section. Likewise, cpu_exec_end can use it to
see if start_exclusive is waiting for that CPU.
This a separate patch for easier bisection of issues.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>