Signed-off-by: Qiuhao Li <Qiuhao.Li@outlook.com>
Message-Id: <SYCPR01MB3502E9F6EB06DEDCD484F738FCBA9@SYCPR01MB3502.ausprd01.prod.outlook.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If previous write commands write the same length of data with the same step,
we view it as a hint.
Signed-off-by: Qiuhao Li <Qiuhao.Li@outlook.com>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <SYCPR01MB3502480AD07811A6A49B8FEAFCAB0@SYCPR01MB3502.ausprd01.prod.outlook.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
-M1: remove IO commands iteratively
-M2: try setting bits in operand of write/out to zero
Signed-off-by: Qiuhao Li <Qiuhao.Li@outlook.com>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <SYCPR01MB350204C52E7A39E6B0EEC870FCAB0@SYCPR01MB3502.ausprd01.prod.outlook.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Simplifying the crash cases by opportunistically setting bits in operands of
out/write to zero may help to debug, since usually bit one means turn on or
trigger a function while zero is the default turn-off setting.
Tested bug https://bugs.launchpad.net/qemu/+bug/1908062
Signed-off-by: Qiuhao Li <Qiuhao.Li@outlook.com>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <SYCPR01MB3502C84B6346A3E3DE708C7BFCAB0@SYCPR01MB3502.ausprd01.prod.outlook.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Now we use a one-time scan and remove strategy in the minimizer,
which is not suitable for timing dependent instructions.
For example, instruction A will indicate an address where the config
chunk locates, and instruction B will make the configuration active.
If we have the following instruction sequence:
...
A1
B1
A2
B2
...
A2 and B2 are the actual instructions that trigger the bug.
If we scan from top to bottom, after we remove A1, the behavior of B1
might be unknowable, including not to crash the program. But we will
successfully remove B1 later cause A2 and B2 will crash the process
anyway:
...
A1
A2
B2
...
Now one more trimming will remove A1.
In the perfect case, we would need to be able to remove A and B (or C!) at
the same time. But for now, let's just add a loop around the minimizer.
Since we only remove instructions, this iterative algorithm is converging.
Tested with Bug 1908062.
Signed-off-by: Qiuhao Li <Qiuhao.Li@outlook.com>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <SYCPR01MB350263004448040ACCB9A9F1FCAB0@SYCPR01MB3502.ausprd01.prod.outlook.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Currently, we split the write commands' data from the middle. If it does not
work, try to move the pivot left by one byte and retry until there is no
space.
But, this method has two flaws:
1. It may fail to trim all unnecessary bytes on the right side.
For example, there is an IO write command:
write addr uuxxxxuu
u is the unnecessary byte for the crash. Unlike ram write commands, in most
case, a split IO write won't trigger the same crash, So if we split from the
middle, we will get:
write addr uu (will be removed in next round)
write addr xxxxuu
For xxxxuu, since split it from the middle and retry to the leftmost byte
won't get the same crash, we will be stopped from removing the last two
bytes.
2. The algorithm complexity is O(n) since we move the pivot byte by byte.
To solve the first issue, we can try a symmetrical position on the right if
we fail on the left. As for the second issue, instead moving by one byte, we
can approach the boundary exponentially, achieving O(log(n)).
Give an example:
xxxxuu len=6
+
|
+
xxx,xuu 6/2=3 fail
+
+--------------+-------------+
| |
+ +
xx,xxuu 6/2^2=1 fail xxxxu,u 6-1=5 success
+ +
+------------------+----+ |
| | +-------------+ u removed
+ +
xx,xxu 5/2=2 fail xxxx,u 6-2=4 success
+
|
+-----------+ u removed
In some rare cases, this algorithm will fail to trim all unnecessary bytes:
xxxxxxxxxuxxxxxx
xxxxxxxx-xuxxxxxx Fail
xxxx-xxxxxuxxxxxx Fail
xxxxxxxxxuxx-xxxx Fail
...
I think the trade-off is worth it.
Signed-off-by: Qiuhao Li <Qiuhao.Li@outlook.com>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <SYCPR01MB3502D26F1BEB680CBBC169E5FCAB0@SYCPR01MB3502.ausprd01.prod.outlook.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Instead of removing IO instructions one by one, we can try deleting multiple
instructions at once. According to the locality of reference, we double the
number of instructions to remove for the next round and recover it to one
once we fail.
This patch is usually significant for large input.
Test with quadrupled trace input at:
https://bugs.launchpad.net/qemu/+bug/1890333/comments/1
Patched 1/6 version:
real 0m45.904s
user 0m16.874s
sys 0m10.042s
Refined version:
real 0m11.412s
user 0m6.888s
sys 0m3.325s
Signed-off-by: Qiuhao Li <Qiuhao.Li@outlook.com>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <SYCPR01MB350280A67BB55C3FADF173E3FCAB0@SYCPR01MB3502.ausprd01.prod.outlook.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
We spend much time waiting for the timeout program during the minimization
process until it passes a time limit. This patch hacks the CLOSED (indicates
the redirection file closed) notification in QTest's output if it doesn't
crash.
Test with quadrupled trace input at:
https://bugs.launchpad.net/qemu/+bug/1890333/comments/1
Original version:
real 1m37.246s
user 0m13.069s
sys 0m8.399s
Refined version:
real 0m45.904s
user 0m16.874s
sys 0m10.042s
Note:
Sometimes the mutated or the same trace may trigger a different crash
summary (second-to-last line) but indicates the same bug. For example, Bug
1910826 [1], which will trigger a stack overflow, may output summaries
like:
SUMMARY: AddressSanitizer: stack-overflow
/home/qiuhao/hack/qemu/build/../softmmu/physmem.c:488 in
flatview_do_translate
or
SUMMARY: AddressSanitizer: stack-overflow
(/home/qiuhao/hack/qemu/build/qemu-system-i386+0x27ca049) in __asan_memcpy
Etc.
If we use the whole summary line as the token, we may be prevented from
further minimization. So in this patch, we only use the first three words
which indicate the type of crash:
SUMMARY: AddressSanitizer: stack-overflow
[1] https://bugs.launchpad.net/qemu/+bug/1910826
Signed-off-by: Qiuhao Li <Qiuhao.Li@outlook.com>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <SYCPR01MB350251DC04003450348FAF68FCAB0@SYCPR01MB3502.ausprd01.prod.outlook.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Once we find a crash, we can convert it into a QTest trace. Usually this
trace will contain many operations that are unneeded to reproduce the
crash. This script tries to minimize the crashing trace, by removing
operations and trimming QTest bufwrite(write addr len data...) commands.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-Id: <20201023150746.107063-12-alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>