summary refs log tree commit diff stats
path: root/tcg/i386/tcg-target.h (follow)
Commit message (Collapse)AuthorAgeFilesLines
...
* tcg/i386: Support vector variable shift opcodesRichard Henderson2019-05-131-1/+1
| | | | Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg/i386: Support INDEX_op_extract2_{i32,i64}Richard Henderson2019-04-241-2/+2
| | | | Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg: Add INDEX_op_extract2_{i32,i64}Richard Henderson2019-04-241-0/+2
| | | | | | | This will let backends implement the double-word shift operation. Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* cputlb: Remove static tlb sizingRichard Henderson2019-01-281-1/+0
| | | | | | | | Now that all tcg backends support TCG_TARGET_IMPLEMENTS_DYN_TLB, remove the define and the old code. Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg/i386: enable dynamic TLB sizingEmilio G. Cota2019-01-281-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As the following experiments show, this series is a net perf gain, particularly for memory-heavy workloads. Experiments are run on an Intel(R) Xeon(R) Gold 6142 CPU @ 2.60GHz. 1. System boot + shudown, debian aarch64: - Before (v3.1.0): Performance counter stats for './die.sh v3.1.0' (10 runs): 9019.797015 task-clock (msec) # 0.993 CPUs utilized ( +- 0.23% ) 29,910,312,379 cycles # 3.316 GHz ( +- 0.14% ) 54,699,252,014 instructions # 1.83 insn per cycle ( +- 0.08% ) 10,061,951,686 branches # 1115.541 M/sec ( +- 0.08% ) 172,966,530 branch-misses # 1.72% of all branches ( +- 0.07% ) 9.084039051 seconds time elapsed ( +- 0.23% ) - After: Performance counter stats for './die.sh tlb-dyn-v5' (10 runs): 8624.084842 task-clock (msec) # 0.993 CPUs utilized ( +- 0.23% ) 28,556,123,404 cycles # 3.311 GHz ( +- 0.13% ) 51,755,089,512 instructions # 1.81 insn per cycle ( +- 0.05% ) 9,526,513,946 branches # 1104.641 M/sec ( +- 0.05% ) 166,578,509 branch-misses # 1.75% of all branches ( +- 0.19% ) 8.680540350 seconds time elapsed ( +- 0.24% ) That is, a 4.4% perf increase. 2. System boot + shutdown, ubuntu 18.04 x86_64: - Before (v3.1.0): 56100.574751 task-clock (msec) # 1.016 CPUs utilized ( +- 4.81% ) 200,745,466,128 cycles # 3.578 GHz ( +- 5.24% ) 431,949,100,608 instructions # 2.15 insn per cycle ( +- 5.65% ) 77,502,383,330 branches # 1381.490 M/sec ( +- 6.18% ) 844,681,191 branch-misses # 1.09% of all branches ( +- 3.82% ) 55.221556378 seconds time elapsed ( +- 5.01% ) - After: 56603.419540 task-clock (msec) # 1.019 CPUs utilized ( +- 10.19% ) 202,217,930,479 cycles # 3.573 GHz ( +- 10.69% ) 439,336,291,626 instructions # 2.17 insn per cycle ( +- 14.14% ) 80,538,357,447 branches # 1422.853 M/sec ( +- 16.09% ) 776,321,622 branch-misses # 0.96% of all branches ( +- 3.77% ) 55.549661409 seconds time elapsed ( +- 10.44% ) No improvement (within noise range). Note that for this workload, increasing the time window too much can lead to perf degradation, since it flushes the TLB *very* frequently. 3. x86_64 SPEC06int: x86_64-softmmu speedup vs. v3.1.0 for SPEC06int (test set) Host: Intel(R) Xeon(R) Gold 6142 CPU @ 2.60GHz (Skylake) 5.5 +------------------------------------------------------------------------+ | +-+ | 5 |-+.................+-+...............................tlb-dyn-v5.......+-| | * * | 4.5 |-+.................*.*................................................+-| | * * | 4 |-+.................*.*................................................+-| | * * | 3.5 |-+.................*.*................................................+-| | * * | 3 |-+......+-+*.......*.*................................................+-| | * * * * | 2.5 |-+......*..*.......*.*.................................+-+*...........+-| | * * * * * * | 2 |-+......*..*.......*.*.................................*..*...........+-| | * * * * * * +-+ | 1.5 |-+......*..*.......*.*.................................*..*.*+-+.*+-+.+-| | * * *+-+ * * +-+ *+-+ +-+ +-+ * * * * * * | 1 |++++-+*+*++*+*++*++*+*++*+*+++-+*+*+-++*+-++++-++++-+++*++*+*++*+*++*+++| | * * * * * * * * * * * * * * * * * * * * * * * * * * | 0.5 +------------------------------------------------------------------------+ 400.perlb401.bzip403.g429445.g456.hm462.libq464.h471.omn47483.xalancbgeomean png: https://imgur.com/YRF90f7 That is, a 1.51x average speedup over the baseline, with a max speedup of 5.17x. Here's a different look at the SPEC06int results, using KVM as the baseline: x86_64-softmmu slowdown vs. KVM for SPEC06int (test set) Host: Intel(R) Xeon(R) Gold 6142 CPU @ 2.60GHz (Skylake) 25 +---------------------------------------------------------------------------+ | +-+ +-+ | | * * +-+ v3.1.0 | | * * +-+ tlb-dyn-v5 | | * * * * +-+ | 20 |-+.................*.*.............................*.+-+......*.*........+-| | * * * # # * * | | +-+ * * * # # * * | | * * * * * # # * * | 15 |-+......*.*........*.*.............................*.#.#......*.+-+......+-| | * * * * * # # * #|# | | * * * * +-+ * # # * +-+ | | * * +-+ * * ++-+ +-+ * # # * # # +-+ | | * * +-+ * * * ## *| +-+ * # # * # # +-+ | 10 |-+......*.*..*.+-+.*.*........*.##.......++-+.*.+-+*.#.#......*.#.#.*.*..+-| | * * * +-+ * * * ## +-+ *# # * # #* # # +-+ * # # * * | | * * * # # * * +-+ * ## * +-+ *# # * # #* # # * * * # # *+-+ | | * * * # # * * * +-+ * ## * # # *# # * # #* # # * * * # # * ## | 5 |-+......*.+-+*.#.#.*.*..*.#.#.*.##.*.#.#.*#.#.*.#.#*.#.#.*.*..*.#.#.*.##.+-| | * # #* # # * +-+* # # * ## * # # *# # * # #* # # * * * # # * ## | | * # #* # # * # #* # # * ## * # # *# # * # #* # # * +-+* # # * ## | | ++-+ * # #* # # * # #* # # * ## * # # *# # * # #* # # * # #* # # * ## | |+++*#+#+*+#+#*+#+#+*+#+#*+#+#+*+##+*+#+#+*#+#+*+#+#*+#+#+*+#+#*+#+#+*+##+++| 0 +---------------------------------------------------------------------------+ 400.perlbe401.bzi403.gc429445.go456.h462.libqu464.h471.omne4483.xalancbmgeomean png: https://imgur.com/YzAMNEV After this series, we bring down the average SPEC06int slowdown vs KVM from 11.47x to 7.58x. Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20190116170114.26802-4-cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg: introduce dynamic TLB sizingEmilio G. Cota2019-01-281-0/+1
| | | | | | | | | | Disabled in all TCG backends for now. Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20190116170114.26802-3-cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg/i386: Implement vector minmax arithmeticRichard Henderson2019-01-281-1/+1
| | | | | | | The avx instruction set does not directly provide MO_64. We can still implement 64-bit with comparison and vpblendvb. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg/i386: Implement vector saturating arithmeticRichard Henderson2019-01-281-1/+1
| | | | | | | Only MO_8 and MO_16 are implemented, since that's all the instruction set provides. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg: Add opcodes for vector minmax arithmeticRichard Henderson2019-01-281-0/+1
| | | | Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg: Add opcodes for vector saturated arithmeticRichard Henderson2019-01-281-0/+1
| | | | Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg: Add TCG_TARGET_HAS_MEMORY_BSWAPRichard Henderson2018-12-171-0/+2
| | | | | | | | For now, defined universally as true, since we previously required backends to implement swapped memory operations. Future patches may now remove that support where it is onerous. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg/i386: Implement INDEX_op_extr{lh}_i64_i32 for 32-bit guestsRichard Henderson2018-12-171-2/+3
| | | | | | | | This preserves the invariant that all TCG_TYPE_I32 values are zero-extended in the 64-bit host register. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg/i386: Move TCG_REG_CALL_STACK from define to enumRichard Henderson2018-12-171-1/+1
| | | | | | Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg/i386: Always use %ebp for TCG_AREG0Richard Henderson2018-12-171-6/+2
| | | | | | | | | | For x86_64, this can remove a REX prefix resulting in smaller code when manipulating globals of type i32, as we move them between backing store via cpu_env, aka TCG_AREG0. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg/i386: Add vector operationsRichard Henderson2018-02-081-2/+39
| | | | | | | | | | | | | | | | | The x86 vector instruction set is extremely irregular. With newer editions, Intel has filled in some of the blanks. However, we don't get many 64-bit operations until SSE4.2, introduced in 2009. The subsequent edition was for AVX1, introduced in 2011, which added three-operand addressing, and adjusts how all instructions should be encoded. Given the relatively narrow 2 year window between possible to support and desirable to support, and to vastly simplify code maintainence, I am only planning to support AVX1 and later cpus. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* tcg/i386: Store out-of-range call targets in constant poolRichard Henderson2017-09-071-0/+1
| | | | | | | Already it saves 2 bytes per call, but also the constant pool entry may well be shared across multiple calls. Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Rearrange ldst label trackingRichard Henderson2017-09-071-0/+4
| | | | | | | | | | Dispense with TCGBackendData, as it has never been used for more than holding a single pointer. Use a define in the cpu/tcg-target.h to signal requirement for TCGLabelQemuLdst, so that we can drop the no-op tcg-be-null.h stubs. Rename tcg-be-ldst.h to tcg-ldst.inc.c. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Move USE_DIRECT_JUMP discriminator to tcg/cpu/tcg-target.hRichard Henderson2017-09-071-0/+9
| | | | | | | | | | | | | | | | | Replace the USE_DIRECT_JUMP ifdef with a TCG_TARGET_HAS_direct_jump boolean test. Replace the tb_set_jmp_target1 ifdef with an unconditional function tb_target_set_jmp_target. While we're touching all backends, add a parameter for tb->tc_ptr; we're going to need it shortly for some backends. Move tb_set_jmp_target and tb_add_jump from exec-all.h to cpu-exec.c. This opens the possibility for TCG_TARGET_HAS_direct_jump to be a runtime decision -- based on host cpu capabilities, the size of code_gen_buffer, or a future debugging switch. Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg/i386: implement goto_ptrEmilio G. Cota2017-06-051-1/+1
| | | | | | | | | Suggested-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1493263764-18657-6-git-send-email-cota@braap.org> [rth: Reuse goto_ptr epilogue for exit_tb 0.] Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Introduce goto_ptr opcode and tcg_gen_lookup_and_goto_ptrEmilio G. Cota2017-06-051-0/+1
| | | | | | | | | | | | | | | | | | Instead of exporting goto_ptr directly to TCG frontends, export tcg_gen_lookup_and_goto_ptr(), which calls goto_ptr with the pointer returned by the lookup_tb_ptr() helper. This is the only use case we have for goto_ptr and lookup_tb_ptr, so having this function is very convenient. Furthermore, it trivially allows us to avoid calling the lookup helper if goto_ptr is not implemented by the backend. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <1493263764-18657-2-git-send-email-cota@braap.org> Message-Id: <1493263764-18657-3-git-send-email-cota@braap.org> Message-Id: <1493263764-18657-4-git-send-email-cota@braap.org> Message-Id: <1493263764-18657-5-git-send-email-cota@braap.org> [rth: Squashed 4 related commits.] Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: enable MTTCG by default for ARM on x86 hostsAlex Bennée2017-02-241-0/+11
| | | | | | | | | | | | | | | | | | | | | | | This enables the multi-threaded system emulation by default for ARMv7 and ARMv8 guests using the x86_64 TCG backend. This is because on the guest side: - The ARM translate.c/translate-64.c have been converted to - use MTTCG safe atomic primitives - emit the appropriate barrier ops - The ARM machine has been updated to - hold the BQL when modifying shared cross-vCPU state - defer powerctl changes to async safe work All the host backends support the barrier and atomic primitives but need to provide same-or-better support for normal load/store operations. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Acked-by: Peter Maydell <peter.maydell@linaro.org> Tested-by: Pranith Kumar <bobby.prani@gmail.com> Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
* tcg/i386: Handle ctpop opcodeRichard Henderson2017-01-101-2/+3
| | | | Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Add opcode for ctpopRichard Henderson2017-01-101-0/+2
| | | | | | | | | The number of actual invocations of ctpop itself does not warrent an opcode, but it is very helpful for POWER7 to use in generating an expansion for ctz. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg/i386: Handle ctz and clz opcodesRichard Henderson2017-01-101-4/+4
| | | | Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Add clz and ctz opcodesRichard Henderson2017-01-101-0/+4
| | | | | Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg/i386: Implement field extraction opcodesRichard Henderson2017-01-101-3/+9
| | | | Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Add field extraction primitivesRichard Henderson2017-01-101-0/+4
| | | | | | | | Adds tcg_gen_extract_* and tcg_gen_sextract_* for extraction of fixed position bitfields, much like we already have for deposit. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Clean up tcg-target.h header guardsMarkus Armbruster2016-07-121-2/+3
| | | | | | | | | | | | | These use guard symbols like TCG_TARGET_$target. scripts/clean-header-guards.pl doesn't like them because they don't match their file name (they should, to make guard collisions less likely). Clean them up: use guard symbol $target_TCG_TARGET_H for tcg/$target/tcg-target.h. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net>
* tcg: Split trunc_shr_i32 opcode into extr[lh]_i64_i32Richard Henderson2015-08-241-1/+2
| | | | | | | Rather than allow arbitrary shift+trunc, only concern ourselves with low and high parts. This is all that was being used anyway. Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: rename trunc_shr_i32 into trunc_shr_i64_i32Aurelien Jarno2015-08-241-1/+1
| | | | | | | | | | | The op is sometimes named trunc_shr_i32 and sometimes trunc_shr_i64_i32, and the name in the README doesn't match the name offered to the frontends. Always use the long name to make it clear it is a size changing op. Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: add TCG_TARGET_TLB_DISPLACEMENT_BITSPaolo Bonzini2015-06-031-0/+1
| | | | | | | | | | | | This will be used to size the TLB when more than 8 MMU modes are used by the target. Limitations come from the limited size of the immediate fields (which sometimes, as in the case of Aarch64, extend to instructions that shift the immediate). Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <1424436345-37924-2-git-send-email-pbonzini@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Alexander Graf <agraf@suse.de>
* tcg: Remove TCG_TARGET_HAS_new_ldstRichard Henderson2014-06-041-2/+0
| | | | | | | Since all backends have been converted, remove the compatibility code. Acked-by: Claudio Fontana <claudio.fontana@huawei.com> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg-i386: Define TCG_TARGET_INSN_UNIT_SIZERichard Henderson2014-05-121-0/+2
| | | | | | | | And use tcg pointer differencing functions as appropriate. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Add INDEX_op_trunc_shr_i32Richard Henderson2014-04-281-0/+1
| | | | | | Let the backend do something special for truncation. Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Use HOST_WORDS_BIGENDIANRichard Henderson2014-04-181-2/+0
| | | | | | Instead of rolling a local TCG_TARGET_WORDS_BIGENDIAN. Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg/i386: Use ANDN instructionRichard Henderson2014-02-171-2/+4
| | | | | | | | Note that the optimizer cannot simplify ANDC X,Y,C to AND X,Y,~C so we must handle constants in the implementation of andc. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg/i386: Move TCG_CT_CONST_* to tcg-target.cRichard Henderson2014-02-171-3/+0
| | | | | | | | | These are not needed by users of tcg-target.h. No need to recompile when we adjust them. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg-i386: Support new ldst opcodesRichard Henderson2013-10-121-1/+1
| | | | | | | No support for helpers with non-default endianness yet, but good enough to test the opcodes. Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Add qemu_ld_st_i32/64Richard Henderson2013-10-101-0/+2
| | | | | | | Step two in the transition, adding the new ldst opcodes. Keep the old opcodes around until all backends support the new opcodes. Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Allow TCG_TARGET_REG_BITS to be specified independantlyRichard Henderson2013-09-021-4/+6
| | | | | | | | There are several hosts for which it would be useful to use the available 64-bit registers in a 32-bit pointer environment. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Change flush_icache_range arguments to uintptr_tRichard Henderson2013-09-021-2/+1
| | | | | Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg: Add muluh and mulsh opcodesRichard Henderson2013-09-021-0/+4
| | | | | | | | Use them in places where mulu2 and muls2 are used. Optimize mulx2 with dead low part to mulxh. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>
* tcg-i386: Implement multiword arithmetic opsRichard Henderson2013-02-231-5/+5
| | | | | Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
* tcg: Add signed multiword multiplication operationsRichard Henderson2013-02-231-0/+2
| | | | | Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
* tcg: Add 64-bit multiword arithmetic operationsRichard Henderson2013-02-231-0/+3
| | | | | | Matching the 32-bit multiword arithmetic that we already have. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
* tcg-i386: Always implement 32-bit multiword opsRichard Henderson2013-02-231-4/+3
| | | | | Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
* tcg: Make 32-bit multiword operations optional for 64-bit hostsRichard Henderson2013-02-231-0/+4
| | | | | Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
* tcg-i386: Perform cmov detection at runtime for 32-bit.Richard Henderson2012-12-291-5/+0
| | | | | | | | Existing compile-time detection is spotty at best. Convert it all to runtime detection instead. Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
* janitor: add guards to headersPaolo Bonzini2012-12-191-0/+3
| | | | Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* tcg: Remove TCG_TARGET_HAS_GUEST_BASE definePeter Maydell2012-10-121-2/+0
| | | | | | | | | | GUEST_BASE support is now supported by all TCG backends, and is now mandatory. Drop the now-pointless TCG_TARGET_HAS_GUEST_BASE define (set by every backend) and the error if it is unset. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Riku Voipio <riku.voipio@linaro.org>