Add Power ISA 2.07/3.0 JIT optimizations for ppc64/ppc64le #367
Merged
Use compile-time ISA level detection to emit optimized instructions when building on POWER8+ or POWER9+ hardware:

ISA 2.07 (POWER8):
- OP_CVIF: mtvsrwa+fcfids replaces extsw+std+lfd+fcfids (saves 2 insn, eliminates memory round-trip through stack scratch area)
- OP_CVFI: xscvdpsxws+mfvsrwz replaces fctiwz+stfd+lwz (saves 1 insn, eliminates memory round-trip and BE/LE offset difference)

ISA 3.0 (POWER9):
- OP_MODI: modsw replaces divw+mullw+sub (saves 2 insn)
- OP_MODU: moduw replaces divwu+mullw+sub (saves 2 insn)

Detection uses _ARCH_PWR8/__POWER8_VECTOR__ and _ARCH_PWR9/__POWER9_VECTOR__ compiler predefined macros. All optimizations gracefully fall back to the baseline instruction sequences when the target ISA level is not available.
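To illustrate why the single-instruction forms can stand in for the three-instruction OP_MODI/OP_MODU sequences, here is a minimal C model of both. The function names are hypothetical and do not come from vm_powerpc.c; the sketch only mirrors the arithmetic, assuming a nonzero divisor and excluding INT32_MIN / -1, which is undefined in C.

```c
#include <stdint.h>

/* Hypothetical helper: C model of the baseline OP_MODI sequence. */
static int32_t modi_baseline(int32_t a, int32_t b)
{
    int32_t q = a / b;   /* divw: quotient truncated toward zero */
    int32_t p = q * b;   /* mullw */
    return a - p;        /* sub: remainder carries the dividend's sign */
}

/* Hypothetical helper: C model of the baseline OP_MODU sequence. */
static uint32_t modu_baseline(uint32_t a, uint32_t b)
{
    uint32_t q = a / b;  /* divwu */
    uint32_t p = q * b;  /* mullw */
    return a - p;        /* sub */
}

/* C models of the value a single modsw / moduw produces.
 * The C99 % operator also truncates toward zero, so it matches. */
static int32_t modi_single(int32_t a, int32_t b)
{
    return a % b;
}

static uint32_t modu_single(uint32_t a, uint32_t b)
{
    return a % b;
}
```

Both pairs agree for every input inside the stated assumptions, so swapping the sequence for one instruction changes the code size but not the result.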
Add a section explaining how to enable Power ISA 2.07 (POWER8) and ISA 3.0 (POWER9) JIT optimizations via -mcpu compiler flags.
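A small sketch of how the compile-time selection could look, which also helps confirm that a given -mcpu flag took effect. The names JIT_ISA_LEVEL, jit_has_vsr_moves, jit_has_modsw and jit_isa_name are hypothetical, and treating either macro of each pair as sufficient is an assumption; the actual conditions in vm_powerpc.c may differ.

```c
/* Hypothetical ISA level constant derived from compiler predefined macros.
 * Assumption: either macro of a pair is enough to select that level. */
#if defined(_ARCH_PWR9) || defined(__POWER9_VECTOR__)
#  define JIT_ISA_LEVEL 300   /* ISA 3.0: modsw/moduw available */
#elif defined(_ARCH_PWR8) || defined(__POWER8_VECTOR__)
#  define JIT_ISA_LEVEL 207   /* ISA 2.07: mtvsrwa/mfvsrwz available */
#else
#  define JIT_ISA_LEVEL 0     /* baseline instruction sequences only */
#endif

/* Nonzero when the ISA 2.07 register-to-register moves may be emitted. */
static int jit_has_vsr_moves(void)
{
    return JIT_ISA_LEVEL >= 207;
}

/* Nonzero when the ISA 3.0 modulo instructions may be emitted. */
static int jit_has_modsw(void)
{
    return JIT_ISA_LEVEL >= 300;
}

/* Human-readable label for the selected level, e.g. for a startup log. */
static const char *jit_isa_name(void)
{
    if (jit_has_modsw())
        return "ISA 3.0 (POWER9)";
    if (jit_has_vsr_moves())
        return "ISA 2.07 (POWER8)";
    return "baseline";
}
```

Because the level is fixed at compile time, a build without a suitable -mcpu flag, or on a non-POWER host, selects the baseline branch and never emits the newer instructions.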
Summary
- vm_powerpc.c: optional instruction optimizations on POWER8+ and POWER9+ hardware
- BUILD.md

ISA 2.07 (POWER8) Optimizations

- OP_CVIF: mtvsrwa+fcfids replaces extsw+std+lfd+fcfids (saves 2 insn, eliminates memory round-trip through stack scratch area)
- OP_CVFI: xscvdpsxws+mfvsrwz replaces fctiwz+stfd+lwz (saves 1 insn, eliminates memory round-trip and BE/LE offset difference)

ISA 3.0 (POWER9) Optimizations

- OP_MODI: modsw replaces divw+mullw+sub (saves 2 insn)
- OP_MODU: moduw replaces divwu+mullw+sub (saves 2 insn)

Detection

Uses _ARCH_PWR8/__POWER8_VECTOR__ and _ARCH_PWR9/__POWER9_VECTOR__ compiler predefined macros. All optimizations gracefully fall back to baseline instruction sequences when the target ISA level is not available.

To enable:

    make CFLAGS='-mcpu=power9' -j$(nproc)

JIT Code Size (qagame.qvm)

… -mcpu=power9)

Testing