Add Power ISA 2.07/3.0 JIT optimizations for ppc64/ppc64le #367

Merged: ec- merged 2 commits into ec-:main from runlevel5:ppc64-isa3-opts on Feb 10, 2026

@runlevel5 (Contributor)

Summary

  • Add compile-time ISA level detection to vm_powerpc.c for optional instruction optimizations on POWER8+ and POWER9+ hardware
  • Document ppc64/ppc64le ISA optimization build flags in BUILD.md

ISA 2.07 (POWER8) Optimizations

  • OP_CVIF: mtvsrwa + fcfids replaces extsw + std + lfd + fcfids (saves 2 insn, eliminates memory round-trip through stack scratch area)
  • OP_CVFI: xscvdpsxws + mfvsrwz replaces fctiwz + stfd + lwz (saves 1 insn, eliminates memory round-trip and BE/LE offset difference)
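The "BE/LE offset difference" in the OP_CVFI baseline can be illustrated in plain C. fctiwz leaves the 32-bit integer in the low word of a 64-bit FPR, stfd spills all 8 bytes, and lwz then has to read the 4 bytes that hold the low word, which sit at a different offset on big-endian and little-endian targets. The sketch below only models that memory round-trip; the helper names are hypothetical and do not come from vm_powerpc.c.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: byte offset of the low 32-bit word inside an
 * 8-byte stack slot. A JIT needs this for the lwz that follows stfd. */
static size_t low_word_offset(void)
{
    const uint64_t probe = 1;
    unsigned char bytes[sizeof probe];

    memcpy(bytes, &probe, sizeof probe);
    return bytes[0] == 1 ? 0 : 4;   /* little-endian: 0, big-endian: 4 */
}

/* Models the spill/reload half of fctiwz + stfd + lwz: the converted
 * integer lives in the low word of a 64-bit register image. */
static int32_t reload_low_word(uint64_t reg_image)
{
    unsigned char slot[8];
    int32_t word;

    memcpy(slot, &reg_image, sizeof slot);                 /* stfd */
    memcpy(&word, slot + low_word_offset(), sizeof word);  /* lwz  */
    return word;
}
```

With xscvdpsxws + mfvsrwz the value moves register to register, so neither the scratch slot nor the endian-dependent offset is needed.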

ISA 3.0 (POWER9) Optimizations

  • OP_MODI: modsw replaces divw + mullw + sub (saves 2 insn)
  • OP_MODU: moduw replaces divwu + mullw + sub (saves 2 insn)
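For reference, the baseline three-instruction sequence computes the remainder as a - (a / b) * b, which is what a single modsw/moduw produces directly. A minimal C model of both forms follows; the function names are illustrative only, and the C versions assume b != 0 (and, for the signed case, not INT32_MIN / -1), where C leaves the result undefined.

```c
#include <stdint.h>

/* Baseline signed remainder: divw + mullw + sub */
static int32_t modi_baseline(int32_t a, int32_t b)
{
    int32_t q = a / b;   /* divw: quotient truncated toward zero */
    int32_t p = q * b;   /* mullw */
    return a - p;        /* sub */
}

/* Baseline unsigned remainder: divwu + mullw + sub */
static uint32_t modu_baseline(uint32_t a, uint32_t b)
{
    uint32_t q = a / b;  /* divwu */
    uint32_t p = q * b;  /* mullw */
    return a - p;        /* sub */
}

/* What a single modsw / moduw yields for the same operands */
static int32_t  modi_single(int32_t a, int32_t b)   { return a % b; }
static uint32_t modu_single(uint32_t a, uint32_t b) { return a % b; }
```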

Detection

Uses _ARCH_PWR8/__POWER8_VECTOR__ and _ARCH_PWR9/__POWER9_VECTOR__ compiler predefined macros. All optimizations gracefully fall back to baseline instruction sequences when the target ISA level is not available.
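A minimal sketch of what such compile-time detection could look like. The JIT_* macro and function names are hypothetical, and combining each macro pair with || is an assumption; the actual logic in vm_powerpc.c may differ.

```c
/* Hypothetical names; only the _ARCH_PWR* / __POWER*_VECTOR__ macros
 * come from the compiler. POWER9 is checked first because -mcpu=power9
 * also defines the POWER8 macros. */
#if defined(_ARCH_PWR9) || defined(__POWER9_VECTOR__)
#  define JIT_PPC_ISA 300   /* ISA 3.0: modsw, moduw */
#elif defined(_ARCH_PWR8) || defined(__POWER8_VECTOR__)
#  define JIT_PPC_ISA 207   /* ISA 2.07: mtvsrwa, mfvsrwz */
#else
#  define JIT_PPC_ISA 0     /* baseline instruction sequences only */
#endif

#define JIT_HAS_ISA207 (JIT_PPC_ISA >= 207)
#define JIT_HAS_ISA300 (JIT_PPC_ISA >= 300)

/* Instruction counts per opcode, taken from the lists above. */
static int insns_for_cvif(void) { return JIT_HAS_ISA207 ? 2 : 4; }
static int insns_for_modi(void) { return JIT_HAS_ISA300 ? 1 : 3; }
```

Because the choice is made by the preprocessor, a binary built with -mcpu=power9 emits the ISA 3.0 forms unconditionally and is not expected to run on older CPUs.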

To enable: make CFLAGS='-mcpu=power9' -j$(nproc)

JIT Code Size (qagame.qvm)

| Platform | Baseline (bytes) | Optimized (bytes) | Savings (bytes) |
|---|---|---|---|
| ppc64le (Fedora 43, default=PWR8) | 2,006,832 | 2,006,768 | -64 |
| ppc64 BE (Debian sid, -mcpu=power9) | 2,010,432 | 2,006,944 | -3,488 |

Testing

  • ppc64le (POWER9, Fedora 43): 120-second 64-bot stress test — clean
  • ppc64 BE (POWER9, Debian sid): 120-second 64-bot stress test — clean

Add a section to BUILD.md explaining how to enable Power ISA 2.07 (POWER8) and
ISA 3.0 (POWER9) JIT optimizations via -mcpu compiler flags.
ec- merged commit 1db8e9f into ec-:main on Feb 10, 2026
28 checks passed