Description
I may have found a Metal-backend-specific regression on Intel macOS.
Initially I was not sure whether this was caused by the Homebrew build/packaging specifically, or by KataGo itself on this backend/platform combination. However, after rebuilding the same Homebrew formula from source and reproducing the same issue, I now suspect a KataGo backend/runtime issue much more than a Homebrew packaging issue.
On the same Intel Mac, using the same model and broadly the same config:
- a Homebrew-installed KataGo `v1.16.4` binary plays grossly incorrectly
- rebuilding the Homebrew formula from source still shows the same issue
- an OpenCL KataGo binary bundled with KaTrain behaves normally
- the OpenCL binary is also about 3x faster than the failing binary on this machine
This makes me suspect a Metal backend issue on Intel Mac, rather than a GUI integration problem.
At the same time, this does not seem to affect every model equally on the same build. The problem appears most severe with the b18 model, while smaller nets such as b6 and other tested weights such as b40c256 appear much more normal with the same Homebrew binary.
What I observed
With the failing binary, I saw behavior such as:
- very low-level blunders
- a move gets captured for no good reason
- entire games where the bot appeared far weaker than expected
- this happened with a strong `b18` net in positions where I would not expect such mistakes even at modest visits
I tested with multiple GUI programs, including Sabaki, and saw the same kind of bad behavior.
Why I think this is not just randomness / low visits
I spent time checking this possibility first.
I confirmed separately that low visits can cause large instability in general, but this case looks different because:
- the bad behavior was much more extreme on this macOS binary
- rebuilding the Homebrew formula from source did not change the behavior
- the same `b18` model behaved normally with a different KataGo binary on the same machine
- smaller or other tested weights such as `b6` and `b40c256` appeared much more normal with the same Homebrew binary
- replacing the failing binary with another KataGo binary on the same Mac restored normal play
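
To make the comparison above repeatable, both binaries can be driven over GTP with the same position and the same fixed visit count. The following is only a minimal sketch: the helper names and any paths passed to them are hypothetical, and it assumes `katago gtp` accepts `-override-config` with comma-separated `key=value` pairs and answers every command, including `quit`.

```python
import subprocess


def build_gtp_command(binary, model, config, max_visits):
    """Build a `katago gtp` command line with a fixed visit cap."""
    return [
        binary, "gtp",
        "-model", model,
        "-config", config,
        # Assumption: pin visits and lift the time limit so a slow backend
        # and a fast backend search the same number of visits.
        "-override-config", f"maxVisits={max_visits},maxTime=1000000",
    ]


def parse_gtp_replies(raw):
    """Split GTP stdout into (ok, text) replies.

    A reply starts with '=' (success) or '?' (failure) and ends at a
    blank line.
    """
    replies = []
    for block in raw.split("\n\n"):
        block = block.strip()
        if block.startswith("=") or block.startswith("?"):
            replies.append((block.startswith("="), block[1:].strip()))
    return replies


def play_position(cmd, moves, to_move="b"):
    """Replay `moves` (a list of (color, vertex) pairs) and request one move."""
    lines = ["boardsize 19", "clear_board"]
    lines += [f"play {color} {vertex}" for color, vertex in moves]
    lines += [f"genmove {to_move}", "quit"]
    result = subprocess.run(
        cmd, input="\n".join(lines) + "\n",
        capture_output=True, text=True, check=True,
    )
    replies = parse_gtp_replies(result.stdout)
    # The genmove reply is the second to last one; the last answers `quit`.
    return replies[len(lines) - 2][1]
```

Calling `play_position` once with a command built for the Metal binary and once for the OpenCL binary, on a position where the blunder occurs, gives two moves that can be compared directly. Because search is randomized, a single mismatch proves little; repeating the call several times per binary is more informative.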
Comparison
Failing setup
- KataGo binary: Homebrew-installed `v1.16.4`. `katago version` reports:

      KataGo v1.16.4
      Git revision: <omitted>
      Compile Time: Oct 20 2025 13:40:11
      Using Metal backend

- Machine: Intel MacBook Pro (13-inch, 2020, Four Thunderbolt 3 ports)
- OS: macOS `15.7.1`
- Backend: Metal, as reported by the `katago version` output above. When rebuilding from the Homebrew formula, I also saw the following build command: `cmake -S cpp -B build -DNO_GIT_REVISION=1 -DUSE_BACKEND=METAL -GNinja`
- Model: kata1-b18c384nbt-s9996604416-d4316597426.bin.gz
- Config: custom `gtp_example.cfg` derived from the KataGo example config

I also rebuilt the Homebrew formula from source locally and saw the same issue.
The Homebrew formula appears to be a very thin build recipe using upstream `v1.16.4` source and `-DUSE_BACKEND=METAL`, so at this point a pure Homebrew packaging issue seems less likely than a backend/runtime issue.
Working setup
- KataGo binary: OpenCL binary bundled with KaTrain
- Same machine
- Same model
- Same general usage / same kind of positions
- Performance returned to normal
- Speed was also about 3x faster than the failing binary
Additional notes
- This does not look like a GUI-specific issue, because I reproduced the problem across multiple GUI programs, including Sabaki.
- On the same Homebrew binary, other tested weights such as `b6c96` and `b40c256` appeared much more normal than the failing `b18` weight.
- This does not look like a generic “Intel Mac is too slow” issue, because the OpenCL binary on the same machine is both faster and plays correctly.
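
Since the problem depends on the model, comparing raw network evaluations between the two backends may localize it better than comparing whole games. KataGo's GTP extension `kata-raw-nn` prints a raw evaluation of the current position; the sketch below assumes its reply contains scalar lines of the form `name value` (for example `whiteWin 0.52`) and ignores everything else, such as the policy grid. The function names and the tolerance are illustrative choices, not part of KataGo.

```python
def parse_scalar_fields(reply):
    """Collect `name value` lines from a raw-NN reply into a dict of floats."""
    fields = {}
    for line in reply.splitlines():
        parts = line.split()
        if len(parts) == 2:
            try:
                fields[parts[0]] = float(parts[1])
            except ValueError:
                pass  # not a scalar field, e.g. a two-word header
    return fields


def largest_differences(a, b, tolerance=0.05):
    """Return the shared fields whose values differ by more than `tolerance`,
    ordered from largest to smallest difference."""
    diffs = {key: abs(a[key] - b[key]) for key in a.keys() & b.keys()}
    ranked = sorted(diffs.items(), key=lambda item: -item[1])
    return {key: diff for key, diff in ranked if diff > tolerance}
```

If the replies of the Metal and OpenCL binaries to `kata-raw-nn 0` on the same position differ far beyond ordinary floating-point noise for the `b18` net but not for `b6c96` or `b40c256`, that would point at the network evaluation in the Metal backend rather than at search settings.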
Config in use
The active settings in my config included roughly:
- `rules = tromp-taylor`
- `allowResignation = false`
- `dynamicPlayoutDoublingAdvantageCapPerOppLead = 0.045`
- `maxVisits = 1600` (for context: I can only get 100-200 visits within 30 seconds; I also tested with a fixed 50 visits)
- `maxTime = 30`
- `ponderingEnabled = false`
- `lagBuffer = 1.0`
- `numSearchThreads = 4`
- `searchFactorAfterOnePass = 0.50`
- `searchFactorAfterTwoPass = 0.25`
- `searchFactorWhenWinning = 0.40`
- `searchFactorWhenWinningThreshold = 0.95`
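
With these settings, the limit that actually ends a search depends on how fast the backend is. A rough back-of-the-envelope check, using only the numbers quoted above and ignoring the `searchFactor*` multipliers and `lagBuffer` (the function name is made up for illustration):

```python
def binding_limit(max_visits, max_time_s, observed_visits, observed_seconds):
    """Estimate which limit stops the search first.

    Returns (limit_name, approximate_visits_reached), assuming the visit
    rate stays constant over the whole search.
    """
    rate = observed_visits / observed_seconds  # visits per second
    time_limited_visits = rate * max_time_s
    if max_visits <= time_limited_visits:
        return "maxVisits", float(max_visits)
    return "maxTime", time_limited_visits


# Best case reported above: about 200 visits in 30 seconds.
print(binding_limit(1600, 30, 200, 30))
# Fixed 50-visit test at the slower reported rate of 100 visits per 30 seconds.
print(binding_limit(50, 30, 100, 30))
```

Under these assumptions `maxVisits = 1600` is never reached on the failing binary and `maxTime = 30` is what ends each search, at roughly 100-200 visits, while the fixed 50-visit test is bounded by the visit cap instead.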