Possible Metal backend regression on Intel macOS: v1.16.4 plays grossly incorrectly with b18 weights, while OpenCL behaves normally #1175

@xiangz19

Description

I may have found a Metal-backend-specific regression on Intel macOS.

Initially I was not sure whether this was caused by the Homebrew build/packaging specifically, or by KataGo itself on this backend/platform combination. However, after rebuilding the same Homebrew formula from source and reproducing the same issue, I now suspect a KataGo backend/runtime issue much more than a Homebrew packaging issue.

On the same Intel Mac, using the same model and broadly the same config:

  • a Homebrew-installed KataGo v1.16.4 binary plays grossly incorrectly
  • rebuilding the Homebrew formula from source still shows the same issue
  • an OpenCL KataGo binary bundled with KaTrain behaves normally
  • the OpenCL binary is also about 3x faster than the failing binary on this machine

This makes me suspect a Metal backend issue on Intel Mac, rather than a GUI integration problem.

At the same time, this does not seem to affect every model equally on the same build. The problem appears most severe with the b18 model, while smaller nets such as b6 and other tested weights such as b40c256 appear much more normal with the same Homebrew binary.

What I observed

With the failing binary, I saw behavior such as:

  • very low-level blunders
  • stones that get captured for no good reason
  • entire games where the bot appeared far weaker than expected
  • this happened with a strong b18 net in positions where I would not expect such mistakes even at modest visits

I tested with multiple GUI programs, including Sabaki, and saw the same kind of bad behavior.

Why I think this is not just randomness / low visits

I spent time checking this possibility first.

I confirmed separately that low visits can cause large instability in general, but this case looks different because:

  • the bad behavior was much more extreme on this macOS binary
  • rebuilding the Homebrew formula from source did not change the behavior
  • the same b18 model behaved normally with a different KataGo binary on the same machine
  • smaller or other tested weights such as b6 and b40c256 appeared much more normal with the same Homebrew binary
  • replacing the failing binary with another KataGo binary on the same Mac restored normal play
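
A minimal sketch of how this same-position comparison between two binaries could be scripted over GTP. It assumes both binaries accept the usual `gtp -model ... -config ...` invocation; the helper names and every path in the commented usage are placeholders, not part of my actual setup. A single genmove is not conclusive, since search is randomized, so this would need repeating over several positions.

```python
import subprocess


def build_gtp_script(moves, to_move, board_size=19):
    """Return the GTP commands that replay `moves` and then ask for one move."""
    cmds = [f"boardsize {board_size}", "clear_board"]
    cmds += [f"play {color} {vertex}" for color, vertex in moves]
    cmds += [f"genmove {to_move}", "quit"]
    return cmds


def parse_gtp_replies(stdout_text):
    """Split raw GTP output into (ok, payload) tuples, one per reply.

    GTP replies start with '=' (success) or '?' (failure) and are
    separated by a blank line.
    """
    replies = []
    for block in stdout_text.split("\n\n"):
        block = block.strip()
        if block and block[0] in "=?":
            replies.append((block[0] == "=", block[1:].strip()))
    return replies


def genmove_with(engine_argv, moves, to_move):
    """Run one engine on the position and return its move, or None on failure."""
    cmds = build_gtp_script(moves, to_move)
    result = subprocess.run(
        engine_argv,
        input="\n".join(cmds) + "\n",
        capture_output=True,
        text=True,
        check=True,
    )
    replies = parse_gtp_replies(result.stdout)
    # The genmove reply is the second to last; the last one answers "quit".
    ok, payload = replies[len(cmds) - 2]
    return payload if ok else None


# Hypothetical usage (all paths are placeholders):
# position = [("B", "Q16"), ("W", "D4"), ("B", "Q3")]
# args = ["gtp", "-model", "/path/to/b18.bin.gz", "-config", "/path/to/gtp.cfg"]
# print("Metal :", genmove_with(["/path/to/metal/katago"] + args, position, "W"))
# print("OpenCL:", genmove_with(["/path/to/opencl/katago"] + args, position, "W"))
```

Feeding both binaries the same model, config, and move list keeps the backend as the only variable, which is the point of the comparison above.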

Comparison

Failing setup

  • KataGo binary: Homebrew-installed v1.16.4
    Output of katago version:

    katago version
    KataGo v1.16.4
    Git revision: <omitted>
    Compile Time: Oct 20 2025 13:40:11
    Using Metal backend
    
  • Machine: Intel MacBook Pro, 13-inch, 2020, Four Thunderbolt 3 ports

  • OS: macOS 15.7.1

  • Backend: Metal, as reported by the katago version output above.
    When rebuilding from the Homebrew formula, I also saw the following build command:

    cmake -S cpp -B build -DNO_GIT_REVISION=1 -DUSE_BACKEND=METAL -GNinja

  • Model: kata1-b18c384nbt-s9996604416-d4316597426.bin.gz

  • Config: custom gtp_example.cfg derived from KataGo example config

  • I also rebuilt the Homebrew formula from source locally and saw the same issue.
    The Homebrew formula appears to be a very thin build recipe using upstream v1.16.4 source and -DUSE_BACKEND=METAL, so at this point a pure Homebrew packaging issue seems less likely than a backend/runtime issue.

Working setup

  • KataGo binary: OpenCL binary bundled with KaTrain
  • Same machine
  • Same model
  • Same general usage / same kind of positions
  • Playing strength returned to normal
  • Speed was also about 3x faster than the failing binary

Additional notes

  • This does not look like a GUI-specific issue, because I reproduced the problem across multiple GUI programs, including Sabaki.
  • On the same Homebrew binary, other tested weights such as b6c96 and b40c256 appeared much more normal than the failing b18 weight.
  • This does not look like a generic “Intel Mac is too slow” issue, because the OpenCL binary on the same machine is both faster and plays correctly.

Config in use

The active settings in my config included roughly:

  • rules = tromp-taylor
  • allowResignation = false
  • dynamicPlayoutDoublingAdvantageCapPerOppLead = 0.045
  • maxVisits = 1600 (for context: I can only get 100-200 visits within 30 seconds; I also tested with a fixed 50 visits)
  • maxTime = 30
  • ponderingEnabled = false
  • lagBuffer = 1.0
  • numSearchThreads = 4
  • searchFactorAfterOnePass = 0.50
  • searchFactorAfterTwoPass = 0.25
  • searchFactorWhenWinning = 0.40
  • searchFactorWhenWinningThreshold = 0.95
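
To make the interaction between maxVisits and maxTime concrete, here is a small worked calculation using only the numbers above. The function name is made up for illustration, and it assumes search speed stays roughly constant over a move.

```python
def seconds_needed(max_visits, observed_visits, observed_seconds):
    """Time to reach max_visits at the observed search speed."""
    return max_visits * observed_seconds / observed_visits


MAX_VISITS = 1600  # maxVisits from the config
MAX_TIME = 30      # maxTime from the config, in seconds

# Observed range on the failing binary: 100-200 visits in 30 seconds.
for observed in (100, 200):
    needed = seconds_needed(MAX_VISITS, observed, MAX_TIME)
    print(f"{observed} visits per {MAX_TIME}s -> {needed:.0f}s to reach {MAX_VISITS} visits")
```

At that speed, reaching 1600 visits would take roughly 240 to 480 seconds, so in practice maxTime = 30 is what ends each search and the effective visit count is only 100-200.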
