
[Intel NPU] Add Windows & Linux Intel NPU support#1171

Open
Looong01 wants to merge 29 commits into lightvector:master from Looong01:Intel_NPU

Conversation

Looong01 (author) commented Mar 16, 2026

Summary

This PR adds and hardens the Windows & Linux Intel NPU path for KataGo using the ONNX backend with ONNX Runtime + OpenVINO Execution Provider, and updates docs/config guidance for an end-to-end workflow.

It also improves failure behavior for non-ONNX builds and simplifies Windows & Linux dependency handling.

What Changed

1) ONNX backend and OpenVINO provider support

  • Added/updated ONNX Runtime provider selection via onnxProvider (cpu, openvino, cuda, tensorrt, migraphx, coreml).
  • Added/updated OpenVINO-specific runtime options:
    • onnxOpenVINODeviceType
    • onnxOpenVINODeviceId
    • onnxOpenVINOCacheDir
    • onnxOpenVINOEnableNPUFastCompile (best-effort; depends on ORT build support)
  • Supports both:
    • loading raw .onnx models directly
    • loading .bin/.bin.gz models via internal conversion to ONNX graph
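
Under these options, a GTP config for the Intel NPU path might look like the following fragment (key names from this PR; the values shown are illustrative, not documented defaults):

```ini
# Illustrative fragment for a GTP config; values are examples.
onnxProvider = openvino                  # select the OpenVINO execution provider
onnxOpenVINODeviceType = NPU             # run inference on the Intel NPU
onnxOpenVINODeviceId = 0                 # first (typically only) NPU device
onnxOpenVINOCacheDir = ./ov_cache        # where compiled-model blobs are cached
onnxOpenVINOEnableNPUFastCompile = true  # best-effort; depends on ORT build support
```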

2) exportonnx command behavior

  • exportonnx is available in ONNX builds and exports fixed-size ONNX models.
  • Default export board size is 19x19 (-x/-y can override).
  • In non-ONNX builds, exportonnx now returns a clear error instead of failing ambiguously.

3) Config safety for non-ONNX binaries

  • In non-ONNX builds, forcing onnx* config keys now fails fast with a clear message.
  • Prevents silent misconfiguration when users accidentally pass ONNX-only config into CUDA/OpenCL/Eigen/etc builds.
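
The fail-fast idea can be sketched as a simple key check. This is a hypothetical Python model of the behavior, not KataGo's actual C++ implementation:

```python
# Sketch of the fail-fast idea: reject onnx* keys in non-ONNX builds.
# ONNX_BUILD would be fixed at compile time in the real binary.
ONNX_BUILD = False

def check_config(cfg: dict) -> None:
    """Raise if ONNX-only keys appear in a non-ONNX build's config."""
    if ONNX_BUILD:
        return
    bad = sorted(k for k in cfg if k.startswith("onnx"))
    if bad:
        raise ValueError(
            "Config keys " + ", ".join(bad)
            + " require a build with the ONNX backend"
        )

# A CUDA/OpenCL/Eigen build rejects ONNX-only keys immediately:
try:
    check_config({"onnxProvider": "openvino", "numSearchThreads": 1})
except ValueError as e:
    print("rejected:", e)
```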

4) CMake dependency flow

  • Kept ONNX runtime root wiring via ONNXRUNTIME_ROOT (defaulting to cpp/external/onnxruntime-win-x64-openvino and cpp/external/onnxruntime-linux-x64-openvino).
  • Added/updated automatic dependency fetch flow for Windows & Linux builds (zlib, onnx, protobuf) through vcpkg when enabled.
  • ONNX runtime DLLs or SOs are copied to output dir during build on Windows or Linux.
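
A minimal configure step for this flow might look like the sketch below (option names from this PR; the relative path assumes the default Linux layout described above and should be adjusted to your checkout):

```shell
# Sketch: configure an ONNX-backend build against a local ORT+OpenVINO package.
# ONNXRUNTIME_ROOT defaults to cpp/external/onnxruntime-linux-x64-openvino on Linux.
cmake ../cpp \
  -DUSE_BACKEND=ONNX \
  -DONNXRUNTIME_ROOT=external/onnxruntime-linux-x64-openvino
cmake --build . -j"$(nproc)"
```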

5) Documentation updates

  • Compiling.md:
    • Added explicit Windows & Linux Intel NPU setup steps:
      • Visual Studio Community or VS 2026 Build Tools (Desktop C++)
      • Intel NPU driver install
      • OpenVINO archive install
      • ONNX Runtime build with OpenVINO EP (use_openvino=NPU)
    • Added the exact file-copy checklist into cpp/external/onnxruntime-win-x64-openvino.
    • Added minimal ONNX backend build command.
  • README.md:
    • Added Intel NPU quick-start section for ONNX/OpenVINO.
    • Added minimal commands for:
      • exportonnx (default 19x19)
      • benchmark
      • gtp
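
Taken together, the quick-start commands are of roughly this shape. The flag names and file names here are placeholders inferred from KataGo's usual CLI style, not confirmed by this PR:

```shell
# Placeholder sketch of the documented workflow; flag and file names are assumptions.
./katago exportonnx -model model.bin.gz -output model.onnx   # defaults to 19x19
./katago benchmark -model model.onnx -config default_gtp.cfg
./katago gtp -model model.onnx -config default_gtp.cfg
```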

Behavior Notes

  • Multi-device mapping (onnxDeviceToUseThread*) is mainly intended for ONNX providers like CUDA/TensorRT/MIGraphX.
  • OpenVINO Intel NPU usage is typically single-device.
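
For multi-GPU providers, the thread-to-device mapping might be written as below (illustrative values; per the note above, this is not needed for a single Intel NPU):

```ini
# Illustrative only: map server threads to devices for CUDA/TensorRT/MIGraphX.
onnxDeviceToUseThread0 = 0
onnxDeviceToUseThread1 = 1
# For OpenVINO on an Intel NPU, leave these unset; usage is typically single-device.
```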

Validation

  • ONNX build compiles successfully on Windows & Linux.
  • exportonnx works from .bin/.bin.gz -> .onnx.
  • benchmark/gtp run with onnxProvider=openvino and onnxOpenVINODeviceType=NPU.
  • Non-ONNX binaries now correctly reject ONNX-only config keys.

Looong01 (author) commented Mar 16, 2026

This is a screenshot of Sabaki testing:
[Screenshot 2026-03-16 184650]

And the binary release here: https://github.com/Looong01/KataGo-Multi-backends/releases/tag/v1.16.4-openvino

@Looong01
Copy link
Author

Looong01 commented Mar 16, 2026

I partially referenced the code from #1164, and I am very grateful to @ChinChangYang.

@Looong01 Looong01 changed the title [Intel NPU] Add Windows Intel NPU support [Intel NPU] Add Windows & Linux Intel NPU support Mar 16, 2026
Looong01 (author):

Added Linux support:
[screenshot]

foxrainowo:

This is wonderful work! I will test this backend in a few days.

Looong01 (author) commented Mar 18, 2026

Actually, this PR also includes an AMD GPU ROCm backend. All the tests and benchmarks passed.

I also tried using OpenVINO directly to implement an Intel NPU backend, but its performance was worse than the ONNX Runtime + OpenVINO solution, and it was not stable. So I only pushed the ONNX Runtime + OpenVINO code.

Looong01 (author):

I will implement an AMD NPU backend in the coming days.

foxrainowo commented Mar 20, 2026

@Looong01

I conducted some tests on my device with no issues, successfully calling the Intel NPU:
Using b28c512nbt, the GPU speed was 18–23 visits/s, and the NPU speed was 55–70 visits/s. That’s 2.7 to 3 times faster.
For multi-network matches, the speed reached 1.8 times the original.

I’ve come to a preliminary conclusion: the NPU backend should not be configured with multi-threading. Its initialization time depends on the number of threads set: the more threads, the longer the wait. Multi-threading also slows down the computation itself. For single-game analysis, I use a single thread because it is the fastest and offers the best quality (as shown in the figure below, the 12-thread run is very slow because of initialization). For multi-network matches, I set it to run two games simultaneously, because running too many games at once reduces speed and performance.

  1. Do you expect this backend to affect accuracy, or are there any comparative tests on this?

  2. During initialization, it generates many blob files. What are these blob files?

  3. What is the function of these parameters? Can they be automated, and is it necessary for users to modify them?
    onnxInputSpatial = input_spatial
    onnxInputGlobal = input_global
    onnxInputMeta = input_meta
    onnxOutputPolicy = out_policy
    onnxOutputValue = out_value
    onnxOutputMiscvalue = out_miscvalue
    onnxOutputOwnership = out_ownership
    onnxModelVersion = 15

[screenshots: NPU_benchmark, NPU_match]

Looong01 (author):

> [quoting foxrainowo's test report and questions above]

Thank you for your tests.

  1. No. I did lots of tests, and this backend will NOT affect accuracy.
  2. The blob files are the NPU's compile cache. A model needs to be compiled the first time it runs on the NPU, as with any model targeting an NPU; this is similar to the TensorRT backend, which also generates cache files.
  3. These are low-level engine configuration parameters. Users generally do not need to touch them, but they are useful for debugging.

foxrainowo:

Thank you!

I am concerned about the poor performance of multi-threading. As shown in the figure, when the number of threads increases, the computation speed actually decreases. Is this because the NPU itself is not suitable for multi-threading, or is it still possible to optimize multi-threading at this stage?

kaorahi (contributor) commented Mar 21, 2026

This is amazing on my Linux notebook. I am seeing a 3.5x speedup (87.30 vs 25.16 visits/s) compared to OpenCL, which seems unusually slow on my system. I really appreciate this. As for katago benchmark, it recommends numSearchThreads = 1 in my case as well.

To build ONNX Runtime, I had to downgrade gcc-15 to gcc-14.

CC=gcc-14 CXX=g++-14 CMAKE_PREFIX_PATH=/usr/lib/cmake/openvino2026.0.0 ./build.sh --config Release --use_openvino NPU --build_shared_lib --skip_tests

Also, the source directories seem different from the document, so I used the following commands in zsh.

cd ~/katago/
mkdir -p cpp/external/onnxruntime-linux-x64-openvino/{include,lib/{cmake/onnxruntime,pkgconfig}}
cd cpp/external/onnxruntime-linux-x64-openvino
cp -r ~/onnxruntime/include/onnxruntime/core include/
cp ~/onnxruntime/include/onnxruntime/**/{cpu_provider_factory.h,provider_options.h,onnxruntime_c_api.h,onnxruntime_cxx_api.h,onnxruntime_cxx_inline.h,onnxruntime_env_config_keys.h,onnxruntime_ep_c_api.h,onnxruntime_ep_device_ep_metadata_keys.h,onnxruntime_float16.h,onnxruntime_lite_custom_op.h,onnxruntime_run_options_config_keys.h,onnxruntime_session_options_config_keys.h} include/
cp ~/onnxruntime/build/Linux/Release/**/{libonnxruntime_providers_openvino.so,libonnxruntime_providers_shared.so,libonnxruntime.so.1.*,libonnxruntime.so.1,libonnxruntime.so} lib/
cp ~/onnxruntime/build/Linux/Release/**/{onnxruntimeConfig.cmake,onnxruntimeConfigVersion.cmake,onnxruntimeTargets.cmake,onnxruntimeTargets-release.cmake} lib/cmake/onnxruntime/
cp ~/onnxruntime/build/Linux/Release/**/libonnxruntime.pc lib/pkgconfig/

ChinChangYang (contributor):

Claude detected an issue in a Docker container.

Bug: onnxmodelbuilder.cpp fails to compile on Linux/GCC — ONNX_API macro undefined

Error message:

/path/to/_deps/onnx-build/onnx/onnx-ml.pb.h:42:17:
error: variable 'ONNX_API TableStruct_onnx_2fonnx_2dml_2eproto' has initializer but incomplete type
   42 | struct ONNX_API TableStruct_onnx_2fonnx_2dml_2eproto {
      |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

/path/to/_deps/onnx-build/onnx/onnx-ml.pb.h:43:3:
error: expected primary-expression before 'static'
   43 |   static const uint32_t offsets[];

/path/to/_deps/onnx-build/onnx/onnx-ml.pb.h:48:51:
error: expected initializer before '_AttributeProto_default_instance_'
   48 | ONNX_API extern AttributeProtoDefaultTypeInternal _AttributeProto_default_instance_;

Root cause:

onnxmodelbuilder.cpp includes <onnx/onnx-ml.pb.h> directly, bypassing onnx/onnx_pb.h which defines the ONNX_API macro. When ONNX_API is undefined, the compiler treats it as an identifier rather than an attribute specifier, breaking the struct/extern declarations in the generated protobuf header.


Reproduction steps:

# 1. Clone and checkout this PR branch
git clone https://github.com/lightvector/KataGo.git
cd KataGo
git fetch origin pull/1171/head:pr-1171
git checkout pr-1171

# 2. Download ORT prebuilt + build onnx_proto and protobuf-lite from source
#    (ONNXRUNTIME_ROOT = prebuilt ORT package dir)
#    (ONNX_INCLUDE_DIR = ort-build/_deps/onnx-build)
#    (ONNX_PROTO_LIB   = ort-build/_deps/onnx-build/libonnx_proto.a)
#    (PROTOBUF_INCLUDE_DIR = ort-build/_deps/protobuf-src/src)
#    (PROTOBUF_LIB     = ort-build/_deps/protobuf-build/libprotobuf-lite.a)

# 3. Configure
mkdir build && cd build
cmake ../cpp \
  -DUSE_BACKEND=ONNX \
  -DKATAGO_AUTO_FETCH_DEPS=OFF \
  -DONNXRUNTIME_ROOT=<ort-prebuilt-dir> \
  -DONNX_INCLUDE_DIR=<ort-build>/_deps/onnx-build \
  -DONNX_PROTO_LIB=<ort-build>/_deps/onnx-build/libonnx_proto.a \
  -DPROTOBUF_INCLUDE_DIR=<ort-build>/_deps/protobuf-src/src \
  -DPROTOBUF_LIB=<ort-build>/_deps/protobuf-build/libprotobuf-lite.a \
  -DCMAKE_CXX_FLAGS="-DONNX_ML"

# 4. Build → fails at onnxmodelbuilder.cpp
cmake --build . -j$(nproc)

System environment:

Item          Value
OS            Linux aarch64
Compiler      GCC 15.2.0
ONNX Runtime  v1.21.0
protobuf      3.21.12 (ORT bundled)
CMake         4.2.3

Fix:

In cpp/neuralnet/onnxmodelbuilder.cpp, change line 9:

-#include <onnx/onnx-ml.pb.h>
+#include <onnx/onnx_pb.h>

onnx_pb.h defines ONNX_API before including onnx-ml.pb.h, resolving the macro issue. Note: when using the new ONNX_INCLUDE_DIR cmake variable, onnx_pb.h must also be present in that directory (it lives in the ONNX source tree, not the build output). See also: ChinChangYang/KataGo#18.

Looong01 (author) commented Mar 25, 2026

> [quoting ChinChangYang's bug report above: onnxmodelbuilder.cpp fails to compile on Linux/GCC because ONNX_API is undefined; the proposed fix is to include <onnx/onnx_pb.h> instead of <onnx/onnx-ml.pb.h>]

Actually, my CMakeLists.txt already handles this; I use vcpkg to manage these dependencies.

"https://github.com/Looong01/KataGo-Multi-backends/blob/115e6daba5f8063fd70d7c89631f123cccced902/cpp/CMakeLists.txt".

Or do you still think I need to make this change?

Looong01 (author):

> [quoting foxrainowo's question above about poor multi-threading performance]

Because the NPU is a different architecture (totally different from a GPU or CPU), a single thread is enough for it.

ChinChangYang (contributor):

> [quoting the bug report and Looong01's reply above]

I think you misunderstood my comment. The reproduction steps fetch #1171, i.e. exactly this PR, not mine.

Looong01 (author):

> [quoting the bug report and the preceding exchange above]

But I don't hit any error when I compile it. Maybe it only happens with GCC 15?

ChinChangYang (contributor) commented Mar 26, 2026

> [quoting the preceding exchange above]

11433e6 resolves the issue. Thanks.
