Commit f3d5e9e (1 parent: 706d1bd)

build: new triton cli release to be compatible with the newer tritonserver 25.02 release; fix the pip install conflict issue (#114)

File tree: 4 files changed, +18 -17 lines

README.md (10 additions, 9 deletions)

@@ -49,8 +49,8 @@ and running the CLI from within the latest corresponding `tritonserver`
 container image, which should have all necessary system dependencies installed.
 
 For vLLM and TRT-LLM, you can use their respective images:
-- `nvcr.io/nvidia/tritonserver:25.01-vllm-python-py3`
-- `nvcr.io/nvidia/tritonserver:25.01-trtllm-python-py3`
+- `nvcr.io/nvidia/tritonserver:25.02-vllm-python-py3`
+- `nvcr.io/nvidia/tritonserver:25.02-trtllm-python-py3`
 
 If you decide to run the CLI on the host or in a custom image, please
 see this list of [additional dependencies](#additional-dependencies-for-custom-environments)
@@ -65,6 +65,7 @@ matrix below:
 
 | Triton CLI Version | TRT-LLM Version | Triton Container Tag |
 |:------------------:|:---------------:|:--------------------:|
+| 0.1.3 | v0.17.0.post1 | 25.02 |
 | 0.1.2 | v0.17.0.post1 | 25.01 |
 | 0.1.1 | v0.14.0 | 24.10 |
 | 0.1.0 | v0.13.0 | 24.09 |
@@ -88,7 +89,7 @@ It is also possible to install from a specific branch name, a commit hash
 or a tag name. For example to install `triton_cli` with a specific tag:
 
 ```bash
-GIT_REF="0.1.2"
+GIT_REF="0.1.3"
 pip install git+https://github.com/triton-inference-server/triton_cli.git@${GIT_REF}
 ```
 
@@ -123,7 +124,7 @@ triton -h
 triton import -m gpt2
 
 # Start server pointing at the default model repository
-triton start --image nvcr.io/nvidia/tritonserver:25.01-vllm-python-py3
+triton start --image nvcr.io/nvidia/tritonserver:25.02-vllm-python-py3
 
 # Infer with CLI
 triton infer -m gpt2 --prompt "machine learning is"
@@ -209,10 +210,10 @@ docker run -ti \
   --shm-size=1g --ulimit memlock=-1 \
   -v ${HOME}/models:/root/models \
   -v ${HOME}/.cache/huggingface:/root/.cache/huggingface \
-  nvcr.io/nvidia/tritonserver:25.01-vllm-python-py3
+  nvcr.io/nvidia/tritonserver:25.02-vllm-python-py3
 
 # Install the Triton CLI
-pip install git+https://github.com/triton-inference-server/triton_cli.git@0.1.2
+pip install git+https://github.com/triton-inference-server/triton_cli.git@0.1.3
 
 # Authenticate with huggingface for restricted models like Llama-2 and Llama-3
 huggingface-cli login
@@ -277,7 +278,7 @@ docker run -ti \
   nvcr.io/nvidia/tritonserver:25.01-trtllm-python-py3
 
 # Install the Triton CLI
-pip install git+https://github.com/triton-inference-server/triton_cli.git@0.1.2
+pip install git+https://github.com/triton-inference-server/triton_cli.git@0.1.3
 
 # Authenticate with huggingface for restricted models like Llama-2 and Llama-3
 huggingface-cli login
@@ -331,10 +332,10 @@ docker run -ti \
   -v /tmp:/tmp \
   -v ${HOME}/models:/root/models \
   -v ${HOME}/.cache/huggingface:/root/.cache/huggingface \
-  nvcr.io/nvidia/tritonserver:25.01-trtllm-python-py3
+  nvcr.io/nvidia/tritonserver:25.02-trtllm-python-py3
 
 # Install the Triton CLI
-pip install git+https://github.com/triton-inference-server/triton_cli.git@main
+pip install git+https://github.com/triton-inference-server/triton_cli.git@0.1.3
 
 # Authenticate with huggingface for restricted models like Llama-2 and Llama-3
 huggingface-cli login
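The compatibility matrix updated in the README diff above pairs each CLI release with the TRT-LLM version and Triton container tag it was built against. As a minimal sketch (the dict and helper below are hypothetical, not part of the CLI), that matrix can be expressed as a lookup table:

```python
# Hypothetical encoding of the README compatibility matrix:
# CLI version -> (TRT-LLM version, Triton container tag)
COMPAT = {
    "0.1.3": ("v0.17.0.post1", "25.02"),
    "0.1.2": ("v0.17.0.post1", "25.01"),
    "0.1.1": ("v0.14.0", "24.10"),
    "0.1.0": ("v0.13.0", "24.09"),
}

def container_tag(cli_version: str) -> str:
    """Return the Triton container tag matching a given CLI version."""
    return COMPAT[cli_version][1]
```

For example, `container_tag("0.1.3")` returns `"25.02"`, the container tag this commit targets.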

pyproject.toml (4 additions, 4 deletions)

@@ -50,18 +50,18 @@ dependencies = [
     # Client deps - generally versioned together
     "grpcio>=1.67.0",
     # Use explicit client version matching genai-perf version for tagged release
-    "tritonclient[all] == 2.51",
-    "genai-perf @ git+https://github.com/triton-inference-server/perf_analyzer.git@r25.01#subdirectory=genai-perf",
+    "tritonclient[all] == 2.55.0",
+    "genai-perf @ git+https://github.com/triton-inference-server/perf_analyzer.git@r25.02#subdirectory=genai-perf",
     # Misc deps
     "directory-tree == 0.0.4", # may remove in future
     # https://github.com/docker/docker-py/issues/3256#issuecomment-2376439000
     "docker == 7.1.0",
     # TODO: rely on tritonclient to pull in protobuf and numpy dependencies?
     "numpy >=1.21,<2",
-    "protobuf>=3.7.0",
+    "protobuf>=5.29.3,<6.0dev",
     "prometheus-client == 0.19.0",
     "psutil >= 5.9.5", # may remove later
-    "rich == 13.5.2",
+    "rich >= 13.9.4",
     # TODO: Test on cpu-only machine if [cuda] dependency is an issue,
     "huggingface-hub >= 0.19.4",
     # Testing
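The tightened `protobuf>=5.29.3,<6.0dev` pin above is the dependency change that addresses the pip install conflict named in the commit message. A stdlib-only sketch of checking a plain release version against those bounds (the helper names are hypothetical, and the `dev`-suffix handling of real specifier parsers is omitted):

```python
def parse(version: str) -> tuple:
    """Parse a plain release version like '5.29.3' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))

def in_protobuf_range(version: str) -> bool:
    """Check a release version against the new pin: protobuf>=5.29.3,<6.0."""
    return parse("5.29.3") <= parse(version) < parse("6.0")
```

Real dependency resolution should use a spec-compliant parser such as `packaging.specifiers`; this sketch only illustrates that the old floor `3.7.0` no longer satisfies the pin while `5.29.3` does.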

src/triton_cli/__init__.py (1 addition, 1 deletion)

@@ -24,4 +24,4 @@
 # (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
 # OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 
-__version__ = "0.1.2"
+__version__ = "0.1.3"
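The `__version__` string bumped above lives in the package's `__init__.py`. A small sketch (hypothetical helper, not from this repo) of extracting such a version from Python source text with a regex, a common packaging pattern for single-sourcing the version:

```python
import re

def read_version(source: str) -> str:
    """Extract the __version__ string from Python source text."""
    match = re.search(r'__version__\s*=\s*"([^"]+)"', source)
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group(1)
```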

src/triton_cli/docker/Dockerfile (3 additions, 3 deletions)

@@ -25,13 +25,13 @@
 # OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 
 # TRT-LLM image contains engine building and runtime dependencies
-FROM nvcr.io/nvidia/tritonserver:25.01-trtllm-python-py3
+FROM nvcr.io/nvidia/tritonserver:25.02-trtllm-python-py3
 
 # Setup vLLM Triton backend
 RUN mkdir -p /opt/tritonserver/backends/vllm && \
-    git clone -b r25.01 https://github.com/triton-inference-server/vllm_backend.git /tmp/vllm_backend && \
+    git clone -b r25.02 https://github.com/triton-inference-server/vllm_backend.git /tmp/vllm_backend && \
     cp -r /tmp/vllm_backend/src/* /opt/tritonserver/backends/vllm && \
     rm -r /tmp/vllm_backend
 
 # vLLM runtime dependencies
-RUN pip install "vllm==0.6.3.post1" "setuptools>=74.1.1"
+RUN pip install "vllm==0.7.0" "setuptools>=74.1.1"
