
Commit 449f6b8

Tag 0.0.9 and update versions to 24.06 (#79)
1 parent 2375bc4 · commit 449f6b8

File tree (3 files changed: +11 −10 lines changed)

- README.md
- src/triton_cli/__init__.py
- src/triton_cli/docker/Dockerfile


README.md

Lines changed: 9 additions & 8 deletions
@@ -22,8 +22,8 @@ and running the CLI from within the latest corresponding `tritonserver`
 container image, which should have all necessary system dependencies installed.
 
 For vLLM and TRT-LLM, you can use their respective images:
-- `nvcr.io/nvidia/tritonserver:24.05-vllm-python-py3`
-- `nvcr.io/nvidia/tritonserver:24.05-trtllm-python-py3`
+- `nvcr.io/nvidia/tritonserver:24.06-vllm-python-py3`
+- `nvcr.io/nvidia/tritonserver:24.06-trtllm-python-py3`
 
 If you decide to run the CLI on the host or in a custom image, please
 see this list of [additional dependencies](#additional-dependencies-for-custom-environments)
@@ -38,6 +38,7 @@ matrix below:
 
 | Triton CLI Version | TRT-LLM Version | Triton Container Tag |
 |:------------------:|:---------------:|:--------------------:|
+| 0.0.9              | v0.10.0         | 24.06                |
 | 0.0.8              | v0.9.0          | 24.05                |
 | 0.0.7              | v0.9.0          | 24.04                |
 | 0.0.6              | v0.8.0          | 24.02, 24.03         |
@@ -55,7 +56,7 @@ It is also possible to install from a specific branch name, a commit hash
 or a tag name. For example to install `triton_cli` with a specific tag:
 
 ```bash
-GIT_REF="0.0.8"
+GIT_REF="0.0.9"
 pip install git+https://github.com/triton-inference-server/triton_cli.git@${GIT_REF}
 ```
 
@@ -90,7 +91,7 @@ triton -h
 triton import -m gpt2
 
 # Start server pointing at the default model repository
-triton start --image nvcr.io/nvidia/tritonserver:24.05-vllm-python-py3
+triton start --image nvcr.io/nvidia/tritonserver:24.06-vllm-python-py3
 
 # Infer with CLI
 triton infer -m gpt2 --prompt "machine learning is"
@@ -144,10 +145,10 @@ docker run -ti \
 --shm-size=1g --ulimit memlock=-1 \
 -v ${HOME}/models:/root/models \
 -v ${HOME}/.cache/huggingface:/root/.cache/huggingface \
-nvcr.io/nvidia/tritonserver:24.05-vllm-python-py3
+nvcr.io/nvidia/tritonserver:24.06-vllm-python-py3
 
 # Install the Triton CLI
-pip install git+https://github.com/triton-inference-server/[email protected]
+pip install git+https://github.com/triton-inference-server/[email protected]
 
 # Authenticate with huggingface for restricted models like Llama-2 and Llama-3
 huggingface-cli login
@@ -213,10 +214,10 @@ docker run -ti \
 -v /tmp:/tmp \
 -v ${HOME}/models:/root/models \
 -v ${HOME}/.cache/huggingface:/root/.cache/huggingface \
-nvcr.io/nvidia/tritonserver:24.05-trtllm-python-py3
+nvcr.io/nvidia/tritonserver:24.06-trtllm-python-py3
 
 # Install the Triton CLI
-pip install git+https://github.com/triton-inference-server/[email protected]
+pip install git+https://github.com/triton-inference-server/[email protected]
 
 # Authenticate with huggingface for restricted models like Llama-2 and Llama-3
 huggingface-cli login
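The updated README assumes the matching 24.06 images are available locally before `triton start` or `docker run` is invoked. A minimal pre-flight sketch, assuming access to the `nvcr.io` registry (these pulls are illustrative and not part of the commit):

```bash
# Fetch the 24.06 container images referenced by the updated README
docker pull nvcr.io/nvidia/tritonserver:24.06-vllm-python-py3
docker pull nvcr.io/nvidia/tritonserver:24.06-trtllm-python-py3
```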

src/triton_cli/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -24,4 +24,4 @@
 # (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
 # OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 
-__version__ = "0.0.9dev"
+__version__ = "0.0.9"
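Dropping the `dev` suffix makes the package report the released version. One way to confirm a tagged install picked up this change is to read the attribute back; a sketch, assuming the package imports as `triton_cli` (the probe itself is not from the commit):

```bash
# Install the CLI pinned to the tag cut by this commit
pip install "git+https://github.com/triton-inference-server/[email protected]"

# The attribute set in src/triton_cli/__init__.py should print "0.0.9"
python3 -c "import triton_cli; print(triton_cli.__version__)"
```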

src/triton_cli/docker/Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@ FROM nvcr.io/nvidia/tritonserver:24.06-trtllm-python-py3
 
 # Setup vLLM Triton backend
 RUN mkdir -p /opt/tritonserver/backends/vllm && \
-    wget -P /opt/tritonserver/backends/vllm https://raw.githubusercontent.com/triton-inference-server/vllm_backend/main/src/model.py
+    wget -P /opt/tritonserver/backends/vllm https://raw.githubusercontent.com/triton-inference-server/vllm_backend/r24.06/src/model.py
 
 # vLLM runtime dependencies
 RUN pip install "vllm==0.4.3"
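To exercise the pinned `r24.06` backend script, the image can be built locally. A sketch, assuming the repository root as the build context; the `triton_cli:0.0.9` tag is a hypothetical name, not defined anywhere in this commit:

```bash
# Build the CLI's custom image from the repo root (hypothetical tag)
docker build -t triton_cli:0.0.9 -f src/triton_cli/docker/Dockerfile .
```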
