Minimum System Requirements for NVIDIA RAG Blueprint

The following are the system requirements for the NVIDIA RAG Blueprint.

OS Requirements

Ubuntu 22.04 OS

By default, this blueprint deploys the referenced NIM microservices locally. This requires a minimum of one of the following GPU configurations:

2xH100
2xB200
3xA100

The blueprint can also be modified to use NIM microservices hosted by NVIDIA in the NVIDIA API Catalog.
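
As a quick pre-deployment check, the minimum GPU counts listed above can be compared against what a host reports. The following is a minimal sketch, not part of the blueprint: the function names and the `MIN_GPUS` table are hypothetical, it assumes the product names printed by `nvidia-smi` contain the strings "H100", "B200", or "A100", and it assumes GPUs of different families are not counted together toward a minimum.

```python
import subprocess
from collections import Counter

# Minimum GPU counts for local NIM deployment, copied from the list above.
# The dictionary itself is illustrative and not part of the blueprint.
MIN_GPUS = {"H100": 2, "B200": 2, "A100": 3}


def detect_gpu_names():
    """Return the GPU product names reported by nvidia-smi, one per device."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def meets_minimum(gpu_names):
    """Return True if any single GPU family reaches its minimum count.

    Assumption: a family is recognized by a substring match on the
    product name, and mixed families are not pooled.
    """
    counts = Counter()
    for name in gpu_names:
        for family in MIN_GPUS:
            if family in name.upper():
                counts[family] += 1
    return any(counts[family] >= needed for family, needed in MIN_GPUS.items())
```

On a host with the NVIDIA driver installed, `meets_minimum(detect_gpu_names())` gives a rough yes/no answer. It does not check GPU memory size or the per-component requirements that follow.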

The following are the hardware requirements for each component. The reference code in the solution (glue code) is referred to as the "pipeline".

The NIM and hardware requirements need to be met only if you are self-hosting the NIM microservices with the default RAG settings. See Using self-hosted NVIDIA NIM microservices.

Pipeline operation: 1x L40 GPU or similar recommended. It is needed for the Milvus vector store database, as GPU acceleration is enabled by default.
LLM NIM: NVIDIA llama-3.3-nemotron-super-49b-v1
- For improved parallel performance, we recommend 8x or more H100s/A100s/B200s for LLM inference.
Embedding NIM: Llama-3.2-NV-EmbedQA-1B-v2 (see Support Matrix)
- The pipeline can share the GPU with the Embedding NIM, but it is recommended to have a separate GPU for the Embedding NIM for optimal performance.
Reranking NIM: llama-3_2-nv-rerankqa-1b-v2 (see Support Matrix)
NVIDIA NIM for Image OCR: baidu/paddleocr
NVIDIA NIMs for Object Detection: