chore(pricing): Update vertex-ai pricing #409

Open
siddharthsambharia-portkey wants to merge 1 commit into main from pricing-update/vertex-ai-20260312112803-48l7q1

@siddharthsambharia-portkey
Collaborator

🔄 Pricing Update: vertex-ai

📊 Summary (complete_diff mode)

| Change Type | Count |
| --- | --- |
| ➕ Models added | 0 |
| 🔄 Models updated (merged) | 7 |

🔄 Updated Models

  • gemini-3.1-pro-preview
  • gemini-3.1-flash-image-preview
  • gemini-3.1-flash-lite-preview
  • gemini-3-pro-image-preview
  • gemini-3-flash-preview
  • veo-3.1-fast-generate-001
  • veo-3.0-fast-generate-preview

Model → Pricing Page Mapping

Google – Gemini 3

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| gemini-3.1-pro-preview | Google – Gemini 3 | API | $2/$12, cache $0.2, batch $1/$6, image_token $120/1M; web_search/enterprise $14/1000 |
| gemini-3.1-flash-image-preview | Google – Gemini 3 | API | $0.5/$3, batch $0.25/$1.5, image_token $60/1M |
| gemini-3.1-flash-lite-preview | Google – Gemini 3 | API | $0.25/$1.5, cache $0.025, batch $0.125/$0.75 |
| gemini-3-pro-preview | Google – Gemini 3 | API | $2/$12, cache $0.2, batch $1/$6, image_token $120/1M |
| gemini-3-pro-image-preview | Google – Gemini 3 | API | Matched to Gemini 3 Pro pricing (image variant); $2/$12, cache $0.2, batch $1/$6, image_token $120/1M |
| gemini-3-flash-preview | Google – Gemini 3 | API | $0.5/$3, cache $0.05, batch $0.25/$1.5 |
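
For reviewers sanity-checking the numbers, here is a minimal sketch of how the notes translate into a per-request cost. It assumes that a pair such as `$2/$12` means USD per 1M input tokens / USD per 1M output tokens (the PR does not state the unit explicitly, though `image_token $120/1M` suggests it). The `PRICES_PER_1M` dict and `request_cost` helper are hypothetical names used only for illustration.

```python
from decimal import Decimal

# ASSUMPTION: "$2/$12" is read as USD per 1M input tokens / USD per 1M output tokens.
# Values are copied from the Notes column; the dict itself is illustrative only.
PRICES_PER_1M = {
    "gemini-3.1-pro-preview": (Decimal("2"), Decimal("12")),
    "gemini-3-flash-preview": (Decimal("0.5"), Decimal("3")),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the USD cost of one request under the assumed per-1M-token rates."""
    input_rate, output_rate = PRICES_PER_1M[model]
    total = input_rate * input_tokens + output_rate * output_tokens
    return total / Decimal(1_000_000)


# 10,000 input tokens and 2,000 output tokens on gemini-3.1-pro-preview
cost = request_cost("gemini-3.1-pro-preview", 10_000, 2_000)
```

`Decimal` is used instead of `float` so that small per-token prices do not pick up binary rounding noise.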

Google – Gemini 2.5

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| gemini-2.5-pro | Google – Gemini 2.5 | API | $1.25/$10, cache $0.125, batch $0.625/$5; web_search $35/1000, enterprise $45/1000 |
| gemini-2.5-computer-use-preview-10-2025 | Google – Gemini 2.5 | API | Matched to Gemini 2.5 Pro Computer Use-Preview; $1.25/$10, no cache/batch listed |
| gemini-2.5-flash | Google – Gemini 2.5 | API | $0.30/$2.50, cache $0.030, batch $0.15/$1.25, image_token $30/1M |
| gemini-2.5-flash-preview-09-2025 | Google – Gemini 2.5 | API | Preview alias → same pricing as gemini-2.5-flash |
| gemini-2.5-flash-lite | Google – Gemini 2.5 | API | $0.10/$0.40, cache $0.010, batch $0.05/$0.20 |
| gemini-2.5-flash-lite-preview-09-2025 | Google – Gemini 2.5 | API | Preview alias → same pricing as gemini-2.5-flash-lite |
| gemini-2.5-flash-image | Google – Gemini 2.5 | API | Image output variant → same pricing as gemini-2.5-flash |
| gemini-2.5-flash-image-preview | Google – Gemini 2.5 | API | Image output preview → same pricing as gemini-2.5-flash |

Google – Gemini 2.0

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| gemini-2.0-flash-001 | Google – Gemini 2.0 | API | $0.15/$0.60, batch $0.075/$0.30; web_search $35/1000, enterprise $45/1000 |
| gemini-2.0-flash-lite-001 | Google – Gemini 2.0 | API | $0.075/$0.30, batch $0.0375/$0.15 |

Google – Embedding

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| gemini-embedding-001 | Google – Embedding | API | $0.00015/1K tokens input; batch $0.00012/1K |
| gemini-embedding-2-preview | Google – MultiModal Embeddings | API | $0.2/1M text tokens; image $0.00012/image, video $0.00079/frame, audio $0.00016/sec |
| text-embedding-005 | Google – Embedding | API | $0.000025/1K chars (mapped as $/1K tokens) |
| text-multilingual-embedding-002 | Google – Embedding | API | $0.000025/1K chars (mapped as $/1K tokens) |
| text-embedding-large-exp-03-07 | Google – Embedding | API | Experimental; shares text embedding pricing $0.000025/1K tokens |
| textembedding-gecko | Google – Embedding | API | Legacy; shares text embedding pricing $0.000025/1K tokens |
| multimodalembedding | Google – MultiModal Embeddings | API | Per-image $0.0001, video-plus $0.0020/sec, standard $0.0010/sec, essential $0.0005/sec |
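
The embedding rows mix per-1K and per-1M units, which makes them easy to misread next to the per-1M chat prices. A small sketch of the conversion follows; `per_1k_to_per_1m` is a hypothetical helper name, not something from the pricing agent.

```python
from decimal import Decimal


def per_1k_to_per_1m(usd_per_1k: str) -> Decimal:
    """Convert a USD-per-1K-units price (given as a string) to USD per 1M units."""
    return Decimal(usd_per_1k) * 1000


# Values copied from the Notes column above.
gemini_embedding_input = per_1k_to_per_1m("0.00015")   # gemini-embedding-001 input
gemini_embedding_batch = per_1k_to_per_1m("0.00012")   # gemini-embedding-001 batch
text_embedding = per_1k_to_per_1m("0.000025")          # text-embedding-005 and siblings
```

Under this conversion, gemini-embedding-001 comes to $0.15 per 1M input tokens and the text-embedding family to $0.025 per 1M. Note that the text-embedding rows are listed per 1K characters on the source page and only mapped to tokens, so the per-token figure is an approximation.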

Google – Imagen

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| imagen-4.0-ultra-generate-001 | Google – Imagen | API | $0.06/image; row matched via lookup_variant imagen-4.0-ultra-generate |
| imagen-4.0-ultra-generate-preview-06-06 | Google – Imagen | API | Preview variant; same pricing as ultra $0.06/image |
| imagen-4.0-generate-001 | Google – Imagen | API | $0.04/image; row matched via lookup_variant imagen-4.0-generate |
| imagen-4.0-generate-preview-06-06 | Google – Imagen | API | Preview variant; same pricing $0.04/image |
| imagen-4.0-fast-generate-001 | Google – Imagen | API | $0.02/image; row matched via lookup_variant imagen-4.0-fast-generate |
| imagen-4.0-fast-generate-preview-06-06 | Google – Imagen | API | Preview variant; same pricing $0.02/image |
| imagen-3.0-generate-002 | Google – Imagen | API | $0.04/image (Imagen 3); row matched via lookup_variant imagen-3.0-generate |
| imagen-3.0-capability-001 | Google – Imagen | API | Capability model → uses equivalent Imagen 3 generate price $0.04/image |
| imagen-3.0-capability-002 | Google – Imagen | API | Capability model → uses equivalent Imagen 3 generate price $0.04/image |
| imagen-product-recontext-preview-06-30 | Google – Imagen | API | Product Recontext $0.12/image |
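
The notes above mention rows "matched via lookup_variant", where the variant is the model ID without its trailing version number. The agent's actual matching logic is not shown in this PR; the sketch below is only one plausible way to derive such a key, and `strip_version_suffix` is a hypothetical name.

```python
import re

# ASSUMPTION: the lookup key is the model ID minus a trailing three-digit version
# such as "-001" or "-002". This is a guess at the behavior, not the agent's code.
_VERSION_SUFFIX = re.compile(r"-\d{3}$")


def strip_version_suffix(model_id: str) -> str:
    """Drop a trailing "-NNN" version segment, leaving other IDs unchanged."""
    return _VERSION_SUFFIX.sub("", model_id)


key = strip_version_suffix("imagen-4.0-ultra-generate-001")
```

Preview IDs such as `imagen-4.0-generate-preview-06-06` end in a two-digit segment, so this pattern leaves them untouched; they would need a separate rule.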

Google – Veo

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| veo-3.1-generate-001 | Google – Veo | API | $0.20/sec (720p/1080p); default 8s, 1 sample |
| veo-3.1-generate-preview | Google – Veo | API | Preview alias → same $0.20/sec |
| veo-3.1-fast-generate-001 | Google – Veo | API | $0.10/sec |
| veo-3.1-fast-generate-preview | Google – Veo | API | Preview alias → same $0.10/sec |
| veo-3.0-generate-001 | Google – Veo | API | $0.20/sec |
| veo-3.0-generate-preview | Google – Veo | API | Preview alias → same $0.20/sec |
| veo-3.0-fast-generate-001 | Google – Veo | API | $0.10/sec |
| veo-3.0-fast-generate-preview | Google – Veo | API | Preview alias → same $0.10/sec |
| veo-2.0-generate-001 | Google – Veo | API | $0.50/sec |
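
As a worked example of the per-second pricing, the sketch below multiplies the rate by clip length and sample count. The "default 8s, 1 sample" note appears only on the veo-3.1-generate-001 row; applying the same defaults to other models, and billing each sample separately, are assumptions made here for illustration. `video_cost` is a hypothetical helper.

```python
from decimal import Decimal

# Rates copied from the Notes column above (USD per second of generated video).
VEO_USD_PER_SECOND = {
    "veo-3.1-generate-001": Decimal("0.20"),
    "veo-3.1-fast-generate-001": Decimal("0.10"),
    "veo-2.0-generate-001": Decimal("0.50"),
}


def video_cost(model: str, seconds: int = 8, samples: int = 1) -> Decimal:
    """USD cost of a generation request.

    ASSUMPTION: each sample is billed for its full duration, and the 8s / 1 sample
    defaults noted for veo-3.1-generate-001 are reused for every model.
    """
    return VEO_USD_PER_SECOND[model] * seconds * samples


default_clip = video_cost("veo-3.1-generate-001")
```

With those defaults, one veo-3.1-generate-001 request works out to $1.60, and the fast variant to $0.80.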

Google – Excluded (non-generative / global excludes)

| Model ID | Reason |
| --- | --- |
| lyria-002 | lyria-* music generation; excluded by global rule |
| gemini-2.5-flash-*-live-* / Live API models | -live- streaming; excluded by global rule |
| shieldgemma2 | Guard/safety model |
| pretrained-ocr | OCR model |
| virtual-try-on-001 | virtual-try-on product model; excluded by Google ref |
| imagegeneration | Legacy, superseded by Imagen 3+; excluded by Google ref |
| translate-llm, text-translation, t5gemma | Translation/non-generative inference |
| gemma, gemma2, gemma3, gemma3n, paligemma, codegemma, functiongemma, translategemma, embeddinggemma | Self-deploy only; excluded |
| multimodalembedding variants | Traditional CV/NLP |
| chirp-2, chirp-3, medasr | Audio transcription |
| video-text-detection, video-speech-transcription | Non-generative ML |
| bert-*, t5-*, resnet50, vit-jax, f-vlm-jax | Non-generative ML |
| All imageclassification-*, imageobjectdetection-*, imagesegmentation-* | CV classification/detection |
| occupancy-analytics, vehicle-detector, ppe-detector, people-blur, product-recognizer, tag-recognizer, object-detector, face-detector, dito, jax-owl-vit-v2 | Non-generative CV |
| timesfm, weathernext, weather-next-v2 | Forecasting/non-generative |
| path-foundation, derm-foundation, txgemma, hear, medgemma, medsiglip | Medical/self-deploy |
| earth-ai-imagery-* | Experimental earth imagery; non-generative |
| content-moderation, language-v1-* | Moderation/NLP |
| imagetext, imagewatermarkdetector, image-segmentation-001 | Non-generative or detection |
| pretrained-form-parser, label-detector-pali-001, tab-net, text-detector, bart-large-cnn, vit-jax, pic2word, cloudnerf-pytorch-zipnerf, derm-foundation, cxr-foundation, mammut | Self-deploy or non-generative |
| pt-test, automl-*, tfvision-*, keras-yolov8 | Internal/AutoML/CV |
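
Several of the exclusions above are written as wildcard patterns. A minimal sketch of how such rules could be applied with Python's standard `fnmatch` module is shown below; the pattern and name lists are a small illustrative subset copied from the table, and `is_excluded` is a hypothetical helper, not the agent's implementation.

```python
from fnmatch import fnmatchcase

# Illustrative subset of the rules above; not the agent's full rule set.
EXCLUDE_PATTERNS = [
    "lyria-*",
    "gemini-2.5-flash-*-live-*",
    "bert-*",
    "automl-*",
    "tfvision-*",
    "earth-ai-imagery-*",
]
EXCLUDE_EXACT = {"shieldgemma2", "pretrained-ocr", "imagegeneration"}


def is_excluded(model_id: str) -> bool:
    """True if the ID matches an exact exclusion or any case-sensitive glob pattern."""
    if model_id in EXCLUDE_EXACT:
        return True
    return any(fnmatchcase(model_id, pattern) for pattern in EXCLUDE_PATTERNS)
```

For example, `is_excluded("lyria-002")` is true because of the `lyria-*` pattern, while `is_excluded("gemini-2.5-flash")` is false because the Live pattern requires a `-live-` segment.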

Anthropic – Claude

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| claude-opus-4-6 | Anthropic – Claude | API | Stripped @default; $5/$25, cache_write $6.25 (5m), cache_read $0.50, batch $2.50/$12.50 |
| claude-sonnet-4-6 | Anthropic – Claude | API | Stripped @default; $3/$15, cache_write $3.75, cache_read $0.30, batch $1.50/$7.50 |
| claude-opus-4-5@20251101 | Anthropic – Claude | API | $5/$25, cache_write $6.25, cache_read $0.50, batch $2.50/$12.50 |
| claude-sonnet-4-5@20250929 | Anthropic – Claude | API | $3/$15, cache_write $3.75, cache_read $0.30, batch $1.50/$7.50 |
| claude-haiku-4-5@20251001 | Anthropic – Claude | API | $1/$5, cache_write $1.25, cache_read $0.10, batch $0.50/$2.50 |
| claude-opus-4-1@20250805 | Anthropic – Claude | API | Uniform pricing; $15/$75, cache_write $18.75, cache_read $1.50, batch $7.50/$37.50 |
| claude-opus-4@20250514 | Anthropic – Claude | API | Uniform pricing; $15/$75, cache_write $18.75, cache_read $1.50, batch $7.50/$37.50 |
| claude-sonnet-4@20250514 | Anthropic – Claude | API | Uniform pricing; $3/$15, cache_write $3.75, cache_read $0.30, batch $1.50/$7.50 |
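
The Claude rows above appear to follow fixed ratios: cache_write is 1.25x the input price, cache_read is 0.1x, and batch is half of input and output. That is an observation about the listed numbers, not a rule stated in this PR. The sketch below checks it for a few rows; `ROWS` and `ratios_hold` are hypothetical names.

```python
from decimal import Decimal as D

# (input, output, cache_write, cache_read, batch_input, batch_output),
# copied from the Notes column above.
ROWS = {
    "claude-opus-4-6": (D("5"), D("25"), D("6.25"), D("0.50"), D("2.50"), D("12.50")),
    "claude-sonnet-4-6": (D("3"), D("15"), D("3.75"), D("0.30"), D("1.50"), D("7.50")),
    "claude-haiku-4-5@20251001": (D("1"), D("5"), D("1.25"), D("0.10"), D("0.50"), D("2.50")),
    "claude-opus-4-1@20250805": (D("15"), D("75"), D("18.75"), D("1.50"), D("7.50"), D("37.50")),
}


def ratios_hold(row) -> bool:
    """Check the observed ratios: write 1.25x, read 0.1x, batch 0.5x of base prices."""
    inp, out, cache_write, cache_read, batch_in, batch_out = row
    return (
        cache_write == inp * D("1.25")
        and cache_read == inp * D("0.1")
        and batch_in == inp / 2
        and batch_out == out / 2
    )


# Models whose listed prices break the observed ratios (empty if all are consistent).
inconsistent = [model for model, row in ROWS.items() if not ratios_hold(row)]
```

A check like this is a cheap way to catch a transposed digit in a generated pricing update before merging.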

OpenAI

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| gpt-oss-120b-maas | OpenAI | API | Matched to gpt-oss-120b on page; $0.09/$0.36, batch $0.045/$0.18 |
| clip-vit-base-patch32 | OpenAI | API – excluded | Self-deploy + non-generative embedding |
| openclip | OpenAI | API – excluded | Non-generative embedding |
| whisper-large | OpenAI | API – excluded | Audio transcription; excluded by global rule |
| gpt-oss | OpenAI | API – excluded | Self-deploy only |

Meta – Llama

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| llama-3.3-70b-instruct-maas | Meta – Llama | API | Matched to Llama 3.3 70B; $0.72/$0.72, batch $0.36/$0.36 |
| llama-4-maverick-17b-128e-instruct-maas | Meta – Llama | API | Matched to Llama 4 Maverick; $0.35/$1.15, batch $0.175/$0.575 |
| llama-guard | Meta – Llama | API – excluded | Guard model; excluded by global rule |
| prompt-guard | Meta – Llama | API – excluded | Guard model; excluded by global rule |
| sam3 | Meta – Llama | API – excluded | SAM3 segmentation; non-generative CV |
| faster-r-cnn, retinanet, mask-r-cnn, segment-anything | Meta | API – excluded | Non-generative CV |
| xlm-roberta-large, roberta-large | Meta | API – excluded | Non-generative NLP, self-deploy |
| codellama-7b-hf, llama2, nllb, imagebind, llama-2-quantized, llama3, llama3_1, llama4, llama3-2, llama3-3 | Meta | API – excluded | Self-deploy only |

Mistral AI

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| mistral-small-2503 | Mistral AI | API | Matched to Mistral Small 3.1 (25.03); $0.10/$0.30 |
| mistral-medium-3 | Mistral AI | API | $0.40/$2.00 |
| codestral-2 | Mistral AI | API | $0.30/$0.90 |
| mistral-ocr-2505 | Mistral AI | API – excluded | OCR model |
| codestral-2501-self-deploy | Mistral AI | API – excluded | Self-deploy |
| ministral-3 | Mistral AI | API – excluded | Self-deploy |
| mistral-large-3 | Mistral AI | API – excluded | Self-deploy |
| mistral | Mistral AI (mistral-ai) | API – excluded | Self-deploy |
| mixtral | Mistral AI (mistral-ai) | API – excluded | Self-deploy |

DeepSeek

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| deepseek-r1-0528-maas | DeepSeek | API | Matched to DeepSeek-R1 (0528); $1.35/$5.40, batch $0.675/$2.70 |
| deepseek-v3.1-maas | DeepSeek | API | Matched to DeepSeek-V3.1; $0.60/$1.70, cache $0.06, batch $0.30/$0.85 |
| deepseek-v3.2-maas | DeepSeek | API | Matched to DeepSeek-V3.2; $0.56/$1.68, cache $0.056, batch $0.28/$0.84 |
| deepseek-r1 | DeepSeek | API – excluded | Self-deploy |
| deepseek-v3 | DeepSeek | API – excluded | Self-deploy |
| deepseek-v3-1 | DeepSeek | API – excluded | Self-deploy |
| deepseek-v3-2 | DeepSeek | API – excluded | Self-deploy |
| deepseek-ocr | DeepSeek | API – excluded | OCR model |
| deepseek-ocr-2 | DeepSeek | API – excluded | OCR model, self-deploy |
| deepseek-ocr-maas | DeepSeek | API – excluded | OCR model |

Kimi / Moonshot

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| kimi-k2-thinking-maas | Moonshot – Kimi | API | Matched to Kimi-K2-Thinking; $0.60/$2.50, cache $0.06 |
| kimi-k2 | Moonshot | API – excluded | Self-deploy |
| kimi-k2-5 | Moonshot | API – excluded | Self-deploy |

MiniMax

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| minimax-m2-maas | MiniMax | API | Matched to MiniMax-M2; $0.30/$1.20, cache $0.03 |
| minimax-m2 | MiniMax | API – excluded | Self-deploy |

ZAI / GLM

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| glm-4.7-maas | ZAI – GLM | API | Matched to GLM-4.7; $0.60/$2.20 |
| glm-5-maas | ZAI – GLM | API | Matched to GLM-5; $1/$3.20, cache $0.10 |
| glm-4.7 | ZAI – GLM | API – excluded | Self-deploy |
| glm-5 | ZAI – GLM | API – excluded | Self-deploy |
| glm-4.5 | ZAI – GLM | API – excluded | Self-deploy |
| glm-ocr | ZAI – GLM | API – excluded | OCR model |
| glm-image | ZAI – GLM | API – excluded | Image generation excluded by global policy |

AI21

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| jamba-large-1.6 | AI21 | API – excluded | Self-deploy only; no MaaS variant; no AI21 section on pricing page |

Generated by Pricing Agent on 2026-03-12
