The easiest LLM command-line tool. Remember credentials for multiple providers, switch between them instantly, and chat in your terminal.
- 🔌 Ollama (native API) — no key needed, works out of the box
- 🤖 OpenAI-compatible — OpenAI, vLLM, LiteLLM, LocalAI, Groq, etc.
- 💾 Named presets — save your server/model configs, switch with one flag
- 💬 Interactive REPL — multi-turn conversation with history
- ⚡ Single-shot mode — pipe-friendly, great for scripts
- 🔑 API key storage — secure (`~/.config/llm-cli/presets.json`, mode 600)
```shell
pip install -e .
```

Python 3.10+ required. Only dependency: `httpx>=0.27`.
```shell
# Single-shot (default preset is Ollama on localhost)
llm --prompt "How long do bees live?"

# Specify preset and model explicitly
llm --preset ollama --model glm-5:cloud --prompt "Explain quantum entanglement in one sentence"

# Use a system prompt
llm --system "You are a pirate. Respond only in pirate speak." --prompt "What is the weather today?"

# Point directly at any OpenAI-compatible server
llm --url http://my-vllm:8000 --api-type openai --api-key mytoken --model meta-llama/Llama-3-8b-instruct --prompt "Hello"

# Interactive shell (no --prompt)
llm
llm --preset ollama --model glm-5:cloud
```

```
$ llm
llm-cli v0.1.0 — interactive shell
Preset: ollama | Model: glm-5:cloud | URL: http://localhost:11434
Type a message to chat, or /help for commands.
Press Ctrl-C or type /exit to quit.

> How long do bees live for?
Worker bees live for about 6 weeks during summer...

> /preset list
NAME      TYPE    URL                     MODEL
────────  ──────  ──────────────────────  ───────────
ollama    ollama  http://localhost:11434  glm-5:cloud

> /preset add
── Add a new preset ──
Preset name: my-openai
Server URL: https://api.openai.com/v1
API type [openai]: openai
API key: sk-...
Default model: gpt-4o
Default system prompt: You are a helpful assistant.
✓ Preset 'my-openai' saved.

> /preset use my-openai
✓ Switched to preset 'my-openai' (model: gpt-4o)

> /model gpt-4o-mini
✓ Model set to 'gpt-4o-mini'

> /system You are an expert Python developer.
✓ System prompt set.

> /clear
✓ Conversation history cleared.

> /exit
Bye! 👋
```
| Command | Description |
|---|---|
| `/help` | Show all commands |
| `/exit` or `/quit` | Exit the shell |
| `/status` | Show current preset, model, system prompt, and history length |
| `/clear` | Clear conversation history |
| `/preset list` | List all saved presets |
| `/preset add` | Add a new preset (interactive wizard) |
| `/preset use NAME` | Switch to a different preset |
| `/preset remove NAME` | Delete a preset |
| `/model [NAME]` | Show or set the current model |
| `/system [TEXT]` | Show or set the system prompt |
```
usage: llm [OPTIONS]

options:
  -V, --version      Show version
  -h, --help         Show help

connection:
  -p, --preset NAME  Preset name (default: 'ollama')
  -u, --url URL      Override server URL directly
  --api-type TYPE    API type with --url: 'ollama' or 'openai'
  -k, --api-key KEY  API key override

request:
  -m, --model MODEL  Model name override
  -s, --system TEXT  System prompt
  --prompt TEXT      One-shot prompt (omit for interactive shell)
  --no-stream        Disable streaming
```
Presets are stored in `~/.config/llm-cli/presets.json` (permissions 600).
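Since the file is plain JSON, your own scripts can read it too. A minimal loader sketch, assuming the field names documented below; the strict permission check is an extra precaution of this sketch, not a documented behaviour of the tool:

```python
import json
import os
import stat
from pathlib import Path

# Sketch: read a preset from llm-cli's presets file.
# Field names follow the README's documented format; the mode check
# is this sketch's own precaution, not something the tool enforces.
PRESETS_PATH = Path.home() / ".config" / "llm-cli" / "presets.json"

def load_preset(name: str, path: Path = PRESETS_PATH) -> dict:
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & 0o077:
        raise PermissionError(f"{path} is readable by others (mode {oct(mode)})")
    presets = json.loads(path.read_text())
    return presets[name]
```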
```json
{
  "ollama": {
    "name": "ollama",
    "api_url": "http://localhost:11434",
    "api_type": "ollama",
    "api_key": null,
    "default_model": "glm-5:cloud",
    "default_system_prompt": null
  }
}
```

With several presets saved, the file looks like:

```json
{
  "ollama": { ... },
  "openai": {
    "name": "openai",
    "api_url": "https://api.openai.com/v1",
    "api_type": "openai",
    "api_key": "sk-...",
    "default_model": "gpt-4o"
  },
  "my-vllm": {
    "name": "my-vllm",
    "api_url": "http://my-server:8000",
    "api_type": "openai",
    "api_key": "token-xyz",
    "default_model": "meta-llama/Llama-3-8B"
  }
}
```

| Server | `api_type` | Notes |
|---|---|---|
| Ollama | `ollama` | Uses `/api/chat` NDJSON streaming |
| OpenAI | `openai` | Requires API key |
| vLLM | `openai` | OpenAI-compatible |
| LiteLLM | `openai` | OpenAI-compatible |
| LocalAI | `openai` | OpenAI-compatible |
| Groq | `openai` | Base URL: `https://api.groq.com/openai/v1` |
| Together AI | `openai` | OpenAI-compatible |
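Ollama's `/api/chat` streams newline-delimited JSON: one object per chunk, with the text under `message.content` and a final chunk marked `"done": true`. A hedged sketch of how a client might reassemble the reply (chunk shape per Ollama's documented streaming format, simplified here):

```python
import json
from typing import Iterable

# Sketch: reassemble an Ollama /api/chat NDJSON stream into one string.
# Each line is a JSON object; the streamed text lives at message.content
# and the final chunk sets "done": true (shape per Ollama's API docs).
def collect_ndjson(lines: Iterable[str]) -> str:
    parts = []
    for line in lines:
        if not line.strip():
            continue  # skip blank keep-alive lines
        chunk = json.loads(line)
        parts.append(chunk.get("message", {}).get("content", ""))
        if chunk.get("done"):
            break
    return "".join(parts)
```

In the real client the lines would come from the HTTP response stream (e.g. `httpx`'s `iter_lines()`), printed as they arrive rather than buffered.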
MIT