O R B Y T

A self-hosted LLM API gateway.

One unified endpoint between your application and every major AI provider.

Project Overview ✦ Key Features ✦ Architecture ✦ Diagrams ✦ Structure ✦ Installation ✦ API ✦ Tech Stack

◈ Live Deployment

Service	URL
Dashboard	openrouter-clone-dashboard.vercel.app
Docs	openrouter-clone-docs.vercel.app
API Gateway	openrouter-clone-api-gateway.onrender.com
Primary Backend	orbyt-primary-backend.onrender.com

◈ Project Overview

A unified proxy layer that centralizes access to Large Language Models. Instead of integrating multiple provider SDKs, handling inconsistent streaming outputs, and writing brittle fallback logic for provider outages, you point your application at a single endpoint.

The gateway absorbs the complexity of network failures, latency spikes, and routing logic. If a primary model fails, the gateway retries according to the configured policy and then reroutes the request to a secondary model. As long as one fallback route succeeds, the client never sees the provider error.

◈ Key Features

Core Capabilities

`01` Model Fallback
If a target model returns an error (rate limits, downtime, context violations), the gateway automatically tries the next model in a configured priority list.

`02` Provider Selection
Before sending a request, the system evaluates the available providers and routes the prompt according to the chosen strategy (e.g., `cheapest` or `fastest`).

`03` Retry Policy
Configurable retry behavior before escalating to full model fallback. Transient network errors are retried with an explicit attempt count and delay.

`04` Streaming
Real-time token streaming via Server-Sent Events. The gateway unifies provider-specific chunk formatting into a single, predictable interface.
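
The interplay between the retry policy and model fallback can be sketched as follows. This is a minimal illustration, not the gateway's actual code: `completeWithFallback`, `CallModel`, and `RetryPolicy` are hypothetical names, and a fixed delay between attempts is assumed.

```typescript
// Hypothetical types: none of these names come from the Orbyt codebase.
type CallModel = (model: string) => Promise<string>;

interface RetryPolicy {
  attempts: number; // total tries per model before falling back
  delayMs: number; // pause between tries on the same model
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Try each model in priority order. Each model is retried per the policy
// before the loop moves on to the next entry in the list.
async function completeWithFallback(
  models: string[],
  call: CallModel,
  policy: RetryPolicy,
): Promise<{ model: string; output: string }> {
  let lastError: unknown = new Error("no models configured");
  for (const model of models) {
    for (let attempt = 1; attempt <= policy.attempts; attempt++) {
      try {
        return { model, output: await call(model) };
      } catch (err) {
        lastError = err;
        // Wait only if another attempt on this model remains.
        if (attempt < policy.attempts) await sleep(policy.delayMs);
      }
    }
  }
  // Every model and every retry failed: surface the last error.
  throw lastError;
}
```

Passing the provider call in as a function keeps the loop independent of any particular SDK, which also makes it easy to exercise with a fake that fails on demand.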

Developer Tooling

DevTools Tracing Session

Watch each request in real time as it moves through the system, from pending to completion. A dedicated tracing session gives visibility into the full lifecycle of every execution.

Live Status Updates: Real-time tracking of pending, success, and error states.
Payload Visibility: Full request and response transmission details.
Performance: Latency metrics and exact token usage insights.
Routing: Visibility into retries and provider selection logic.

Everything is transparent, so you always know exactly what’s happening.

Coming Soon

Capability	Impact
Presets	Define model configs in the dashboard and apply them on the fly using @preset in your SDK calls.
Budget Limits	Establish spending maximums per request or per active user.
Multimodality	Direct proxy compatibility for image inputs, PDF document analysis, and video.
Zero Insurance	If all fallback routes and retries fail entirely, the execution is never billed.
BYOK	Bring Your Own Key: take yourself out of the billing path by letting end users provide their own API keys.

Implementation Comparison

Domain	Traditional Setup	Our Gateway
Integration	Maintaining 5+ SDKs and unique payload shapes	A single OpenAI-compatible endpoint
Reliability	Application crashes during provider outages	Automated model and provider fallbacks
Error Handling	Bloated blocks of retry code in application logic	Centralized routing and exponential backoff
Visibility	Blind faith until the monthly invoice arrives	Millisecond tracing and exact token counting

◈ System Architecture

Requests pass through a fixed sequence of layers: rate limiting, provider selection, key leasing, and execution. Failures are classified by origin (client, provider, or gateway), and a Redis-backed key pool keeps traffic within each provider's rate limits.

flowchart LR
    Req([Client Request]) --> RL{Rate Limiter}

    RL -->|Limit Exceeded| Drop[Reject 429]
    RL -->|Approved| Strat[Provider Selector]

    Strat --> Pool[(Redis Key Pool)]
    Pool --> Exec[API Execution]

    Exec -.->|Log Traces| PG[(PostgreSQL)]

    Exec -.->|Transient 5xx| Retry((Retry Policy))
    Retry -.->|Wait & Retry| Exec

    Exec ==>|Success| Stream(((Normalize & Stream)))

    Exec -.->|Hard Error| Engine{Decision Engine}

    Engine -->|Provider Exhausted| Pool
    Engine -->|Model Exhausted| Strat
    Engine -->|Client Error 4xx| DropReq[Reject: Inform User]
    Engine -->|Config Error 5xx| Alert[Alert Dev + Inform User]

When a request enters the gateway, it is first evaluated by a Global Rate Limiter. Requests over the limit are rejected with a 429. Otherwise, the Provider Selector evaluates your fallback configuration and picks the route that best fits the chosen strategy (cheapest, fastest, etc.).
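
As a rough illustration of strategy-based selection, the sketch below picks the healthy provider with the lowest cost or lowest latency. The `ProviderStats` shape, its field names, and the units are assumptions made for the example; only the `cheapest` and `fastest` strategy names come from this README.

```typescript
// Hypothetical shape: field names and units are illustrative only.
type Strategy = "cheapest" | "fastest";

interface ProviderStats {
  name: string;
  costPerMillionTokens: number; // assumed unit: USD per 1M tokens
  avgLatencyMs: number;
  healthy: boolean;
}

// Return the healthy provider that minimizes the metric for the strategy,
// or undefined when no healthy provider is available.
function selectProvider(
  providers: ProviderStats[],
  strategy: Strategy,
): ProviderStats | undefined {
  const metric = (p: ProviderStats) =>
    strategy === "cheapest" ? p.costPerMillionTokens : p.avgLatencyMs;
  return providers
    .filter((p) => p.healthy)
    .reduce<ProviderStats | undefined>(
      (best, p) => (best === undefined || metric(p) < metric(best) ? p : best),
      undefined,
    );
}
```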

The execution runtime then leases the healthiest available API key from the Redis Key Pool (ranked dynamically by remaining TPM/RPM capacity) to hit the LLM provider.
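
The leasing step might look roughly like the sketch below, which uses an in-memory array as a stand-in for the Redis Key Pool. The `PooledKey` fields, the ranking rule (most remaining TPM first, then RPM), and the cooldown-based eviction are assumptions for illustration, not the gateway's actual data model.

```typescript
// In-memory stand-in for the Redis Key Pool. All field names are hypothetical.
interface PooledKey {
  id: string;
  remainingRpm: number; // requests left in the current minute
  remainingTpm: number; // tokens left in the current minute
  evictedUntil: number; // epoch ms; 0 means the key is not evicted
}

// Lease the key with the most token headroom that can still serve the
// request, and reserve the capacity the request is expected to use.
function leaseKey(
  pool: PooledKey[],
  estimatedTokens: number,
  now: number,
): PooledKey | undefined {
  const usable = pool
    .filter(
      (k) =>
        k.evictedUntil <= now &&
        k.remainingRpm >= 1 &&
        k.remainingTpm >= estimatedTokens,
    )
    .sort(
      (a, b) =>
        b.remainingTpm - a.remainingTpm || b.remainingRpm - a.remainingRpm,
    );
  const key = usable[0];
  if (key === undefined) return undefined;
  key.remainingRpm -= 1;
  key.remainingTpm -= estimatedTokens;
  return key;
}

// Temporarily remove a failing key from rotation.
function evictKey(key: PooledKey, now: number, cooldownMs: number): void {
  key.evictedUntil = now + cooldownMs;
}
```

Passing `now` explicitly instead of reading the clock inside the functions keeps the ranking deterministic and easy to test.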

If the API execution encounters an anomaly:

Transient network errors trigger your designated retry policy with specific delays.
Hard provider failures are intercepted by the Decision Engine. The engine temporarily evicts the bad key and leases another one from the key pool (Provider Exhausted), or returns to the Provider Selector to choose a new route (Model Exhausted).
Bad parameters (4xx) are bounced directly back to the client.
Internal gateway errors (5xx) notify engineering telemetry while returning a safe failure state to the client.
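
The four cases above can be read as a small classification function. The sketch below is one possible reading of the flowchart, not the Decision Engine's real implementation: the `Failure` and `Action` types, the `origin` field, and the rule that only provider 5xx responses are retried are all assumptions.

```typescript
// Hypothetical types describing a failed execution and the next step.
interface Failure {
  origin: "client" | "provider" | "gateway";
  status: number; // HTTP status associated with the failure
  retriesLeft: number; // attempts remaining under the retry policy
  keysLeft: number; // usable keys remaining for this provider
}

type Action =
  | { kind: "retry" } // transient error: wait and retry
  | { kind: "next_key" } // Provider Exhausted: back to the key pool
  | { kind: "next_model" } // Model Exhausted: back to the Provider Selector
  | { kind: "reject"; status: number } // client error: inform the user
  | { kind: "alert_and_fail" }; // gateway error: alert devs, fail safely

function decide(f: Failure): Action {
  // Bad parameters are bounced straight back to the client.
  if (f.origin === "client") return { kind: "reject", status: f.status };
  // Internal gateway errors notify telemetry and return a safe failure.
  if (f.origin === "gateway") return { kind: "alert_and_fail" };
  // Provider failures: retry transient 5xx while attempts remain.
  if (f.status >= 500 && f.retriesLeft > 0) return { kind: "retry" };
  // Otherwise try another key, then fall back to another route.
  if (f.keysLeft > 0) return { kind: "next_key" };
  return { kind: "next_model" };
}
```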

All telemetry records and trace logs are saved asynchronously to PostgreSQL.

◈ System Architecture Diagrams

UML Diagrams: Class Diagram, Sequence Diagram, Use Case Diagram

Database Schema: ER Diagram

◈ Project Structure

/
├── apps
│   ├── api-gateway/    — Execution router and decision logic
│   ├── dashboard/      — Administrative interface for telemetry
│   ├── devtools/       — Traces UI and system debug views
│   └── primary-backend/— Authentication and configuration state
├── packages
│   ├── config/         — Centralized system configurations
│   ├── db/             — Prisma data layer and PostgreSQL schemas
│   ├── eslint-config/  — Monorepo linting synchronization
│   ├── types/          — Inter-service TypeScript definitions
│   ├── typescript-config/ — Shared TS compilation settings
│   ├── ui/             — Shared React component library
│   └── utils/          — Standardized helper libraries
└── turbo.json          — Build orchestration

◈ Installation

Prerequisites: Node.js v18+, PostgreSQL.

Initial Setup

1. Clone the repository

git clone https://github.com/Srijan76-code/openrouter-clone.git
cd openrouter-clone
npm install

2. Configure Environment

Duplicate .env.example to .env.

DATABASE_URL="postgresql://user:password@localhost:5432/gateway"
PORT="4000"

3. Initialize Database

npx turbo run db:generate
npx turbo run db:push

4. Start

npm run dev

5. Active Port Mapping

Application	Local Port
Dashboard	`3000`
API Gateway	`3001`
Primary Backend	`4000`
DevTools	`4983`

Tip: Traces have a dedicated UI. Open localhost:4983 in your browser, send a request to the API Gateway (for example with Postman), and watch the telemetry populate in real time.

◈ API Configuration

Routing is configured declaratively through an `extra` object passed alongside the standard chat completion parameters.

import OpenAI from "openai";

// Point the standard OpenAI SDK at the gateway instead of the provider.
const openai = new OpenAI({
  baseURL: "https://openrouter-clone-api-gateway.onrender.com/v1",
  apiKey: "gateway-sk-12345",
});

const response = await openai.chat.completions.create({
  model: "google/gemini-3.1-pro", // Primary model
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "What is the capital of Germany?" },
  ],
  temperature: 0.7,
  stream: false,

  // --- OPENROUTER CUSTOM EXTENSIONS ---
  extra: {
    // Tried in order if the primary model fails
    fallback_models: [
      "anthropic/claude-3-haiku",
      "google/gemini-2.5-flash",
    ],
    provider: "cheap", // Override standard routing mechanism
    retry: 3, // Set custom retry handler count
  },
});

console.log(response);
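
With `stream: true`, the response arrives as Server-Sent Events. The sketch below shows how a client could reassemble the text from the raw event stream without the SDK. It assumes the gateway emits OpenAI-style chunks (`choices[0].delta.content`) terminated by `data: [DONE]`, which this README does not confirm; `parseSseLine` and `collectText` are hypothetical helper names.

```typescript
// One parsed event: the text fragment it carries and whether the stream ended.
interface Delta {
  content: string;
  done: boolean;
}

// Parse a single SSE line. Lines that are not "data:" fields
// (blank separators, comments) yield undefined.
function parseSseLine(line: string): Delta | undefined {
  if (!line.startsWith("data:")) return undefined;
  const payload = line.slice("data:".length).trim();
  // Assumed OpenAI-style end-of-stream sentinel.
  if (payload === "[DONE]") return { content: "", done: true };
  const chunk = JSON.parse(payload) as {
    choices?: { delta?: { content?: string } }[];
  };
  return { content: chunk.choices?.[0]?.delta?.content ?? "", done: false };
}

// Concatenate the text fragments of a complete raw SSE body.
function collectText(rawStream: string): string {
  let text = "";
  for (const line of rawStream.split(/\r?\n/)) {
    const delta = parseSseLine(line);
    if (delta === undefined) continue;
    if (delta.done) break;
    text += delta.content;
  }
  return text;
}
```

In practice the OpenAI SDK handles this parsing when `stream: true` is set; the sketch only makes the wire format visible.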

◈ Tech Stack

Domain	Technology	Implementation Objective
API Gateway	Node.js & Express	Proxying high-throughput streams and evaluating error limits.
Type Safety	TypeScript	Structuring rigid data contracts across internal monorepo packages.
Monorepo	Turborepo	Facilitating isolated builds and rapid cache-hitting deployments.
Database	PostgreSQL & Prisma	Relational data persistence for telemetry and configuration states.
Dashboard	Next.js 14	Delivering a lightweight React interface for configuration tracking.
