Skip to content

Srijan76-code/openrouter-clone

Repository files navigation


O R B Y T

A self-hosted LLM API gateway.

One unified endpoint between your application and every major AI provider.


Node.js TypeScript Next.js Prisma


Project OverviewKey FeaturesArchitectureDiagramsStructureInstallationAPITech Stack

◈ Live Deployment

Service URL
Dashboard openrouter-clone-dashboard.vercel.app
Docs openrouter-clone-docs.vercel.app
API Gateway openrouter-clone-api-gateway.onrender.com
Primary Backend orbyt-primary-backend.onrender.com

◈ Project Overview

A unified proxy layer that centralizes access to Large Language Models. Instead of managing complex integration with multiple provider SDKs, handling inconsistent streaming outputs, or writing brittle fallback logic to handle provider outages, you point your application to a single endpoint.

The gateway absorbs the complexity of network failures, latency spikes, and routing logic. If a primary model fails, the gateway immediately reroutes the execution to a secondary model. Uptime is preserved structurally and the client never sees the error.


◈ Key Features

Core Capabilities

01 Model Fallback 02 Provider Selection
If a target model returns an error (rate limits, downtime, context violations), the gateway automatically tries the next model in a configured priority list. Before sending a request, the system evaluates available providers. Route prompts dynamically based on strategy (e.g., cheapest or fastest).
03 Retry Policy 04 Streaming
Configurable retry behavior before escalating to full model fallback. Handles transient network errors gracefully using explicit attempts and delay logic. Real-time token streaming via Server Sent Events. The gateway unifies provider-specific chunk formatting into a single, predictable interface.

Developer Tooling

DevTools Tracing Session See your request in real-time as it moves through the system—from pending to completion. Gain deep visibility into the full lifecycle of every execution within a dedicated tracing session.

  • Live Status Updates: Real-time tracking of pending, success, and error states.
  • Payload Visibility: Full request and response transmission details.
  • Performance: Latency metrics and exact token usage insights.
  • Routing: Visibility into retries and provider selection logic.

Everything is transparent, so you always know exactly what’s happening.


Coming Soon

Capability Impact
Presets Define model configs in the dashboard and apply them on the fly using @preset in your SDK calls.
Budget Limits Establish spending maximums per request or per active user.
Multimodality Direct proxy compatibility for image inputs, PDF document analysis, and video.
Zero Insurance If all fallback routes and retries fail entirely, the execution is never billed.
BYOK Unbind yourself from billing by letting end-users provide their own API keys.

Implementation Comparison

Domain Traditional Setup Our Gateway
Integration Maintaining 5+ SDKs and unique payload shapes A single OpenAI-compatible endpoint
Reliability Application crashes during provider outages Automated model and provider fallbacks
Error Handling Bloated blocks of retry code in application logic Centralized routing and exponential backoff
Visibility Blind faith until the monthly invoice arrives Millisecond tracing and exact token counting

◈ System Architecture

Requests flow through highly structured layers. Processing logic is deterministic, isolating faults based on origin while utilizing high-throughput data stores to protect downstream limits.

flowchart LR
    Req([Client Request]) --> RL{Rate Limiter}

    RL -->|Limit Exceeded| Drop[Reject 429]
    RL -->|Approved| Strat[Provider Selector]

    Strat --> Pool[(Redis Key Pool)]
    Pool --> Exec[API Execution]

    Exec -.->|Log Traces| PG[(PostgreSQL)]

    Exec -.->|Transient 5xx| Retry((Retry Policy))
    Retry -.->|Wait & Retry| Exec

    Exec ==>|Success| Stream(((Normalize & Stream)))

    Exec -.->|Hard Error| Engine{Decision Engine}

    Engine -->|Provider Exhausted| Pool
    Engine -->|Model Exhausted| Strat
    Engine -->|Client Error 4xx| DropReq[Reject: Inform User]
    Engine -->|Config Error 5xx| Alert[Alert Dev + Inform User]
Loading

When a request enters the gateway, it is first evaluated by a Global Rate Limiter. If traffic bounds are respected, the Provider Selector evaluates your fallback configurations to pick the optimal mathematical route (cheapest, fastest, etc.).

The execution runtime then leases the healthiest available API key from the Redis Key Pool (ranked dynamically by remaining TPM/RPM capacity) to hit the LLM provider.

If the API execution encounters an anomaly:

  • Transient network errors trigger your designated retry policy with specific delays.
  • Hard provider failures are intercepted by the Decision Engine. The engine temporarily evicts the bad key and cycles to the Provider Exhausted queue, or re-evaluates a new provider entirely (Model Exhausted).
  • Bad parameters (4xx) are bounced directly back to the client.
  • Internal gateway errors (5xx) notify engineering telemetry while returning a safe failure state to the client.

All telemetry records and trace logs are saved asynchronously to PostgreSQL.


◈ System Architecture Diagrams

View System Diagrams

UML Diagrams

Class Diagram Class Diagram

Sequence Diagram Sequence Diagram

Usecase Diagram Usecase Diagram


Database Schema

ER Diagram ER Diagram


◈ Project Structure

/
├── apps
│   ├── api-gateway/    — Execution router and decision logic
│   ├── dashboard/      — Administrative interface for telemetry
│   ├── devtools/       — Traces UI and system debug views
│   └── primary-backend/— Authentication and configuration state
├── packages
│   ├── config/         — Centralized system configurations
│   ├── db/             — Prisma data layer and PostgreSQL schemas
│   ├── eslint-config/  — Monorepo linting synchronization
│   ├── types/          — Inter-service TypeScript definitions
│   ├── typescript-config/ — Shared TS compilation settings
│   ├── ui/             — Shared React component library
│   └── utils/          — Standardized helper libraries
└── turbo.json          — Build orchestration

◈ Installation

Prerequisites: Node.js v18+, PostgreSQL.

Initial Setup

1. Clone the repository

git clone https://github.com/Srijan76-code/openrouter-clone.git
cd openrouter-clone
npm install

2. Configure Environment Duplicate .env.example to .env.

DATABASE_URL="postgresql://user:password@localhost:5432/gateway"
PORT="4000"

3. Initialize Database

npx turbo run db:generate
npx turbo run db:push

4. Start

npm run dev

5. Active Port Mapping

Application Local Port
Dashboard 3000
API Gateway 3001
Primary Backend 4000
DevTools 4983

Tip: Traces have a dedicated UI. Open localhost:4983 in your browser, run a request via Postman to the API Gateway, and watch the telemetry populate in real-time.


◈ API Configuration

Harnessing the routing engine requires minimal declarative configuration inside standard structures.

import OpenAI from "openai";

const openai = new OpenAI({
  baseURL: "https://openrouter-clone-api-gateway.onrender.com/v1",
  apiKey: "gateway-sk-12345",
});

const response = await openai.chat.completions.create({
      model: "google/gemini-3.1-pro", // Primary Model
      messages: [
        { role: "system", content: "You are a helpful assistant." },
        { role: "user", content: "What is the capital of germany?" },
      ],
      temperature: 0.7,
      stream: false,

      // --- OPENROUTER CUSTOM EXTENSIONS ---
      extra: {
        fallback_models: [
          "anthropic/claude-3-haiku",
          "google/gemini-2.5-flash",
        ],
        provider: "cheap", // Override standard routing mechanism
        retry: 3, // Set custom retry handler count
      },
    });

console.log(response)

◈ Tech Stack

Domain Technology Implementation Objective
API Gateway Node.js & Express Proxying high-throughput streams and evaluating error limits.
Type Safety TypeScript Structuring rigid data contracts across internal monorepo packages.
Monorepo Turborepo Facilitating isolated builds and rapid cache-hitting deployments.
Database PostgreSQL & Prisma Relational data persistence for telemetry and configuration states.
Dashboard Next.js 14 Delivering a lightweight React interface for configuration tracking.


Releases

No releases published

Packages

 
 
 

Contributors