vLLM Master Gateway

OpenAI-compatible unified host/port with priority routing, least-busy selection, and live metrics.
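
As a rough illustration of how priority routing and least-busy selection can combine, the sketch below picks the healthy upstream with the fewest in-flight requests from the best available tier. The `Upstream` class, the `pick_upstream` function, and the convention that a lower tier number means higher priority are assumptions made for this example, not the gateway's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Upstream:
    """Hypothetical view of one upstream endpoint (names are illustrative)."""
    name: str
    tier: int            # assumption: lower number = higher priority
    inflight: int        # requests currently being served
    healthy: bool = True


def pick_upstream(upstreams):
    """Return the least-busy healthy upstream in the best tier, or None."""
    healthy = [u for u in upstreams if u.healthy]
    if not healthy:
        return None
    # Priority routing: only consider the best tier that still has a healthy member.
    best_tier = min(u.tier for u in healthy)
    candidates = [u for u in healthy if u.tier == best_tier]
    # Least-busy selection: fewest in-flight requests wins.
    return min(candidates, key=lambda u: u.inflight)
```

In this sketch a lower-priority tier is used only when every upstream in the higher tiers is unhealthy; a real gateway might also spill over when a tier is saturated.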
Live Operations

Monitor live routing and backend health

Runtime status, active requests, and backend health are shown here for quick operational checks.

Current Running Requests

Live in-flight requests, with token counts tracked from upstream usage data
Request ID | IP | Username | Started (Local) | Model | Endpoint | Stream | Elapsed | TTFT | Input Tokens | Output Tokens

Master API Keys

Client auth for /v1/*
One API key per line. Saving here replaces the accepted master keys. Values are stored outside config.yaml.
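
The one-key-per-line format could be handled along the lines of the sketch below. It assumes that blank lines and duplicates are ignored and that clients send the key as `Authorization: Bearer <key>`, as OpenAI-compatible clients normally do; `parse_master_keys` and `is_authorized` are hypothetical helper names, not functions taken from the gateway.

```python
import hmac


def parse_master_keys(text):
    """Split the textarea content into a list of unique, non-empty keys."""
    keys = []
    seen = set()
    for line in text.splitlines():
        key = line.strip()
        if key and key not in seen:   # assumption: blanks and duplicates are dropped
            seen.add(key)
            keys.append(key)
    return keys


def is_authorized(authorization_header, accepted_keys):
    """Check a Bearer token against the accepted master keys."""
    scheme, _, token = (authorization_header or "").partition(" ")
    if scheme.lower() != "bearer":
        return False
    token_bytes = token.strip().encode("utf-8")
    # compare_digest avoids leaking key contents through comparison timing.
    return any(
        hmac.compare_digest(token_bytes, key.encode("utf-8"))
        for key in accepted_keys
    )
```

Because saving replaces the accepted keys, the list returned by `parse_master_keys` would stand in for the previous set entirely rather than being merged into it.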

Models and Upstream Endpoints

Read-only runtime view
Endpoint changes should be made through config.yaml or the raw configuration editor below.
Traffic Analytics

Inspect live request quality and load distribution

These tables focus on operational visibility for the current day across models, upstreams, requesters, and recent failures.

Per-model Metrics (Today)

Aggregated by public model name
Model | Req | Success | Prompt | Gen | Avg Lat | Avg TTFT | Avg tok/s
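
One plausible way to derive these per-model columns from individual request records is sketched below. The record field names (`model`, `success`, `prompt_tokens`, `gen_tokens`, `latency_s`, `ttft_s`) are hypothetical, and the choice to average only successful requests and to compute tok/s as generated tokens divided by total latency is an assumption; the gateway may define these figures differently.

```python
from collections import defaultdict


def _mean(values):
    """Arithmetic mean, or None when there is nothing to average."""
    values = list(values)
    return sum(values) / len(values) if values else None


def aggregate_by_model(records):
    """Group request records by public model name and summarize each group."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["model"]].append(rec)

    rows = {}
    for model, recs in groups.items():
        ok = [r for r in recs if r["success"]]
        rows[model] = {
            "req": len(recs),
            "success": len(ok),
            "prompt": sum(r["prompt_tokens"] for r in ok),
            "gen": sum(r["gen_tokens"] for r in ok),
            "avg_lat": _mean(r["latency_s"] for r in ok),
            "avg_ttft": _mean(r["ttft_s"] for r in ok if r["ttft_s"] is not None),
            # Assumption: per-request throughput, averaged across requests.
            "avg_tok_s": _mean(
                r["gen_tokens"] / r["latency_s"] for r in ok if r["latency_s"] > 0
            ),
        }
    return rows
```

The same grouping could be keyed by endpoint or client IP instead of model name to produce the other per-day summaries.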

Per-endpoint Metrics (Today)

Aggregated by routed upstream
Endpoint | Tier | Req | Success | Prompt | Gen | Avg Lat | Avg TTFT | Avg tok/s

Per-IP Metrics (Today)

Useful for analyzing usage by individual colleagues
IP | Req | Success | Prompt | Gen | Avg Lat | Avg TTFT | Avg tok/s | Last Seen (Local)

Recent Requests

Latest 50 requests
IP | Username | Model | Endpoint | Status | Stream | Prompt | Gen | Latency | TTFT | tok/s | Error | Local Time
Advanced Config

Edit the raw gateway configuration directly

This editor is best for advanced changes that are not yet exposed through the Web UI. Security-sensitive secrets are stored separately.

Config File Editor

Pause auto refresh before editing. The master also reloads external file changes automatically.
Admin password, admin session secret, and master API keys are stored separately from config.yaml.