vLLM Master Gateway
OpenAI-compatible unified host/port with priority routing, least-busy selection, and live metrics.
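As a rough illustration of how priority routing and least-busy selection might combine, the sketch below filters to healthy endpoints, keeps the best tier, and then picks the endpoint with the fewest in-flight requests. The `Endpoint` class, its fields, and the convention that a lower tier number means higher priority are assumptions for illustration, not the gateway's actual code.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Endpoint:
    # Hypothetical fields; the real gateway's data model may differ.
    url: str
    tier: int          # assumed: lower number = higher priority
    inflight: int = 0  # requests currently being served
    healthy: bool = True


def pick_endpoint(endpoints: Sequence[Endpoint]) -> Optional[Endpoint]:
    """Return the least-busy endpoint in the best available tier, or None."""
    candidates = [e for e in endpoints if e.healthy]
    if not candidates:
        return None
    best_tier = min(e.tier for e in candidates)
    in_tier = [e for e in candidates if e.tier == best_tier]
    # Ties on the in-flight count resolve to the first endpoint listed.
    return min(in_tier, key=lambda e: e.inflight)
```

With this shape, a lower-priority tier is only used when every endpoint in the higher tier is unhealthy; a real router might also spill over when the higher tier is saturated.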
Current Running Requests
Live in-flight requests, with token counts tracked from upstream usage data
| Request ID | IP | Username | Started (Local) | Model | Endpoint | Stream | Elapsed | TTFT | Input Tokens | Output Tokens |
Master API Keys
Client auth for /v1/*
One API key per line. Saving here replaces the accepted master keys. Values are stored outside config.yaml.
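A minimal sketch of how a one-key-per-line list could be parsed and checked against an incoming request, assuming clients send the key in the standard OpenAI-style `Authorization: Bearer <key>` header. The function names `parse_keys` and `is_authorized` are hypothetical and not taken from the gateway.

```python
import hmac
from typing import Optional, Set


def parse_keys(text: str) -> Set[str]:
    """Parse the editor contents: one key per line, blank lines ignored."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def is_authorized(authorization: Optional[str], keys: Set[str]) -> bool:
    """Check an `Authorization` header value against the accepted keys."""
    if not authorization:
        return False
    scheme, _, token = authorization.partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        return False
    # compare_digest avoids leaking key contents through timing differences.
    return any(hmac.compare_digest(token.encode(), k.encode()) for k in keys)
```

Because saving replaces the accepted keys, a sketch like this would rebuild the whole set from the submitted text rather than merging it into the previous one.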
Models and Upstream Endpoints
Read-only runtime view
Endpoint changes should be made through config.yaml or the raw configuration editor below.
Per-model Metrics (Today)
Aggregated by public model name
| Model | Req | Success | Prompt | Gen | Avg Lat | Avg TTFT | Avg tok/s |
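The columns above could be derived from per-request records roughly as follows. This is a sketch under stated assumptions: the `RequestRecord` fields are hypothetical, and tok/s is assumed to mean generated tokens divided by the time spent after the first token, averaged over successful requests; the gateway may define these differently.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class RequestRecord:
    # Hypothetical per-request record; field names are illustrative.
    model: str
    ok: bool
    prompt_tokens: int
    gen_tokens: int
    latency_s: float  # total request time in seconds
    ttft_s: float     # time to first token in seconds


def aggregate_by_model(records: Iterable[RequestRecord]) -> Dict[str, dict]:
    """Group records by public model name and compute the table columns."""
    groups: Dict[str, List[RequestRecord]] = {}
    for r in records:
        groups.setdefault(r.model, []).append(r)
    out: Dict[str, dict] = {}
    for model, rs in groups.items():
        n = len(rs)
        # Assumed definition: decode throughput, skipping failed requests.
        rates = [
            r.gen_tokens / (r.latency_s - r.ttft_s)
            for r in rs
            if r.ok and r.latency_s > r.ttft_s
        ]
        out[model] = {
            "req": n,
            "success": sum(r.ok for r in rs),
            "prompt": sum(r.prompt_tokens for r in rs),
            "gen": sum(r.gen_tokens for r in rs),
            "avg_lat": sum(r.latency_s for r in rs) / n,
            "avg_ttft": sum(r.ttft_s for r in rs) / n,
            "avg_tok_s": sum(rates) / len(rates) if rates else 0.0,
        }
    return out
```

The same grouping would work for the per-endpoint and per-IP views by swapping the grouping key.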
Per-endpoint Metrics (Today)
Aggregated by routed upstream
| Endpoint | Tier | Req | Success | Prompt | Gen | Avg Lat | Avg TTFT | Avg tok/s |
Per-IP Metrics (Today)
Useful for analyzing usage by individual colleagues
| IP | Req | Success | Prompt | Gen | Avg Lat | Avg TTFT | Avg tok/s | Last Seen (Local) |
Recent Requests
Latest 50 requests
| IP | Username | Model | Endpoint | Status | Stream | Prompt | Gen | Latency | TTFT | tok/s | Error | Local Time |
Config File Editor
Pause auto refresh before editing. The master also reloads external file changes automatically.
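One common way to pick up external file changes is to poll the file's modification time and reload when it changes. The sketch below shows that approach; the `FileWatcher` class is hypothetical, and the gateway may use a different mechanism (for example, OS-level file notifications).

```python
import os
from typing import Callable, Optional, Tuple


class FileWatcher:
    """Call `on_change(path)` when the file's mtime or size changes."""

    def __init__(self, path: str, on_change: Callable[[str], None]) -> None:
        self.path = path
        self.on_change = on_change
        self._last: Optional[Tuple[int, int]] = self._stamp()

    def _stamp(self) -> Optional[Tuple[int, int]]:
        try:
            st = os.stat(self.path)
        except FileNotFoundError:
            return None  # file missing, e.g. mid-rename by an editor
        return (st.st_mtime_ns, st.st_size)

    def poll(self) -> bool:
        """Check once; return True if a reload was triggered."""
        stamp = self._stamp()
        if stamp is None or stamp == self._last:
            return False
        self._last = stamp
        self.on_change(self.path)
        return True
```

A caller would invoke `poll()` on a timer, for example every second or two, and pass a callback that re-parses the file and swaps in the new configuration only if parsing succeeds.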
Admin password, admin session secret, and master API keys are stored separately from config.yaml.