AI Infrastructure · 2024
LLM Gateway
Production AI Infrastructure with Observability & Guardrails
My role
AI Infrastructure Engineer
Built to solve the chaos of working with multiple LLM providers in production. The gateway provides a unified API surface, multi-model routing (GPT-4, Claude, Gemini), and full observability into every prompt and completion.
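The routing layer can be sketched as an alias-to-provider fallback table: a request names a friendly alias, and the gateway tries each underlying model in order through LiteLLM's unified completion interface. This is a minimal illustration, not the project's actual API; the alias names, route table, and `route_completion` helper are assumptions, and `call` stands in for `litellm.completion`.

```python
from typing import Callable

# Hypothetical route table: friendly alias -> fallback-ordered LiteLLM model ids.
MODEL_ROUTES: dict[str, list[str]] = {
    "smart": ["gpt-4", "claude-3-opus-20240229", "gemini/gemini-1.5-pro"],
    "fast": ["gpt-4o-mini", "claude-3-haiku-20240307", "gemini/gemini-1.5-flash"],
}

def route_completion(alias: str, messages: list[dict], call: Callable):
    """Try each provider in the alias's fallback list until one succeeds.

    In the real gateway, `call` would be litellm.completion, which accepts the
    same (model, messages) keyword arguments for every provider.
    """
    if alias not in MODEL_ROUTES:
        raise KeyError(f"unknown model alias: {alias!r}")
    last_err = None
    for model in MODEL_ROUTES[alias]:
        try:
            return call(model=model, messages=messages)
        except Exception as err:  # provider outage, rate limit, auth error, ...
            last_err = err
    raise RuntimeError(f"all providers failed for alias {alias!r}") from last_err
```

Keeping the fallback order in data rather than code means new providers can be added, or routes reordered, without touching the request path.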
Key features include content moderation pipelines, input/output guardrails, per-model cost tracking, latency dashboards, and experiment tooling to A/B test model outputs. Langfuse integration gives full trace-level visibility into every LLM call.
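An input guardrail of the kind described above can be as simple as a redaction-and-denylist pass that runs before the prompt leaves the gateway. The patterns, denylist terms, and function name below are illustrative assumptions, a minimal sketch rather than the project's actual moderation pipeline.

```python
import re

# Illustrative PII patterns; production pipelines would use a richer set.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")
DENYLIST = ("ignore previous instructions",)  # placeholder policy terms

def apply_input_guardrails(prompt: str) -> str:
    """Block denylisted prompts, then redact obvious PII before forwarding."""
    lowered = prompt.lower()
    if any(term in lowered for term in DENYLIST):
        raise ValueError("prompt blocked by content policy")
    prompt = EMAIL_RE.sub("[REDACTED_EMAIL]", prompt)
    prompt = PHONE_RE.sub("[REDACTED_PHONE]", prompt)
    return prompt
```

Running the same kind of pass over completions gives the output half of the guardrail pair, and both results can be attached to the Langfuse trace for audit.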
Key highlights
- Multi-model routing: OpenAI, Gemini, Claude via LiteLLM
- Full observability with Langfuse — traces, cost, latency
- Input/output guardrails and content moderation pipelines
- Model comparison tooling for quality evaluation
- Centralized prompt management and versioning
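The per-model cost tracking highlighted above boils down to multiplying token counts by a per-provider price table. A minimal sketch, assuming a static USD-per-million-token table; the prices below are placeholders, not current provider rates:

```python
# Placeholder pricing: USD per 1M tokens, split by input/output.
PRICE_PER_1M = {
    "gpt-4": {"input": 30.00, "output": 60.00},
    "claude-3-opus-20240229": {"input": 15.00, "output": 75.00},
    "gemini/gemini-1.5-pro": {"input": 3.50, "output": 10.50},
}

def completion_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one call from its token usage."""
    p = PRICE_PER_1M[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000
```

Aggregating these per-call costs by model (or by caller) is what feeds the cost dashboards; LiteLLM and Langfuse both report token usage per call, so the gateway never has to count tokens itself.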
Tech stack
LiteLLM · Langfuse · OpenAI API · Gemini · NestJS · Docker · AWS