AI Infrastructure · 2024

LLM Gateway

Production AI Infrastructure with Observability & Guardrails

My role
AI Infrastructure Engineer

Built to solve the chaos of working with multiple LLM providers in production. The gateway provides a unified API surface, multi-model routing (GPT-4, Claude, Gemini), and full observability into every prompt and completion.
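The routing step can be pictured as a small dispatch table: the gateway accepts one model name from the client and resolves it to the provider that serves it. The sketch below is illustrative only (the real routing is delegated to LiteLLM); the names and patterns are assumptions, not the project's actual code.

```typescript
// Hypothetical sketch of multi-model routing: resolve an incoming
// model name (e.g. "gpt-4", "claude-3-opus", "gemini-pro") to the
// backend provider that should handle the request.
type Provider = "openai" | "anthropic" | "google";

const MODEL_PROVIDERS: Array<[RegExp, Provider]> = [
  [/^gpt-/, "openai"],
  [/^claude-/, "anthropic"],
  [/^gemini-/, "google"],
];

function resolveProvider(model: string): Provider {
  for (const [pattern, provider] of MODEL_PROVIDERS) {
    if (pattern.test(model)) return provider;
  }
  // Fail loudly on unknown models rather than silently defaulting.
  throw new Error(`No provider registered for model "${model}"`);
}
```

Keeping this mapping in one place is what lets callers use a single API surface while models are swapped or added behind it.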

Key features include content moderation pipelines, input/output guardrails, per-model cost tracking, latency dashboards, and experiment tooling to A/B test model outputs. Langfuse integration gives full trace-level visibility into every LLM call.
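An input guardrail of the kind described above can be sketched as a pure function that inspects the prompt before it reaches any provider. This is a minimal illustration under assumed policy rules (the blocked terms and the email-redaction pattern are hypothetical, not the gateway's real moderation pipeline).

```typescript
// Hypothetical input guardrail: flag prompts containing disallowed
// terms and redact obvious PII (here, email addresses) before the
// text is forwarded to an LLM provider.
interface GuardrailResult {
  allowed: boolean;      // false if any blocked term was found
  sanitized: string;     // prompt with PII redacted
  violations: string[];  // which blocked terms matched
}

const BLOCKED_TERMS = ["social security number", "credit card number"];
const EMAIL_RE = /[\w.+-]+@[\w-]+\.[\w.]+/g;

function applyInputGuardrail(prompt: string): GuardrailResult {
  const lower = prompt.toLowerCase();
  const violations = BLOCKED_TERMS.filter((t) => lower.includes(t));
  const sanitized = prompt.replace(EMAIL_RE, "[REDACTED_EMAIL]");
  return { allowed: violations.length === 0, sanitized, violations };
}
```

Output guardrails follow the same shape, run on the completion instead of the prompt, which keeps both checks composable into a single pipeline.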

Key highlights

  • Multi-model routing: OpenAI, Gemini, Claude via LiteLLM
  • Full observability with Langfuse — traces, cost, latency
  • Input/output guardrails and content moderation pipelines
  • Model comparison tooling for quality evaluation
  • Centralized prompt management and versioning

Tech stack

LiteLLM · Langfuse · OpenAI API · Gemini · NestJS · Docker · AWS