lunar
MCP gateway focused on governance and security: policy enforcement, request inspection, and rate-limiting between agents and MCP servers. Sits between the model and the tool surface.
Lunar is what happens when you take “API gateway” seriously for the AI workload shape - agents calling external APIs, LLM endpoints, and MCP servers, all of which need to be governed, rate-limited, observed, and cost-tracked centrally. The framing the project leads with is the right one: as AI agents and autonomous workflows lean more on external APIs, you need a mediation layer that aggregates applications, agents, and the services they depend on into a single control point.
Two components ship in the same repo, designed to compose:
- Lunar Proxy - the core API gateway and control layer for outbound HTTP traffic.
- Lunar MCPX - a zero-code aggregator for multiple MCP servers, exposing them through a single unified gateway.
Pick the one that matches your problem, or run both.
What the proxy does on a single request
The capabilities the README highlights, framed by what they actually buy you:
- Live API traffic visibility - real-time metrics on latency, errors, cost, and token usage across all outbound traffic, including LLM and agent calls. The “what is this team spending and where” question with a real answer.
- AI-aware policy enforcement - control tool access, throttle agent actions, govern agentic traffic with fine-grained rules. The piece that lets you say “this agent can call OpenAI but not Anthropic” or “this user’s keys can’t make purchases.”
- Advanced traffic shaping - rate limits, retries, priority queues, circuit breakers. The standard gateway feature set, applied to outbound (not inbound) traffic.
- Cost & performance optimization - identify waste, smooth traffic peaks, reduce overuse of costly APIs.
- Centralized MCP aggregation - consolidate multiple MCP servers behind one gateway with unified security, observability, and management.
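To make the policy bullet concrete, here is a minimal sketch of a per-agent allowlist combined with a token-bucket throttle. It is not Lunar's configuration format or API; `AgentPolicy`, `TokenBucket`, `decide`, and the agent and host names are all hypothetical, chosen only to illustrate the "this agent can call OpenAI but not Anthropic" rule and the throttling of agent actions.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Optional


@dataclass
class TokenBucket:
    """Allows bursts of up to `capacity` calls, refilled at `refill_per_sec`."""
    capacity: float
    refill_per_sec: float
    tokens: float = field(init=False)
    last: Optional[float] = field(default=None, init=False)

    def __post_init__(self) -> None:
        self.tokens = self.capacity  # start with a full bucket

    def try_take(self, now: float) -> bool:
        # Refill based on the time elapsed since the previous call.
        if self.last is not None:
            refill = (now - self.last) * self.refill_per_sec
            self.tokens = min(self.capacity, self.tokens + refill)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


@dataclass
class AgentPolicy:
    allowed_hosts: FrozenSet[str]  # hypothetical: hosts this agent may reach
    bucket: TokenBucket            # hypothetical: per-agent call budget


def decide(policies: Dict[str, AgentPolicy], agent: str, host: str, now: float) -> str:
    """Return 'allow', 'deny', or 'throttle' for one outbound call."""
    policy = policies.get(agent)
    if policy is None or host not in policy.allowed_hosts:
        return "deny"      # unknown agent or host outside the allowlist
    if not policy.bucket.try_take(now):
        return "throttle"  # allowed in principle, but over budget right now
    return "allow"
```

The clock is passed in as `now` so the decision is easy to test; a real gateway would read a monotonic clock at the point of enforcement.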
The mental model that distinguishes Lunar from a traditional gateway: most gateways sit in front of your API, mediating inbound traffic. Lunar sits in front of your apps and agents, mediating their outbound traffic to third parties. That inversion is what makes it the right shape for AI workloads, where the costly, observable, governable thing is the calls out to model providers and tool APIs, not the calls in to your service.
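Because every outbound call passes through one point, the "what is this team spending and where" question reduces to aggregating per-call records. The sketch below shows that aggregation under stated assumptions: the `CallRecord` fields, the provider names, and the prices in `PRICE_PER_1K` are hypothetical placeholders, not Lunar's data model or real provider pricing.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable

# Hypothetical prices in USD per 1,000 tokens; not real provider pricing.
PRICE_PER_1K = {"provider-a": 0.50, "provider-b": 2.00}


@dataclass(frozen=True)
class CallRecord:
    team: str         # who made the call
    provider: str     # which upstream it went to
    tokens: int       # tokens consumed by the call
    latency_ms: float
    ok: bool          # False if the upstream returned an error


def summarize(records: Iterable[CallRecord]) -> Dict[str, dict]:
    """Roll per-call records up into per-team calls, errors, tokens, and cost."""
    out = defaultdict(lambda: {"calls": 0, "errors": 0, "tokens": 0, "cost_usd": 0.0})
    for r in records:
        row = out[r.team]
        row["calls"] += 1
        row["errors"] += 0 if r.ok else 1
        row["tokens"] += r.tokens
        row["cost_usd"] += r.tokens / 1000 * PRICE_PER_1K[r.provider]
    return dict(out)
```

The same roll-up keyed by agent or by API key gives the other attribution views; the point is that the records only exist in one place if the traffic does.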
When to use Lunar Proxy vs Lunar MCPX
The split is about what you’re putting in front of:
- Lunar Proxy - generic outbound API traffic. LLM providers (OpenAI, Anthropic, Gemini), data APIs, third-party services. If your agents are calling REST endpoints, this is the layer.
- Lunar MCPX - MCP servers specifically. Aggregates multiple MCP servers behind one endpoint with namespace isolation, so your agents see one MCP gateway instead of N separate ones to configure.
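Namespace isolation is easiest to see in code. The following is a toy aggregator, not MCPX itself: the `Aggregator` class, the `__` separator, and the server and tool names are assumptions made for illustration, and the source does not say how MCPX actually qualifies tool names. It shows why two servers can both expose a tool called `search` without colliding behind a single endpoint.

```python
from typing import Callable, Dict, List

Tool = Callable[..., object]


class Aggregator:
    """Exposes tools from several servers under server-qualified names."""

    SEP = "__"  # hypothetical separator between server name and tool name

    def __init__(self) -> None:
        self._tools: Dict[str, Tool] = {}

    def register(self, server: str, tools: Dict[str, Tool]) -> None:
        # Prefix every tool with its server name so names cannot collide.
        for name, fn in tools.items():
            self._tools[f"{server}{self.SEP}{name}"] = fn

    def list_tools(self) -> List[str]:
        # The single unified listing an agent would see.
        return sorted(self._tools)

    def call(self, qualified_name: str, **kwargs: object) -> object:
        # Route the call to whichever server owns the qualified name.
        try:
            fn = self._tools[qualified_name]
        except KeyError:
            raise LookupError(f"unknown tool: {qualified_name}") from None
        return fn(**kwargs)
```

An agent configured against this one aggregator sees `docs__search` and `tickets__search` as distinct tools, which is the "one gateway instead of N" experience in miniature.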
If you’re running both LLM traffic and a fleet of MCP servers, you’ll likely want both pieces. They’re separate components in the same repo, so running both mostly means understanding two pieces rather than adopting a second project.
Where this fits in the gateway landscape
Lunar overlaps with ThinkWatch and mcpm.sh, and the differences are worth understanding:
- ThinkWatch is enterprise-shaped: dual-port architecture (gateway + admin console), 5-tier RBAC, AES-256-GCM at rest, ClickHouse audit logs, soft-delete with 30-day purge. Heavy stack (PostgreSQL + Redis + ClickHouse), justified at scale.
- mcpm.sh is the package manager: search, install, manage. Not in the data path at runtime.
- Lunar sits between these. It’s an actual gateway (in the data path), but the default deployment is lighter than ThinkWatch and the focus is more agentic-traffic-and-MCP than enterprise compliance.
The right choice depends on what you’re actually solving. If you need SOC 2-shaped audit logs and SSO/OIDC, ThinkWatch is the clean fit. If you’re an AI-native team that wants observability and policy on outbound traffic without standing up a PostgreSQL + Redis + ClickHouse stack, Lunar is closer to the operational shape you want.
Open source vs production
The project is MIT-licensed at the core - free for non-production and personal use - and the Proxy and MCPX repos are public on GitHub. The company offers guided onboarding and platform tiers for production deployments through lunar.dev.
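That’s a normal open-core split with a paid tier on top. For evaluation and prototyping, the open-source path is enough. Note that “free for non-production and personal use” is narrower than what an MIT license alone grants, so check the license files in the repo before running the open-source build in production. For organisations that need the SLA, the support, and the enterprise features, the paid tier is the on-ramp.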
When to reach for it
- AI workloads where agents are making expensive outbound calls to multiple model providers and you need cost attribution.
- Teams running multiple MCP servers and tired of configuring each one separately in every client.
- Production deployments that need traffic-shaping primitives (circuit breakers, retries, priority queues) for outbound traffic.
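For readers who have not used the traffic-shaping primitives named above, here is a minimal circuit breaker for outbound calls. It is a generic illustration of the pattern, not Lunar's implementation; `CircuitBreaker`, `CircuitOpenError`, and the threshold and cool-down parameters are hypothetical names and defaults.

```python
import time
from typing import Callable, Optional


class CircuitOpenError(RuntimeError):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        reset_after: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_after = reset_after              # cool-down in seconds
        self.clock = clock                          # injectable for testing
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, fn: Callable[..., object], *args: object, **kwargs: object) -> object:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Still cooling down: fail fast without touching the upstream.
                raise CircuitOpenError("circuit open; skipping upstream call")
            # Cool-down elapsed: half-open, let one trial call through.
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many consecutive failures, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Wrapping each upstream in its own breaker means a failing provider is skipped quickly instead of tying up agents on calls that will time out anyway.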
When not to
- Solo developers or hobby projects. The setup overhead isn’t justified by the workload.
- Workloads where you specifically need a model-format-converting gateway that speaks both OpenAI and Anthropic-style APIs natively. ThinkWatch is more direct for that pattern.
- Air-gapped environments. Lunar is designed to manage outbound traffic; if you have no outbound traffic, you have no problem to solve here.
What to read after the README
The actual configuration surface lives in the proxy and mcpx subdirectories of the repo, plus the docs at docs.lunar.dev. The README itself is light on operational specifics - the gateway and aggregator are real pieces of software, but you’ll want to walk through the docs site before evaluating against ThinkWatch or building your own. The 2-3 minute pitch is in the README; the 30-minute “is this the right tool” answer is in the docs and the example deployments.
MIT licensed at the core. Built and maintained by The Lunar Company.
Similar tools
- trace-mcp
MCP server with 138 tools and cross-language framework awareness (58 integrations across 81 languages). Indexes Laravel/Inertia/Vue, Rails/Hotwire, Django/HTMX edges so agents skip re-deriving call graphs. Decision memory links architectural choices to the code they're about. Local-first ONNX embeddings, optional LSP enrichment.
- Claude Code Analysis
82 docs and 15 diagrams mapping every major subsystem of Claude Code's accidentally exposed 512K-line TypeScript source - YOLO classifier, 93% context compaction, prompt-cache layout, 88+ feature flags, the custom React-Fiber terminal renderer.
- Claudraband
Wraps the real Claude Code TUI with a session lifecycle layer. Resumable non-interactive workflows, HTTP daemon for remote/headless control, ACP server for editor integrations (Zed, Toad). Drives your existing Claude Code install rather than reimplementing it - keeps skills, hooks, MCPs, and approvals intact.
- mcptube
MCP server that turns YouTube videos into a persistent, merging wiki rather than ephemeral vector chunks. Scene-change frame extraction + vision analysis captures slides, code, and diagrams that transcripts miss. 25+ MCP tools, FTS5+LLM hybrid retrieval, version history with source attribution per claim.