OBTO
Radical Transparency

Two dimensions.
Zero surprises.

Pay a flat base fee for the platform, then only for what your agents actually consume. Bring your own AI client and pay per application — not per token.

How billing works

1. Using OBTO inference

Run AI workflows through OBTO's hosted models. You pay a base platform fee that includes a monthly token allowance, then metered billing on usage above that allowance. Powered by Groq GPT-OSS 120B — $0.15 input / $0.60 output per million tokens, passed through at cost + 20%.

Your rate: $0.18 / M input · $0.72 / M output · Cached input: $0.09 / M
2. Bring your own AI client

Connect Claude Desktop, ChatGPT, Cursor, or any MCP-compatible client to your OBTO endpoint. You keep your existing LLM subscription — we charge per deployed application, not per token. Each app gets its own MCP server, rate limits, and Glass Box audit trail.

Clients supported: Claude Desktop · ChatGPT · Cursor · VS Code · any MCP client

Track 1: Rent the Engine

The raw Glass Box platform. Start free, scale predictably on utility billing.

Builder
Prove the concept
$0/mo
No credit card required
Applications
  • 1 application
  • 2,000 API calls / week
  • 1 MCP server endpoint
Inference
  • 1M tokens / mo included
  • No overage (upgrade to add)
Platform
  • Basic Glass Box tracing
  • 1 team member
  • Community support
Start free
Most popular
Team
Scale your agents
$49/mo base
+ metered usage above threshold
Applications
  • 5 applications included
  • +$8 / additional app / mo
  • 50,000 API calls / week / app
Inference
  • 5M tokens / mo included
  • +$0.38 / M tokens over limit
Platform
  • Standard Glass Box tracing
  • SSO, roles & audit trails
  • Up to 3 team members
  • Email support
Get Team
Business
Production at scale
$149/mo base
+ metered usage above threshold
Applications
  • 20 applications included
  • +$6 / additional app / mo
  • Unlimited API calls (fair use)
Inference
  • 25M tokens / mo included
  • +$0.35 / M tokens over limit
Platform
  • Advanced Glass Box tracing
  • SSO, roles & audit trails
  • Up to 15 team members
  • Priority support + SLA
Get Business
Enterprise
Total architectural control
Custom/mo
Volume discounts available
Applications
  • Unlimited applications
  • Unlimited API calls
Inference
  • Negotiated token pricing
  • Bring your own models (BYOM)
Platform
  • Full Glass Box audit suite
  • Self-hosted on your K8s
  • Custom MCP integrations
  • Unlimited team members
  • Dedicated support + SLA
Contact sales
Inference rate card — Groq GPT-OSS 120B + 20% platform fee
$0.18
per M input tokens
$0.72
per M output tokens
$0.09
per M cached input tokens

No markup beyond the transparent 20% platform fee. You can verify the base rate at groq.com/pricing at any time. That's what Glass Box means.
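The cost + 20% pass-through can be checked with a few lines of arithmetic. A minimal sketch — the cached-input base rate of $0.075/M is inferred from the $0.09 figure above, not quoted on this page:

```python
# Groq GPT-OSS 120B base rates in USD per million tokens (from the rate card above).
GROQ_BASE = {"input": 0.15, "output": 0.60, "cached_input": 0.075}  # cached base inferred
PLATFORM_FEE = 0.20  # OBTO's flat 20% pass-through margin

def obto_rate(kind: str) -> float:
    """Return the OBTO per-million-token rate for a given token kind."""
    return round(GROQ_BASE[kind] * (1 + PLATFORM_FEE), 4)

print(obto_rate("input"))         # 0.18
print(obto_rate("output"))        # 0.72
print(obto_rate("cached_input"))  # 0.09
```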

BYO AI Client

Track 2: Bring Your Own Model

Already paying for Claude, ChatGPT, or Cursor? Connect them to OBTO's MCP endpoint and use OBTO as your deployment infrastructure — not your inference provider. You keep your existing AI subscription. We charge per application hosted.

How it works
  1. Sign in to OBTO and get your personal MCP endpoint URL
  2. Add it to your AI client (Claude Desktop, Cursor, VS Code, etc.)
  3. Ask your AI to build and deploy — OBTO handles the infrastructure
  4. Your app goes live at yourapp.obto.co
What you pay

Applications count against your plan's app limit. Rate limits apply per app per week. No token charges — OBTO doesn't touch your LLM inference.

Builder (free) 1 app · 2K calls/wk
Team 5 apps · 50K calls/wk
Business 20 apps · unlimited
Enterprise Unlimited · custom
Compatible clients
  • Claude Desktop & Claude Web
  • ChatGPT (with MCP support)
  • Cursor & VS Code (Copilot)
  • OpenAI Codex CLI
  • Any MCP-compatible client
Get your MCP endpoint →
Co-Building Services

Track 3: Hire the Drivers

Most dev agencies charge you for 40 hours of manual coding. We don't. Our team builds using the OBTO AI platform — so we execute 10x faster. You pay for expert architecture and rapid assembly, not slow typing.

The Sprint Block (Weekly)

Perfect for launching an MVP. Buy a dedicated block of hours where our team sits with you, uses our platform, and ships your app in days — not months.

Calculate my Sprint →

The Co-Pilot (Monthly Retainer)

A dedicated OBTO expert acts as your fractional CTO — helping you refine workflows, integrate complex APIs, and build MCP servers behind the scenes.

Discuss a Retainer →

Frequently asked questions

What counts as an "application"?

One deployed OBTO app with its own MCP endpoint, domain, and backend. Each app can have multiple pages, routes, and server scripts. The limit is per deployed app, not per page or feature.

How does the base + metered model work?

You pay the flat base fee regardless of usage — that covers the platform, your included token allowance, and your app slots. If you consume more tokens than your plan includes, the overage rate is billed at the end of the month. No surprise spikes — your Glass Receipt shows usage in real time.
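As a sketch, the Team plan's base + metered math works out like this (numbers taken from the plan cards above; the function name is illustrative, not an OBTO API):

```python
def monthly_bill(tokens_used_m: float, base_fee: float = 49.0,
                 included_m: float = 5.0, overage_per_m: float = 0.38) -> float:
    """Estimate a Team-plan monthly bill: a flat base fee plus
    metered overage on tokens above the included allowance."""
    overage_m = max(0.0, tokens_used_m - included_m)
    return round(base_fee + overage_m * overage_per_m, 2)

print(monthly_bill(3))   # under the 5M allowance: base fee only, 49.0
print(monthly_bill(8))   # 3M over: 49 + 3 * 0.38 = 50.14
```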

If I use my own Claude or ChatGPT, do I pay for tokens?

No. When you connect a BYO AI client via MCP, OBTO doesn't route your inference — your client talks to its own LLM provider. OBTO only charges for the application hosting, rate limits, and infrastructure. Your token bill stays with Anthropic, OpenAI, or whoever you're using.

Why Groq GPT-OSS 120B specifically?

It's the best price-performance model for agentic workloads right now — 500 tokens/sec, 128K context, full tool-calling support, and the lowest cost per capable token on the market. We pass the rate through at cost + 20% with no hidden markup. You can verify the base rate at groq.com/pricing.

Can I self-host to avoid the platform fee entirely?

Yes — that's what Enterprise is for. You run the entire OBTO runtime on your own Kubernetes cluster. You still need an Enterprise agreement for support and updates, but you're not paying per-token or per-app to us.

What's the API call rate limit exactly?

Rate limits apply per application per week — not per user or per account. Builder: 2,000 calls/week. Team: 50,000 calls/week per app. Business: fair use (no hard limit, but we reserve the right to throttle runaway loads). Enterprise: unlimited with dedicated infrastructure.
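A per-app weekly quota like the one described can be sketched as a counter keyed by app and ISO week. This is purely illustrative — OBTO's actual enforcement mechanism isn't documented here:

```python
from collections import defaultdict
from datetime import date

# Calls per week per app, from the plan table above.
PLAN_WEEKLY_LIMITS = {"builder": 2_000, "team": 50_000}
_counts: dict[tuple[str, tuple[int, int]], int] = defaultdict(int)

def allow_call(app_id: str, plan: str, today: date) -> bool:
    """Count one API call against the app's weekly quota; False once exhausted."""
    limit = PLAN_WEEKLY_LIMITS.get(plan)
    if limit is None:  # Business/Enterprise: fair use, no hard cap
        return True
    iso = today.isocalendar()
    key = (app_id, (iso[0], iso[1]))  # quota window = ISO year + week number
    if _counts[key] >= limit:
        return False
    _counts[key] += 1
    return True
```

Note the key point from the FAQ: the counter is per application, so a second app on the same account gets its own independent window.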

Ready to kill the SaaS tax?

Start building on the platform yourself, or let our experts help you map out your first architecture sprint.