
Founder & Front-end Developer · 2024 — Present

PixCode

An OpenAI-compatible gateway routing a single API key to 300+ LLMs — GPT-5, Claude, Gemini, DeepSeek — with pay-as-you-go billing.

300+ · Supported models across all major providers
1 · API key to access them all
$0 · Minimum spend / base fees
100% · OpenAI SDK compatible (drop-in)

// Problem

Building LLM products means juggling keys for OpenAI, Anthropic, Google, DeepSeek, Mistral — each with its own SDK, rate limits, pricing, and reliability profile. Teams end up writing fallback logic they never wanted to own, or getting locked into one vendor's console. Indie devs and small teams in particular lack the leverage to negotiate enterprise pricing anywhere.

// Approach

PixCode is a single drop-in OpenAI-compatible endpoint that fronts 300+ models. Point your existing openai SDK at our URL, swap the model string, pay one unified bill. Routing is rule-based (latency, cost, reliability) with automatic fallback if a provider returns errors. Usage is metered per request, streamed to Postgres, and settled against a Stripe prepaid balance — no subscriptions, no commits.
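The drop-in claim boils down to reusing the OpenAI request shape against a different base URL. A minimal TypeScript sketch — the gateway URL and key here are placeholders, not PixCode's real values:

```typescript
// Build an OpenAI-style chat completion request against a different
// base URL. Only the URL and the model string change; the payload
// shape is exactly what the OpenAI API expects.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

type ChatRequest = {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
};

function buildChatRequest(
  baseURL: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[]
): ChatRequest {
  return {
    url: `${baseURL}/v1/chat/completions`, // same path the OpenAI SDK uses
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`, // one key for every provider
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

// Swapping providers is just a different model string:
const req = buildChatRequest(
  "https://api.pixcode.example", // placeholder gateway URL
  "PIXCODE_API_KEY",
  "deepseek/deepseek-chat",
  [{ role: "user", content: "Hello" }]
);
// await fetch(req.url, req.init)  // streaming (SSE) goes over the same call
```

With the official `openai` SDK, the same switch is just passing `baseURL` and `apiKey` to the client constructor; nothing else in the calling code changes.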

Context

I was shipping side projects, and every new LLM provider meant another dashboard, another invoice, another SDK. The idea: what if it all looked like OpenAI from the client's perspective, and the routing happened somewhere else?

What I built

A Next.js edge-runtime proxy that speaks the OpenAI `v1/chat/completions` dialect and translates, per request, to whatever provider the user selects. Streaming works end-to-end (SSE tunnelled through edge functions). Every request is logged to Postgres with token counts, latency, and cost; users get a live dashboard of spend. Prepaid billing runs on Stripe — top up any amount, pay cents per 1K tokens, no lock-in.
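The per-request translation can be pictured as a small adapter lookup keyed on the provider prefix of the model string. A hedged sketch with illustrative names — not PixCode's actual adapter modules:

```typescript
// Route an OpenAI-style model string like "anthropic/claude-..." to the
// right upstream API and rewrite the auth header to that provider's scheme.
type Upstream = {
  baseURL: string;
  authHeader: (key: string) => Record<string, string>;
};

// Two adapters shown; the real layer would have one thin module per provider.
const upstreams: Record<string, Upstream> = {
  openai: {
    baseURL: "https://api.openai.com/v1",
    authHeader: (k) => ({ Authorization: `Bearer ${k}` }),
  },
  anthropic: {
    baseURL: "https://api.anthropic.com/v1",
    authHeader: (k) => ({ "x-api-key": k }), // Anthropic uses x-api-key
  },
};

function resolveUpstream(model: string): { upstream: Upstream; model: string } {
  const [provider, ...rest] = model.split("/");
  const upstream = upstreams[provider];
  if (!upstream) throw new Error(`unknown provider: ${provider}`);
  // Strip the prefix: the upstream sees only its own model name.
  return { upstream, model: rest.join("/") };
}
```

Because every adapter returns the same `{ baseURL, authHeader }` shape, adding a provider means registering one more entry — consistent with the "~30 lines per model" claim.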

Decisions worth calling out

The edge runtime keeps added latency under 150 ms globally. The provider adapter layer is a thin set of TypeScript modules — adding a new model takes ~30 lines. Fallback chains are expressed declaratively ("prefer GPT-5, fall back to Sonnet 4.6 on error") and execute in-memory, so a single user request never hits the gateway twice. Rate limiting runs on Upstash Redis with per-key sliding windows.
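An in-memory fallback chain of the kind described ("prefer GPT-5, fall back to Sonnet 4.6 on error") can be sketched in a few lines; `withFallback` and the chain shape are assumptions for illustration, not PixCode internals:

```typescript
// Try each provider in order; on error, fall through to the next.
// The chain is just an ordered array, so the declarative config
// ("prefer X, fall back to Y") maps directly onto it.
type Provider<T> = () => Promise<T>;

async function withFallback<T>(chain: Provider<T>[]): Promise<T> {
  let lastErr: unknown;
  for (const attempt of chain) {
    try {
      return await attempt(); // first success wins
    } catch (err) {
      lastErr = err; // provider errored; move to the next in the chain
    }
  }
  throw lastErr; // every provider failed: surface the last error
}

// Usage sketch: callGpt5 / callSonnet would wrap the adapter layer.
// const reply = await withFallback([callGpt5, callSonnet]);
```

Executing the chain inside one request handler is what keeps the failure path invisible to the client: the user makes a single call and never has to retry.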

Outcome

PixCode runs in production serving indie developers, agencies, and a handful of Chinese teams that can't reach OpenAI directly. It broke even on infrastructure in month two; SDK compatibility means onboarding a new customer is one `baseURL` change in their existing code.

Published April 15, 2026


© 2026 Felix Yu