Token-as-a-Service is the sovereign LLM inference platform by CloudSigma. Access 33 AI models — chat, vision, code, audio, embeddings — through a single OpenAI-compatible API. Data never leaves your jurisdiction.
A complete AI inference platform — not just a proxy. Multi-tenant, white-label ready, with billing, admin controls, and sovereign infrastructure built in.
Data never leaves your jurisdiction. Run AI inference on CloudSigma's sovereign cloud infrastructure — GDPR compliant, ISO certified, independent from US hyperscalers.
From GPT-5.4 Codex and Claude Opus 4.6 to DeepSeek, Qwen, GLM, Kimi, MiniMax — switch models by changing a string. No separate accounts, no SDK changes.
Drop-in replacement for any OpenAI integration. Streaming, function calling, vision, extended thinking — all standard. Works with every SDK and framework.
Speech-to-text (Whisper), text-to-speech (Kokoro, F5), speaker identification, diarization, and audio understanding — all through the same API.
Per-token metering with real-time cost tracking in every response. Auto-topup, credit limits, monthly budgets, CloudSigma account linking, and Stripe integration.
Multi-tenant with organisations, domain-based model control, admin panel, user management, and invite system. Resell AI under your brand with zero infra work.
From $0.06/M tokens for high-speed routing to $75/M for frontier reasoning. Mix and match across providers — one bill, one key.
| Model | Input / 1M | Output / 1M | Context | Capabilities |
|---|---|---|---|---|
| claude-opus-4.6 | $15.00 | $75.00 | 200K | Vision Thinking |
| claude-opus-4 | $15.00 | $75.00 | 200K | Vision Thinking |
| claude-sonnet-4.6 | $3.00 | $15.00 | 200K | Vision Thinking |
| claude-sonnet-4 | $3.00 | $15.00 | 200K | Vision Thinking |
| gpt-5.4-codex | $2.50 | $15.00 | 1050K | New |
| gpt-5.3-codex | $1.75 | $14.00 | 400K | |
| gpt-5.2-codex | $1.75 | $14.00 | 256K | |
| glm-5 | $0.80 | $2.56 | 203K | |
| minimax-m2.5 | $0.30 | $1.20 | 197K | |
| minimax-m2 | $0.30 | $1.20 | 197K | |
| kimi-k2 | $0.20 | $0.40 | 131K | |
| qwen3-vl | $0.15 | $0.60 | 262K | Vision |
| deepseek-chat | $0.14 | $0.28 | 64K | |
| deepseek-v3 | $0.14 | $0.28 | 64K | |
| qwen-72b | $0.12 | $0.39 | 33K | |
| qwen3-30b | $0.10 | $0.30 | 131K | New |
| qwen-coder-32b | $0.08 | $0.28 | 41K | |
| glm-4-flash | $0.06 | $0.40 | 203K | |
| deepseek-r1-7b | — | — | 64K | Reasoning |
| Model | Type | Use Case |
|---|---|---|
| whisper / whisper-1 | Speech-to-Text | Transcription, subtitles, voice input |
| kokoro | Text-to-Speech | Natural voice generation |
| f5-tts | Text-to-Speech | Voice cloning & synthesis |
| bge-m3 | Embeddings (1024d) | Semantic search, RAG pipelines |
| bge-reranker-v2-m3 | Reranking | Search result re-scoring |
| ecapa-tdnn, cam++, resnet293, xvector, wavlm-base-plus-sv | Speaker ID | Voiceprint matching & verification |
| clap, ast, mert | Audio Understanding | Audio classification, music analysis |
Combine chat, voice, embeddings, and understanding models — no third-party APIs needed.
Voice AI agent. Call center automation. Podcast intelligence. Medical transcription. Legal document analysis. Multilingual customer support. All from one API.
One gateway. Multiple providers. Transparent routing, metering, and billing.
From government compliance to startup MVPs — TaaS adapts to your requirements.
Deploy AI assistants that process citizen data within national borders. GDPR-compliant, no US hyperscaler dependency, full audit trail.
Analyze contracts, regulatory filings, and financial reports. Embed, search, and reason over sensitive data — all in-jurisdiction.
Offer AI APIs under your brand. Set per-customer budgets, control model access by domain, earn revenue share — zero infrastructure investment.
Compare DeepSeek ($0.14/M) vs. Claude ($3/M) vs. GPT ($1.75/M) with identical API calls. Optimize quality vs. cost without managing multiple accounts.
Transcribe consultations, diarize doctor vs. patient, generate structured clinical notes. Biometric voice data stays within sovereign jurisdiction.
Give researchers and students access to frontier AI. Set monthly budgets per department, restrict model access by role, track usage in real time.
No subscriptions. No commitments. Token-level metering with transparent per-model pricing.
Sign up in 30 seconds. Get your API key. Ship.