Available API Models

All models listed here are accessible via the REST API with API key authentication. Prices and availability are fetched live from the backend.

How to use these models

Pass the Model ID as the model parameter in your API requests. See Quick Start below.

Pricing Notice

Pricing is subject to change at our discretion. Use the /v1/models endpoint for current pricing.

Model Categories

Reasoning & Problem Solving

  • qwen3-8b: Lightweight reasoning, extremely cost-efficient (CHF 0.04 per 1M input tokens) [API only]
  • gpt-oss-120b: Reasoning and agentic tasks, great value (CHF 0.16 per 1M input tokens) [All]
  • deepseek-r1-70b: Mid-range reasoning with function calling [API only]
  • qwq-32b: Balanced reasoning with 32K context [API only]
  • inference-glm45-air-110b: Hybrid reasoning, configurable thinking mode, strong tool use [API only]
  • deepseek-r1-670b: Maximum reasoning capability, 65K context [API only]
  • inference-deepseek-v32: Extended reasoning with streaming [API only]
  • Qwen/Qwen3-VL-235B-A22B-Instruct: 262K context, function calling, reasoning [API only]
  • inference-qwen3-vl-235b: Vision, reasoning, and function calling [All]
  • moonshotai/Kimi-K2.5: 256K context, full multimodal, agentic tasks [All]

Vision & Document Analysis

  • granite-vision-2b: Ultra-lightweight vision (CHF 0.11 per 1M tokens) [API only]
  • gemma-3n-e4b-it: Multimodal (text, image, audio, video), 140+ languages [API only]
  • mistral-7b: Vision and function calling at low cost [API only]
  • gemma-12b-it: Text and image input/output [API only]
  • qwen2.5-vl-72b: High-capacity vision with function calling [API only]
  • llama-4-maverick: Multimodal with function calling [API only]
  • inference-deepseek-ocr: Specialized OCR, table extraction [API only]
  • inference-llama4-scout-17b: Vision support, web search [All]

Chat & Multilingual

  • apertus-8b: Swiss LLM, AI Act compliant, multilingual [API only]
  • apertus-70b: Swiss LLM, larger variant [API only]
  • swiss-ai/Apertus-70B-Instruct-2509: Swiss LLM, Infomaniak variant [All]
  • Mistral-Small-3.2-24B-Instruct-2506: Fast, multilingual, 128K context [All]
  • granite-3-8b: Fast reasoning and instruction following [API only]
  • granite-3.1-8b: 131K context, function calling, 12 languages [API only]
  • llama-3.3-70b: 131K context, 30+ languages [API only]
  • llama3: 100K context, web search, programming [API only]
  • inference-kimi-k2: Extra large multilingual model [API only]

Quick Start

curl https://backend.schatziai.ch/v1/chat/completions \
  -H "Authorization: Bearer sk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-8b",
    "messages": [{"role": "user", "content": "Explain quantum computing in simple terms."}]
  }'
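
For callers who prefer a script over curl, the same request can be sketched in Python with only the standard library. The URL, headers, and body mirror the curl example above. The helper names (build_chat_request, send) and the SCHATZI_API_KEY environment variable are hypothetical, chosen for illustration, and the response is printed unparsed because its schema is not described on this page.

```python
import json
import os
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://backend.schatziai.ch/v1/chat/completions"


def build_chat_request(api_key, model, prompt):
    """Build (but do not send) a POST request mirroring the curl example."""
    payload = {
        "model": model,  # a Model ID from the lists above, e.g. "qwen3-8b"
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # SCHATZI_API_KEY is a hypothetical variable name, not an official one.
    api_key = os.environ.get("SCHATZI_API_KEY")
    if api_key:
        req = build_chat_request(
            api_key, "qwen3-8b", "Explain quantum computing in simple terms."
        )
        # The response schema is not documented here, so print it as-is.
        print(json.dumps(send(req), indent=2))
    else:
        print("Set SCHATZI_API_KEY to send the request.")
```

Keeping request construction separate from sending makes it possible to inspect the headers and body without touching the network.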

Listing Models Programmatically

curl https://backend.schatziai.ch/v1/models \
  -H "Authorization: Bearer sk-your-api-key"

The /v1/models endpoint returns only the models available to your authentication type. See API Endpoints for the full reference.
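
Because the listing depends on your authentication type, it can be useful to confirm that a Model ID is available before sending requests with it. The sketch below does that in Python with only the standard library. It rests on an assumption not confirmed on this page: that the response is either a list of objects with an "id" field or an object holding such a list under "data" (the OpenAI-style layout). The helper names and the SCHATZI_API_KEY environment variable are hypothetical.

```python
import json
import os
import urllib.request

# Endpoint taken from the curl example above.
MODELS_URL = "https://backend.schatziai.ch/v1/models"


def fetch_models(api_key, timeout=30):
    """GET /v1/models and return the decoded JSON response."""
    request = urllib.request.Request(
        MODELS_URL, headers={"Authorization": f"Bearer {api_key}"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def extract_model_ids(listing):
    """Return the model IDs found in a listing.

    ASSUMPTION: entries are objects with an "id" field, given either as a
    top-level list or under a "data" key. Adjust if the real schema differs.
    """
    entries = listing.get("data", []) if isinstance(listing, dict) else listing
    return [e["id"] for e in entries if isinstance(e, dict) and "id" in e]


def is_available(listing, model_id):
    """Check whether model_id appears in the listing for this API key."""
    return model_id in extract_model_ids(listing)


if __name__ == "__main__":
    # SCHATZI_API_KEY is a hypothetical variable name, not an official one.
    api_key = os.environ.get("SCHATZI_API_KEY")
    if api_key:
        listing = fetch_models(api_key)
        for model_id in extract_model_ids(listing):
            print(model_id)
        print("qwen3-8b available:", is_available(listing, "qwen3-8b"))
    else:
        print("Set SCHATZI_API_KEY to query the endpoint.")
```

If the check returns False for an ID shown in the lists above, the model is most likely not offered for your authentication type.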
