Available API Models
This page lists all models accessible via the REST API (API key authentication). Prices and availability are fetched live from the backend.
How to use these models
Pass the Model ID as the model parameter in your API requests. See Quick Start below.
All API Models
Pricing Notice
Pricing is subject to change at our discretion. Use the /v1/models endpoint for current pricing.
Model Categories
Reasoning & Problem Solving
- qwen3-8b: Lightweight reasoning, extremely cost-efficient (CHF 0.04/1M input) [API only]
- gpt-oss-120b: Reasoning & agentic tasks, great value (CHF 0.16/1M input) [All]
- deepseek-r1-70b: Mid-range reasoning with function calling [API only]
- qwq-32b: Balanced reasoning with 32K context [API only]
- inference-glm45-air-110b: Hybrid reasoning, configurable thinking mode, strong tool use [API only]
- deepseek-r1-670b: Maximum reasoning capability, 65K context [API only]
- inference-deepseek-v32: Extended reasoning with streaming [API only]
- Qwen/Qwen3-VL-235B-A22B-Instruct: 262K context, function calling, reasoning [API only]
- inference-qwen3-vl-235b: Vision + reasoning + function calling [All]
- moonshotai/Kimi-K2.5: 256K context, full multimodal, agentic tasks [All]
Vision & Document Analysis
- granite-vision-2b: Ultra-lightweight vision (CHF 0.11/1M) [API only]
- gemma-3n-e4b-it: Multimodal (text, image, audio, video), 140+ languages [API only]
- mistral-7b: Vision + function calling at low cost [API only]
- gemma-12b-it: Text and image input/output [API only]
- qwen2.5-vl-72b: High-capacity vision with function calling [API only]
- llama-4-maverick: Multimodal with function calling [API only]
- inference-deepseek-ocr: Specialized OCR, table extraction [API only]
- inference-llama4-scout-17b: Vision support, web search [All]
Chat & Multilingual
- apertus-8b: Swiss LLM, AI Act compliant, multilingual [API only]
- apertus-70b: Swiss LLM, larger variant [API only]
- swiss-ai/Apertus-70B-Instruct-2509: Swiss LLM, Infomaniak variant [All]
- Mistral-Small-3.2-24B-Instruct-2506: Fast, multilingual, 128K context [All]
- granite-3-8b: Fast reasoning and instruction following [API only]
- granite-3.1-8b: 131K context, function calling, 12 languages [API only]
- llama-3.3-70b: 131K context, 30+ languages [API only]
- llama3: 100K context, web search, programming [API only]
- inference-kimi-k2: Extra large multilingual model [API only]
Quick Start
curl https://backend.schatziai.ch/v1/chat/completions \
-H "Authorization: Bearer sk-your-api-key" \
-H "Content-Type: application/json" \
-d '{
"model": "qwen3-8b",
"messages": [{"role": "user", "content": "Explain quantum computing in simple terms."}]
}'
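The same request can be sketched in Python using only the standard library. The helper names `build_chat_request` and `send` are illustrative assumptions, not part of any official client; the URL, headers, and payload mirror the curl command above. The response is returned as decoded JSON without further parsing, because its schema is not described on this page.

```python
import json
import urllib.request

# Base URL taken from the curl example above.
API_BASE = "https://backend.schatziai.ch/v1"


def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to /chat/completions.

    Hypothetical helper: it only reproduces the fields shown in the curl example.
    """
    payload = {
        "model": model,  # the Model ID from the model list
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{API_BASE}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON body.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


# Usage (performs a real network call, so it is left commented out):
# req = build_chat_request("sk-your-api-key", "qwen3-8b",
#                          "Explain quantum computing in simple terms.")
# print(json.dumps(send(req), indent=2))
```

Separating request construction from sending keeps the payload easy to inspect or log before any network call is made.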
Listing Models Programmatically
curl https://backend.schatziai.ch/v1/models \
-H "Authorization: Bearer sk-your-api-key"
The /v1/models endpoint returns only models available to your authentication type. See API Endpoints for full reference.
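A minimal Python sketch of the listing call follows. The response shape is an assumption: this page does not document it, so the code assumes a JSON object with a `data` list whose entries carry an `id` field, as is common for chat-completions-style APIs. The helper names `build_models_request` and `extract_model_ids` are hypothetical, and pricing fields are not parsed because their names are not given here.

```python
import json
import urllib.request

# Base URL taken from the curl example above.
API_BASE = "https://backend.schatziai.ch/v1"


def build_models_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request to /models."""
    return urllib.request.Request(
        f"{API_BASE}/models",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def extract_model_ids(payload: dict) -> list[str]:
    """Collect model IDs from a decoded response.

    ASSUMPTION: the payload looks like {"data": [{"id": "..."}, ...]}.
    Entries without an "id" key are skipped rather than raising.
    """
    entries = payload.get("data", [])
    return [e["id"] for e in entries if isinstance(e, dict) and "id" in e]


# Usage (performs a real network call, so it is left commented out):
# with urllib.request.urlopen(build_models_request("sk-your-api-key"), timeout=30) as resp:
#     print(extract_model_ids(json.load(resp)))
```

If the actual response uses a different structure, only `extract_model_ids` needs to change.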
Related Documentation
- API Overview — Getting started with the API
- Authentication — API key management
- API Endpoints — Full endpoint reference
- Code Examples — Integration examples
- Model Comparison — All models including Chat UI details