Choosing the Right Model
The right AI model depends on your task requirements, compliance needs, and budget. All Schatzi AI models run exclusively on Swiss infrastructure, so your data never leaves Switzerland. This guide helps you match your use case to the right model.
Quick Decision Guide
"I just want to draft an email or LinkedIn post"
→ Chat, Multi-lingual, Coding & function calling - Small
This model provides a versatile balance of speed and capability, supporting multiple languages and function calling, making it ideal for routine professional writing and general chat applications.
"I need to ensure Swiss/EU compliance and transparency"
→ Apertus Swiss LLM - Large
Designed for government agencies and organizations that require strict data sovereignty, this Swiss LLM is fully compliant with the AI Act and respects privacy and intellectual property. Its data and methods are documented for transparency, and it delivers frontier-level performance.
"I need complex reasoning and agent workflows"
→ Reasoning & Agent tasks - Large
Optimized for reasoning, agentic tasks, and a broad range of developer use cases, this model supports advanced function calling. It is well suited to automation workflows and sophisticated AI agents.
"I need up-to-date web search and current information"
→ Search, Chat & Analysis - Small
Optimized for web search and chat, this model is suitable for accessing current information and recent developments. It also supports vision capabilities, making it useful for content creation and storytelling.
"I need to analyze documents and images"
→ Chat & Document Analysis & Reasoning - Large
This large-scale model delivers frontier-level performance across a broad range of complex tasks. With advanced multilingual capabilities and vision support, it can analyze both text documents and images, with an optional reasoning mode to tailor responses to query complexity.
"I need maximum capability for everything"
→ Chat, Document Analysis & Agent tasks - Xtra Large
This very large-scale model combines vision, reasoning, function calling, and web search capabilities. It handles document analysis, agent tasks, and complex reasoning, dynamically tailoring responses to context and complexity. Use this when you need the highest capability across all modalities.
Model Capabilities at a Glance
Note that API-only models are accessible via the REST API but are not visible in the Chat UI.
| Model | Category | Vision | Reasoning | Web Search | Function Calling | Context Window (tokens) | Availability |
|---|---|---|---|---|---|---|---|
| Apertus Swiss LLM - Large | Swiss LLM | No | No | No | No | 65,536 | Chat UI & API |
| Apertus Swiss LLM - Large | Swiss LLM | No | No | No | No | 65,536 | API only |
| Apertus Swiss LLM - Small | Swiss LLM | No | No | No | No | 65,536 | API only |
| Document Analysis - Medium | Document Analysis | Yes | No | No | Yes | 32,768 | API only |
| Document Analysis - Small | Document Analysis | Yes | No | No | Yes | 32,768 | API only |
| Document Analysis - Small | Document Analysis | Yes | No | No | No | 32,000 | API only |
| Document Analysis - Xtra Small | Document Analysis | Yes | No | No | No | 16,384 | API only |
| Fast Reasoning & Instruction Following - Small | Thinking | No | No | No | Yes | 32,768 | API only |
| Reasoning & Problem Solving - Small | Thinking | No | Yes | No | Yes | 32,768 | API only |
| Llama 3.3 Multi-lingual - Medium | Multi-lingual | No | No | No | No | 131,072 | API only |
| Llama 4 Maverick multi modal - Small | Document Analysis | Yes | No | No | Yes | 32,768 | API only |
| Reasoning & Agent tasks - Large | Agent | No | Yes | No | Yes | 65,536 | API only |
| Reasoning & Problem Solving - Medium | Thinking | No | Yes | No | Yes | 32,768 | API only |
| Reasoning & Problem Solving - Small | Thinking | No | Yes | No | Yes | 32,768 | API only |
| Reasoning & Problem Solving - Xtra Large | Thinking | No | Yes | No | Yes | 65,536 | API only |
| Chat & Function Calling - Small (Granite 3.1) | Function Calling | No | No | No | Yes | 131,072 | API only |
| Reasoning & Problem Solving - Xtra Large | Thinking | No | No | No | No | Not specified | API only |
| Reasoning & Tool Use - Large (GLM-4.5 Air) | Thinking | No | Yes | No | Yes | 131,072 | API only |
| Chat & Document Analysis - Xtra Xtra Large | Multi-lingual | No | No | No | No | Not specified | API only |
| Search, Chat & Analysis - Small | Web Search | Yes | No | Yes | No | Not specified | API only |
| Chat & Document Analysis & Reasoning - Large | Document Analysis | Yes | Yes | No | Yes | Not specified | API only |
| Document Analysis & OCR - Small (DeepSeek OCR) | Document Analysis | Yes | No | No | No | 8,192 | API only |
| Chat, Multi-lingual, Coding & function calling - Small | Coding | No | No | No | Yes | 128,000 | Chat UI & API |
| Chat, Document Analysis, Coding & Reasoning - Xtra Large | Thinking | Yes | Yes | No | Yes | 1,000,000 | Chat UI & API |
| Chat, Vision, Document Analysis & Reasoning - Medium | Thinking | Yes | Yes | No | Yes | 256,000 | Chat UI & API |
| inference-miner-u25 | Document Analysis | No | No | No | No | Not specified | API only |
| Chat, Document Analysis & Agent tasks - Xtra Large | Agent | Yes | Yes | Yes | Yes | 250,000 | Chat UI & API |
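If you select models in code, the table can be treated as data and filtered by capability. The following is a minimal Python sketch, assuming you copy the rows you care about by hand. The `Model` dataclass and the `find_models` helper are illustrative names and not part of any Schatzi AI SDK, and only a handful of rows are included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    vision: bool
    reasoning: bool
    web_search: bool
    function_calling: bool
    chat_ui: bool  # True when the table lists "Chat UI & API"


# Hand-copied subset of the table above (not the full catalogue).
MODELS = [
    Model("Apertus Swiss LLM - Large", False, False, False, False, True),
    Model("Reasoning & Agent tasks - Large", False, True, False, True, False),
    Model("Search, Chat & Analysis - Small", True, False, True, False, False),
    Model("Chat, Multi-lingual, Coding & function calling - Small",
          False, False, False, True, True),
    Model("Chat, Document Analysis & Agent tasks - Xtra Large",
          True, True, True, True, True),
]


def find_models(models, **required):
    """Return names of models whose flags match every requested value."""
    return [
        m.name
        for m in models
        if all(getattr(m, flag) == value for flag, value in required.items())
    ]


# Models in the subset that offer both vision and web search.
print(find_models(MODELS, vision=True, web_search=True))
```

Adding `reasoning=True` to the same call narrows the result to Chat, Document Analysis & Agent tasks - Xtra Large, which matches the "maximum capability" recommendation in the Quick Decision Guide.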
Pricing Overview
Input and output prices for all models are listed in CHF per million tokens.
Pricing is subject to change at our discretion.
For more details on how tokens are calculated and billed, see Understanding Tokens.
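Because prices are quoted per million tokens, a rough cost estimate is simple arithmetic. The sketch below assumes input and output tokens are billed separately at their respective rates. The function name and the prices in the example are hypothetical placeholders, not actual Schatzi AI prices, so substitute the current values for your model.

```python
def estimate_cost_chf(input_tokens, output_tokens, input_price, output_price):
    """Estimate a request's cost in CHF.

    input_price and output_price are in CHF per million tokens.
    """
    per_million = 1_000_000
    input_cost = (input_tokens / per_million) * input_price
    output_cost = (output_tokens / per_million) * output_price
    return input_cost + output_cost


# Hypothetical prices for illustration only: 1.00 CHF (input) and
# 2.00 CHF (output) per million tokens.
cost = estimate_cost_chf(200_000, 50_000, input_price=1.00, output_price=2.00)
print(f"{cost:.2f} CHF")
```

With these placeholder prices, 200,000 input tokens cost 0.20 CHF and 50,000 output tokens cost 0.10 CHF, for a total of 0.30 CHF.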
General Tips
Start Simple
Begin with cost-effective models like Chat, Multi-lingual, Coding & function calling - Small for routine tasks. You can always switch to more powerful models if you need additional capabilities such as vision or reasoning. See our Lite Plan for economical entry-level access.
Match Compliance Requirements
For government agencies, healthcare, or any organization requiring strict data sovereignty and AI Act compliance, Apertus Swiss LLM - Large is the recommended choice. While all models run on Swiss infrastructure with data never leaving Switzerland, Apertus provides additional transparency and compliance documentation. Learn more about our Data Protection standards.
Mix and Match
Using different models for different tasks in your workflow is often more efficient than relying on a single one:
- Chat, Multi-lingual, Coding & function calling - Small for daily correspondence and quick questions.
- Reasoning & Agent tasks - Large for automation and complex analysis.
- Search, Chat & Analysis - Small for research requiring current web data.
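In an application, this kind of mix can be expressed as a small routing table. The sketch below is one possible shape, assuming your code already knows the type of each task. The task labels (`correspondence`, `automation`, `web_research`) and the `pick_model` helper are hypothetical, and the strings are the display names used in this guide, which may differ from the identifiers the API expects.

```python
# Task type -> model display name, following the list above.
ROUTES = {
    "correspondence": "Chat, Multi-lingual, Coding & function calling - Small",
    "automation": "Reasoning & Agent tasks - Large",
    "web_research": "Search, Chat & Analysis - Small",
}

# Fall back to the cost-effective general model for unknown task types.
DEFAULT_MODEL = ROUTES["correspondence"]


def pick_model(task_type):
    """Return the model for a task type, or the default if it is unknown."""
    return ROUTES.get(task_type, DEFAULT_MODEL)


print(pick_model("automation"))
print(pick_model("something_else"))
```

Keeping the mapping in one place makes it easy to adjust when you upgrade or downgrade a model for a given kind of task.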
Consider Context Window
Understanding context windows helps you select the right model for your document length:
- Short tasks (< 1 page): Any model works well.
- Medium tasks (1-20 pages): Chat, Multi-lingual, Coding & function calling - Small (128K context) or Apertus Swiss LLM - Large (65K context).
- Long tasks (20+ pages): Use models with large context windows like Chat, Vision, Document Analysis & Reasoning - Medium (256K context).
- Extremely long documents: Consider Chat, Document Analysis, Coding & Reasoning - Xtra Large for its 1,000,000 token capacity.
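To check whether a specific document fits, you can estimate its token count and compare it with the context windows of the models named above. The sketch below does this for those four models. The figure of 700 tokens per page and the 4,000-token reserve for instructions and the response are assumptions for illustration, not Schatzi AI figures, and real token counts vary with language and formatting. The function name is hypothetical.

```python
# Context windows (in tokens) of the models named in this section.
CONTEXT_WINDOWS = [
    ("Apertus Swiss LLM - Large", 65_536),
    ("Chat, Multi-lingual, Coding & function calling - Small", 128_000),
    ("Chat, Vision, Document Analysis & Reasoning - Medium", 256_000),
    ("Chat, Document Analysis, Coding & Reasoning - Xtra Large", 1_000_000),
]


def smallest_fitting_model(pages, tokens_per_page=700, reserve=4_000):
    """Return the model with the smallest window that fits the document.

    tokens_per_page and reserve are rough assumptions, not measured values.
    Returns None when the estimate exceeds every listed window.
    """
    needed = pages * tokens_per_page + reserve
    for name, window in sorted(CONTEXT_WINDOWS, key=lambda item: item[1]):
        if needed <= window:
            return name
    return None


print(smallest_fitting_model(150))
```

This checks fit only. Capabilities such as vision or reasoning, and price, still decide which of the fitting models is the right one.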
When to Upgrade / Downgrade
When to Upgrade
Consider switching to a more capable model when:
- Results lack depth: Move from Chat, Multi-lingual, Coding & function calling - Small to Chat & Document Analysis & Reasoning - Large for more sophisticated analysis.
- You need reasoning capabilities: Switch to Reasoning & Agent tasks - Large or Chat, Document Analysis & Agent tasks - Xtra Large when complex logic or agent workflows are required.
- Vision is needed: Upgrade to Search, Chat & Analysis - Small, Chat & Document Analysis & Reasoning - Large, or Chat, Document Analysis & Agent tasks - Xtra Large when analyzing images or charts.
- Maximum capability required: Use Chat, Document Analysis & Agent tasks - Xtra Large for tasks requiring the combination of reasoning, vision, function calling, and web search.
When to Downgrade
Consider switching to a simpler model when:
- Speed is the priority: Use Chat, Multi-lingual, Coding & function calling - Small instead of larger models for faster response times in routine tasks.
- Cost optimization: For simple chat and coding tasks, Chat, Multi-lingual, Coding & function calling - Small offers excellent value with input pricing at ... per million tokens.
- Basic web search: Search, Chat & Analysis - Small is sufficient for web search and basic conversational needs without the overhead of maximum-scale models.
- No special features needed: If your task does not require vision, reasoning, or function calling, avoid paying premium rates for models that include these capabilities.
Build a personal model selection cheat sheet based on your recurring workflows. Most users find that 2-3 models cover 90% of their needs, typically Chat, Multi-lingual, Coding & function calling - Small for quick tasks, Reasoning & Agent tasks - Large for specialized work, and Apertus Swiss LLM - Large for compliance-sensitive operations. Monitor your usage patterns in the dashboard to refine your selection over time.