
Choosing the Right Model

The right AI model depends on your task requirements, compliance needs, and budget. All Schatzi AI models run exclusively on Swiss infrastructure, so your data never leaves Switzerland. This guide helps you match your use case to the right model.

Quick Decision Guide

"I just want to draft an email or LinkedIn post"

→ Chat, Multi-lingual, Coding & function calling - Small

This model balances speed and capability. It supports multiple languages and function calling, which makes it well suited to routine professional writing and general chat applications.

"I need to ensure Swiss/EU compliance and transparency"

→ Apertus Swiss LLM - Large

Designed for government agencies and organizations that require strict data sovereignty, this Swiss LLM is fully compliant with the EU AI Act and respects privacy and intellectual property. Its data and methods are documented, providing a high degree of transparency, and it delivers frontier-level performance.

"I need complex reasoning and agent workflows"

→ Reasoning & Agent tasks - Large

Optimized for powerful reasoning, agentic tasks, and versatile developer use cases, this model supports advanced function calling and reasoning capabilities. It is the ideal choice for automation workflows and sophisticated AI agents.

"I need up-to-date web search and current information"

→ Search, Chat & Analysis - Small

Optimized for web search and chat, this model is suitable for accessing current information and recent developments. It also supports vision capabilities, making it useful for content creation and storytelling.

"I need to analyze documents and images"

→ Chat & Document Analysis & Reasoning - Large

This large-scale model delivers frontier-level performance across a broad range of complex tasks. With advanced multilingual capabilities and vision support, it can analyze both text documents and images, with an optional reasoning mode to tailor responses to query complexity.

"I need maximum capability for everything"

→ Chat, Document Analysis & Agent tasks - Xtra Large

This very large-scale model combines vision, reasoning, function calling, and web search capabilities. It handles document analysis, agent tasks, and complex reasoning, dynamically tailoring responses to context and complexity. Use this when you need the highest capability across all modalities.

Model Capabilities at a Glance

Note that API-only models are accessible via the REST API but are not visible in the Chat UI.

Model | Category | Vision | Reasoning | Web Search | Function Calling | Context Window (tokens) | Availability
Apertus Swiss LLM - Large | Swiss LLM | No | No | No | No | 65,536 | Chat UI & API
Apertus Swiss LLM - Large | Swiss LLM | No | No | No | No | 65,536 | API only
Apertus Swiss LLM - Small | Swiss LLM | No | No | No | No | 65,536 | API only
Document Analysis - Medium | Document Analysis | Yes | No | No | Yes | 32,768 | API only
Document Analysis - Small | Document Analysis | Yes | No | No | Yes | 32,768 | API only
Document Analysis - Small | Document Analysis | Yes | No | No | No | 32,000 | API only
Document Analysis - Xtra Small | Document Analysis | Yes | No | No | No | 16,384 | API only
Fast Reasoning & Instruction Following - Small | Thinking | No | No | No | Yes | 32,768 | API only
Reasoning & Problem Solving - Small | Thinking | No | Yes | No | Yes | 32,768 | API only
Llama 3.3 Multi-lingual - Medium | Multi-lingual | No | No | No | No | 131,072 | API only
Llama 4 Maverick multi modal - Small | Document Analysis | Yes | No | No | Yes | 32,768 | API only
Reasoning & Agent tasks - Large | Agent | No | Yes | No | Yes | 65,536 | API only
Reasoning & Problem Solving - Medium | Thinking | No | Yes | No | Yes | 32,768 | API only
Reasoning & Problem Solving - Small | Thinking | No | Yes | No | Yes | 32,768 | API only
Reasoning & Problem Solving - Xtra Large | Thinking | No | Yes | No | Yes | 65,536 | API only
Chat & Function Calling - Small (Granite 3.1) | Function Calling | No | No | No | Yes | 131,072 | API only
Reasoning & Problem Solving - Xtra Large | Thinking | No | No | No | No | Not specified | API only
Reasoning & Tool Use - Large (GLM-4.5 Air) | Thinking | No | Yes | No | Yes | 131,072 | API only
Chat & Document Analysis - Xtra Xtra Large | Multi-lingual | No | No | No | No | Not specified | API only
Search, Chat & Analysis - Small | Web Search | Yes | No | Yes | No | Not specified | API only
Chat & Document Analysis & Reasoning - Large | Document Analysis | Yes | Yes | No | Yes | Not specified | API only
Document Analysis & OCR - Small (DeepSeek OCR) | Document Analysis | Yes | No | No | No | 8,192 | API only
Chat, Multi-lingual, Coding & function calling - Small | Coding | No | No | No | Yes | 128,000 | Chat UI & API
Chat, Document Analysis, Coding & Reasoning - Xtra Large | Thinking | Yes | Yes | No | Yes | 1,000,000 | Chat UI & API
Chat, Vision, Document Analysis & Reasoning - Medium | Thinking | Yes | Yes | No | Yes | 256,000 | Chat UI & API
inference-miner-u25 | Document Analysis | No | No | No | No | Not specified | API only
Chat, Document Analysis & Agent tasks - Xtra Large | Agent | Yes | Yes | Yes | Yes | 250,000 | Chat UI & API
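
If you prefer to query the table programmatically, the lookup can be sketched in a few lines of Python. This is a minimal illustration, assuming you transcribe the rows you care about into your own data structure. The `Model` class and the `find_models` function are hypothetical names chosen for this example, not part of any Schatzi AI SDK, and only the rows marked "Chat UI & API" are included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    vision: bool
    reasoning: bool
    web_search: bool
    function_calling: bool
    context_window: int  # tokens


# Rows copied from the table above (Chat UI & API models only).
MODELS = [
    Model("Apertus Swiss LLM - Large", False, False, False, False, 65_536),
    Model("Chat, Multi-lingual, Coding & function calling - Small",
          False, False, False, True, 128_000),
    Model("Chat, Document Analysis, Coding & Reasoning - Xtra Large",
          True, True, False, True, 1_000_000),
    Model("Chat, Vision, Document Analysis & Reasoning - Medium",
          True, True, False, True, 256_000),
    Model("Chat, Document Analysis & Agent tasks - Xtra Large",
          True, True, True, True, 250_000),
]


def find_models(models, required=(), min_context=0):
    """Return the names of models that have every capability in `required`
    (field names such as "vision") and at least `min_context` tokens of context."""
    return [
        m.name
        for m in models
        if m.context_window >= min_context
        and all(getattr(m, capability) for capability in required)
    ]


if __name__ == "__main__":
    # Which models combine vision and web search?
    print(find_models(MODELS, required=("vision", "web_search")))
```

Keeping the capabilities as plain boolean fields means a new requirement is just another name in `required`, and the list order of `MODELS` is preserved in the result.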

Pricing Overview

The following table provides a complete comparison of input and output pricing for all models. Prices are shown in CHF per million tokens.


Pricing Notice

Pricing is subject to change at our discretion.

For more details on how tokens are calculated and billed, see Understanding Tokens.
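
Because prices are quoted per million tokens, with separate input and output rates, the cost of a request can be estimated with simple arithmetic. The sketch below assumes that input and output tokens are billed independently at those two rates. The function name `estimate_cost_chf` is hypothetical, and the prices in the usage example are placeholders for illustration, not actual Schatzi AI prices.

```python
def estimate_cost_chf(input_tokens, output_tokens, input_price, output_price):
    """Estimate the cost of a request in CHF.

    `input_price` and `output_price` are in CHF per million tokens.
    ASSUMPTION: input and output tokens are billed separately at these rates.
    """
    if min(input_tokens, output_tokens, input_price, output_price) < 0:
        raise ValueError("token counts and prices must be non-negative")
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Placeholder prices (NOT real prices): 1.00 CHF input, 2.00 CHF output
    # per million tokens, for 200,000 input and 50,000 output tokens.
    cost = estimate_cost_chf(200_000, 50_000, input_price=1.0, output_price=2.0)
    print(f"Estimated cost: CHF {cost:.2f}")
```

Substitute the current rates from the pricing table for the placeholder values before relying on the result.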

General Tips

Start Simple

Begin with cost-effective models like Chat, Multi-lingual, Coding & function calling - Small for routine tasks. You can always switch to more powerful models if you need additional capabilities such as vision or reasoning. See our Lite Plan for economical entry-level access.

Match Compliance Requirements

For government agencies, healthcare, or any organization requiring strict data sovereignty and AI Act compliance, Apertus Swiss LLM - Large is the recommended choice. While all models run on Swiss infrastructure with data never leaving Switzerland, Apertus provides additional transparency and compliance documentation. Learn more about our Data Protection standards.

Mix and Match

It's efficient to use different models for different tasks throughout your workflow:

  • Chat, Multi-lingual, Coding & function calling - Small for daily correspondence and quick questions.
  • Reasoning & Agent tasks - Large for automation and complex analysis.
  • Search, Chat & Analysis - Small for research requiring current web data.
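
In an application, the split above can be expressed as a small routing table. The following Python sketch is one possible way to do it. The `Task` categories and the `pick_model` function are hypothetical names for this example, and the strings are the display names used in this guide, which may differ from the identifiers the API expects.

```python
from enum import Enum


class Task(Enum):
    CORRESPONDENCE = "correspondence"  # daily correspondence, quick questions
    AUTOMATION = "automation"          # automation and complex analysis
    WEB_RESEARCH = "web_research"      # research requiring current web data


# Mapping taken from the list above. ASSUMPTION: display names are used
# here; replace them with the model identifiers your API client requires.
MODEL_FOR_TASK = {
    Task.CORRESPONDENCE: "Chat, Multi-lingual, Coding & function calling - Small",
    Task.AUTOMATION: "Reasoning & Agent tasks - Large",
    Task.WEB_RESEARCH: "Search, Chat & Analysis - Small",
}


def pick_model(task: Task) -> str:
    """Return the model recommended in this guide for the given task type."""
    return MODEL_FOR_TASK[task]


if __name__ == "__main__":
    for task in Task:
        print(f"{task.value}: {pick_model(task)}")
```

Keeping the mapping in one dictionary makes it easy to change a recommendation later without touching the calling code.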

Consider Context Window

The context window is the maximum number of tokens a model can process in a single request, so match it to the length of your documents:

  • Short tasks (< 1 page): Any model works well.
  • Medium tasks (1-20 pages): Chat, Multi-lingual, Coding & function calling - Small (128K context) or Apertus Swiss LLM - Large (65K context).
  • Long tasks (20+ pages): Use models with large context windows like Chat, Vision, Document Analysis & Reasoning - Medium (256K context).
  • Extremely long documents: Consider Chat, Document Analysis, Coding & Reasoning - Xtra Large for its 1,000,000 token capacity.
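
The same idea can be checked against an estimated token count instead of a page count. The sketch below is an illustration only, and it rests on several assumptions: roughly 4 characters per token (a common rule of thumb for English text that varies by tokenizer and language), a reply budget that counts toward the context window, and a ladder limited to three of the models named above. The function names are hypothetical.

```python
import math

# (context window in tokens, model name), smallest first; values from this guide.
CONTEXT_LADDER = [
    (128_000, "Chat, Multi-lingual, Coding & function calling - Small"),
    (256_000, "Chat, Vision, Document Analysis & Reasoning - Medium"),
    (1_000_000, "Chat, Document Analysis, Coding & Reasoning - Xtra Large"),
]


def rough_token_count(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate. ASSUMPTION: about 4 characters per token."""
    return math.ceil(len(text) / chars_per_token)


def smallest_fitting_model(prompt_tokens: int, reply_tokens: int = 4_000):
    """Return the first model in the ladder whose context window holds the
    prompt plus a reply budget, or None if none of them is large enough.

    ASSUMPTION: the reply counts toward the context window.
    """
    needed = prompt_tokens + reply_tokens
    for window, name in CONTEXT_LADDER:
        if needed <= window:
            return name
    return None


if __name__ == "__main__":
    document = "word " * 100_000  # stand-in for a long document
    tokens = rough_token_count(document)
    print(tokens, smallest_fitting_model(tokens))
```

Treat the estimate as a sanity check and leave headroom; an exact count requires the tokenizer of the model you actually call.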

When to Upgrade / Downgrade

When to Upgrade

Consider switching to a more capable model when:

  • Results lack depth: Move from Chat, Multi-lingual, Coding & function calling - Small to Chat & Document Analysis & Reasoning - Large for more sophisticated analysis.
  • You need reasoning capabilities: Switch to Reasoning & Agent tasks - Large or Chat, Document Analysis & Agent tasks - Xtra Large when complex logic or agent workflows are required.
  • Vision is needed: Upgrade to Search, Chat & Analysis - Small, Chat & Document Analysis & Reasoning - Large, or Chat, Document Analysis & Agent tasks - Xtra Large when analyzing images or charts.
  • Maximum capability required: Use Chat, Document Analysis & Agent tasks - Xtra Large for tasks requiring the combination of reasoning, vision, function calling, and web search.

When to Downgrade

Consider switching to a simpler model when:

  • Speed is priority: Use Chat, Multi-lingual, Coding & function calling - Small instead of larger models for faster response times in routine tasks.
  • Cost optimization: For simple chat and coding tasks, Chat, Multi-lingual, Coding & function calling - Small offers excellent value with input pricing at ... per million tokens.
  • Basic web search: Search, Chat & Analysis - Small is sufficient for web search and basic conversational needs without the overhead of maximum-scale models.
  • No special features needed: If your task does not require vision, reasoning, or function calling, avoid paying premium rates for models that include these capabilities.

Pro Tip

Build a personal model selection cheat sheet based on your recurring workflows. Most users find that 2-3 models cover 90% of their needs: typically Chat, Multi-lingual, Coding & function calling - Small for quick tasks, Reasoning & Agent tasks - Large for automation and complex analysis, and Apertus Swiss LLM - Large for compliance-sensitive operations. Monitor your usage patterns in the dashboard to refine your selection over time.