Model Comparison

Schatzi AI provides access to 20 active state-of-the-art AI models. All models run exclusively on Swiss infrastructure, ensuring your data never leaves Switzerland and remains compliant with Swiss data protection regulations.

Complete Model Reference

Swiss LLM (AI Act Compliant)

Apertus Swiss LLM - Large

Context Window: 65,536 tokens
Streaming: Supported
Availability: Chat UI & API

Ideal for multilingual services, government agencies and R&D teams looking for a reliable, adaptable model ● Data and methods documented for unprecedented transparency ● Compliant with the AI Act and respectful of privacy and intellectual property ● A 70B version with performance on a par with current market leaders

Capabilities: Chat, Multi-lingual, Swiss LLM

Pricing:

Input: ... per million tokens
Output: ... per million tokens

Apertus Swiss LLM - Large (v1.5)

Context Window: 65,536 tokens
Streaming: Supported
Availability: API only

Apertus v1.5. Ideal for multilingual services, government agencies and R&D teams looking for a reliable, adaptable model ● Data and methods documented for unprecedented transparency ● Compliant with the AI Act and respectful of privacy and intellectual property ● A 70B version with performance on a par with current market leaders

Capabilities: Chat, Multi-lingual, Swiss LLM

Pricing:

Input: ... per million tokens
Output: ... per million tokens

Reasoning & Problem-Solving

Chat & Document Analysis & Reasoning - Large

Context Window: Not specified
Streaming: Supported
Availability: API only

large-scale model, rivalling leading models or leading models Opus across a broad range of complex tasks ● Advanced multilingual capabilities ● Reasoning mode can be enabled to dynamically tailor responses to the context and complexity of queries

Capabilities: Document Analysis, Chat, Vision, Reasoning, Function Calling

Pricing:

Input: ... per million tokens
Output: ... per million tokens

Fast Reasoning & Instruction Following - Small

Context Window: 32,768 tokens
Streaming: Supported
Availability: API only

Optimized for Reasoning and instruction-following capabilities

Capabilities: Thinking, Chat, Data Analysis, Function Calling

Pricing:

Input: ... per million tokens
Output: ... per million tokens

Reasoning & Problem Solving - Small

Context Window: 32,768 tokens
Streaming: Supported
Availability: API only

Optimized for thinking and reasoning

Capabilities: Thinking, Chat, Reasoning, Function Calling

Pricing:

Input: ... per million tokens
Output: ... per million tokens

Reasoning & Agent tasks - Large

Context Window: 65,536 tokens
Streaming: Supported
Availability: API only

Optimized for powerful reasoning, agentic tasks, and versatile developer use cases

Capabilities: Data Analysis, Chat, Thinking, Agent, Reasoning, Function Calling

Pricing:

Input: ... per million tokens
Output: ... per million tokens

Reasoning & Problem Solving - Medium

Context Window: 32,768 tokens
Streaming: Supported
Availability: API only

Optimized for thinking and reasoning

Capabilities: Thinking, Chat, Reasoning, Function Calling

Pricing:

Input: ... per million tokens
Output: ... per million tokens

Reasoning & Problem Solving - Xtra Large

Context Window: Not specified
Streaming: Supported
Availability: API only

Optimized for Reasoning chat completions. Reasoning model

Capabilities: Thinking, Chat

Pricing:

Input: ... per million tokens
Output: ... per million tokens

Reasoning & Tool Use - Large (GLM-4.5 Air)

Context Window: 131,072 tokens
Streaming: Supported
Availability: API only

ZhipuAI GLM-4.5-Air. Mixture-of-Experts model with 106B total / 12B active parameters. Hybrid reasoning with configurable thinking mode, strong tool/function calling and code generation capabilities. 128K context window.

Capabilities: Thinking, Chat, Function Calling, Reasoning

Pricing:

Input: ... per million tokens
Output: ... per million tokens

Chat, Document Analysis & Agent tasks - Xtra Large

Context Window: 250,000 tokens
Streaming: Supported
Availability: Chat UI & API

Very large-scale model, rivalling leading models or leading models Opus across a broad range of complex tasks ● Advanced multilingual capabilities ● Reasoning mode can be enabled to dynamically tailor responses to the context and complexity of queries ● Optimized for powerful reasoning, agentic tasks, and versatile developer use cases

Capabilities: Chat, Document Analysis, Agent, Coding, Thinking, Web Search, Vision, Reasoning, Function Calling

Pricing:

Input: ... per million tokens
Output: ... per million tokens

Chat, Document Analysis, Coding & Reasoning - Xtra Large

Context Window: 1,000,000 tokens
Streaming: Supported
Availability: Chat UI & API

Multi modal model, optimized for chat, document analysis, coding and reasoning.

Capabilities: Chat, Document Analysis, Coding, Thinking, Data Analysis, Vision, Reasoning, Function Calling

Pricing:

Input: ... per million tokens
Output: ... per million tokens

Chat, Vision, Document Analysis & Reasoning - Medium

Context Window: 256,000 tokens
Streaming: Supported
Availability: Chat UI & API

Best in class multi-modal model, optimized for chat, vision, document analysis, coding and reasoning.

Capabilities: Chat, Vision, Document Analysis, Coding, Thinking, Reasoning, Function Calling

Pricing:

Input: ... per million tokens
Output: ... per million tokens

Reasoning & Tool Use - Xtra Large (GLM-5.2)

Context Window: Not specified
Streaming: Supported
Availability: Chat UI & API

ZhipuAI GLM-5.2 Hybrid reasoning with configurable thinking mode, strong tool/function calling and code generation capabilities. Same pricing tier as GLM-5.1 on this provider; exact parameter count not published by the provider.

Capabilities: Thinking, Chat, Function Calling, Reasoning

Pricing:

Input: ... per million tokens
Output: ... per million tokens

Vision & Document Analysis

Document Analysis - Small

Context Window: 32,768 tokens
Streaming: Supported
Availability: API only

Optimized for multilingual dialogue use cases

Capabilities: Document Analysis, Chat, Vision, Function Calling

Pricing:

Input: ... per million tokens
Output: ... per million tokens

Document Analysis - Xtra Small

Context Window: 16,384 tokens
Streaming: Supported
Availability: API only

Optimized for compact and efficient vision-language model

Capabilities: Document Analysis, Chat, Vision

Pricing:

Input: ... per million tokens
Output: ... per million tokens

Context Window: 32,768 tokens
Streaming: Supported
Availability: API only

Optimized for text and multimodal experiences

Capabilities: Document Analysis, Chat, Vision, Function Calling

Pricing:

Input: ... per million tokens
Output: ... per million tokens

Document Analysis & OCR - Small (DeepSeek OCR)

Context Window: 8,192 tokens
Streaming: Supported
Availability: API only

DeepSeek OCR. 3B parameter vision-language model specialized for optical character recognition and document understanding. Excels at converting documents to structured text/markdown, table extraction, and mathematical content recognition.

Capabilities: Document Analysis, Vision

Pricing:

Input: ... per million tokens
Output: ... per million tokens

inference-miner-u25

Context Window: Not specified
Streaming: Supported
Availability: API only

Vision-language model optimized for document analysis and parsing.

Capabilities: Vision, Document Analysis

Pricing:

Input: ... per million tokens
Output: ... per million tokens

Multilingual

Chat, Multi-lingual, Coding & function calling - Small

Context Window: 128,000 tokens
Streaming: Supported
Availability: Chat UI & API

Mistral

Capabilities: Chat, Multi-lingual, Coding, Function Calling

Pricing:

Input: ... per million tokens
Output: ... per million tokens

Chat & General Purpose

Search, Chat & Analysis - Small

Context Window: Not specified
Streaming: Supported
Availability: API only

Optimized for web search and chat. Suitable for artists and content creation, including storytelling

Capabilities: Web Search, Chat, Vision

Pricing:

Input: ... per million tokens
Output: ... per million tokens

Complete Pricing Overview

Loading prices...

Pricing Notice

Pricing is subject to change at our discretion.

Cost Estimation

Task Type	Typical Token Usage	Recommended Model
Email or message draft	500 input + 300 output	Apertus Swiss LLM - Large
10-page document summary	10K input + 1K output	Chat & Document Analysis & Reasoning - Large
Contract analysis (30 pages)	30K input + 2K output	Apertus Swiss LLM - Large
Complex reasoning task	5K input + 3K output	Chat & Document Analysis & Reasoning - Large
Multilingual exchange	15K input + 10K output	Apertus Swiss LLM - Large

Model Availability

Schatzi AI offers models across two channels:

Chat UI (via chat.schatziai.ch): models available in the Chat UI & API
API (via REST API with API keys): models available via API

Chat UI Models

Apertus Swiss LLM - Large
Chat, Document Analysis & Agent tasks - Xtra Large
Chat, Multi-lingual, Coding & function calling - Small
Chat, Document Analysis, Coding & Reasoning - Xtra Large
Chat, Vision, Document Analysis & Reasoning - Medium
Reasoning & Tool Use - Xtra Large (GLM-5.2)

API-Only Models

These models are only accessible via API key and are not visible in the Chat UI. See Available API Models for the full API reference.

Chat & Document Analysis & Reasoning - Large
Document Analysis - Small
Document Analysis - Xtra Small
Fast Reasoning & Instruction Following - Small
Reasoning & Problem Solving - Small
Llama 4 Maverick multi modal - Small
Reasoning & Agent tasks - Large
Reasoning & Problem Solving - Medium
Reasoning & Problem Solving - Xtra Large
Reasoning & Tool Use - Large (GLM-4.5 Air)
Search, Chat & Analysis - Small
Document Analysis & OCR - Small (DeepSeek OCR)
inference-miner-u25
Apertus Swiss LLM - Large (v1.5)

Decommissioned Models

These models are no longer available and are listed for historical reference.

Chat & Document Analysis - Medium — decommissioned on 2026-06-01, replaced by Chat, Multi-lingual, Coding & function calling - Small
Search, Chat & Analysis - Large — decommissioned on 2026-06-01, replaced by Chat, Document Analysis, Coding & Reasoning - Xtra Large
Apertus Swiss LLM - Small — decommissioned on 2026-05-28, replaced by Apertus Swiss LLM - Large
Document Analysis - Medium — decommissioned on 2026-05-28, replaced by Chat & Document Analysis & Reasoning - Large
Llama 3.3 Multi-lingual - Medium — decommissioned on 2026-05-28, replaced by Llama 4 Maverick multi modal - Small
Chat & Vision - Small (Gemma 3n) — decommissioned on 2026-06-01, replaced by Chat, Vision, Document Analysis, Coding & Reasoning - Medium (Gemma 4)
Chat & Function Calling - Small (Granite 3.1) — decommissioned on 2026-06-01, replaced by Chat, Multi-lingual, Coding & function calling - Small
Reasoning & Agentic - Large (GPT-OSS 120B) — decommissioned on 2026-05-11, replaced by Chat, Document Analysis, Coding & Reasoning - Xtra Large
Embedding - Multilingual (BGE-M3) — decommissioned
Embedding - Multilingual (Granite 278M) — decommissioned
Chat & Document Analysis - Xtra Xtra Large — decommissioned on 2026-05-28
inference-bge-reranker — decommissioned

FAQ

Q: Can I switch models mid-conversation? A: Yes. You can change models at any time; the conversation context carries over, though very long contexts may be truncated on a model with a smaller context window.

Q: Which model is best for Swiss compliance-sensitive work? A: Apertus Swiss LLM - Large runs on Swiss infrastructure and is optimised for data-sovereignty-sensitive use cases.

Q: Do prices update automatically? A: Yes. Prices shown here are fetched live and always reflect current pricing.

Complete Model Reference​

Swiss LLM (AI Act Compliant)​

Apertus Swiss LLM - Large​

Apertus Swiss LLM - Large (v1.5)​

Reasoning & Problem-Solving​

Chat & Document Analysis & Reasoning - Large​

Fast Reasoning & Instruction Following - Small​

Reasoning & Problem Solving - Small​

Reasoning & Agent tasks - Large​

Reasoning & Problem Solving - Medium​

Reasoning & Problem Solving - Xtra Large​

Reasoning & Tool Use - Large (GLM-4.5 Air)​

Chat, Document Analysis & Agent tasks - Xtra Large​

Chat, Document Analysis, Coding & Reasoning - Xtra Large​

Chat, Vision, Document Analysis & Reasoning - Medium​

Reasoning & Tool Use - Xtra Large (GLM-5.2)​

Vision & Document Analysis​

Document Analysis - Small​

Document Analysis - Xtra Small​

Llama 4 Maverick multi modal - Small​

Document Analysis & OCR - Small (DeepSeek OCR)​

inference-miner-u25​

Multilingual​

Chat, Multi-lingual, Coding & function calling - Small​

Chat & General Purpose​

Search, Chat & Analysis - Small​

Complete Pricing Overview​

Cost Estimation​

Model Availability​

Chat UI Models​

API-Only Models​

Decommissioned Models​

Related Documentation​

FAQ​

Complete Model Reference

Swiss LLM (AI Act Compliant)

Apertus Swiss LLM - Large

Apertus Swiss LLM - Large (v1.5)

Reasoning & Problem-Solving

Chat & Document Analysis & Reasoning - Large

Fast Reasoning & Instruction Following - Small

Reasoning & Problem Solving - Small

Reasoning & Agent tasks - Large

Reasoning & Problem Solving - Medium

Reasoning & Problem Solving - Xtra Large

Reasoning & Tool Use - Large (GLM-4.5 Air)

Chat, Document Analysis & Agent tasks - Xtra Large

Chat, Document Analysis, Coding & Reasoning - Xtra Large

Chat, Vision, Document Analysis & Reasoning - Medium

Reasoning & Tool Use - Xtra Large (GLM-5.2)

Vision & Document Analysis

Document Analysis - Small

Document Analysis - Xtra Small

Llama 4 Maverick multi modal - Small

Document Analysis & OCR - Small (DeepSeek OCR)

inference-miner-u25

Multilingual

Chat, Multi-lingual, Coding & function calling - Small

Chat & General Purpose

Search, Chat & Analysis - Small

Complete Pricing Overview

Cost Estimation

Model Availability

Chat UI Models

API-Only Models

Decommissioned Models

Related Documentation

FAQ