104+ MODELS · 100% AUSTRALIAN COMPUTE
100+ Models. 100% Australian Compute.
Open-weight families are served on Australian GPU pools. The frontier provider catalogue is available through the same sovereign control plane, with non-sovereign providers tagged so you choose with eyes open.
100+ MODELS · ONE SOVEREIGN STACK
Every model. One Australian API.
Browse by capability, family, or use case. Sovereign-hosted weights run in Australian regions. Non-sovereign API providers are tagged so you choose with eyes open.
104+ models available
79 sovereign-hosted in AU
22 model families
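The sovereign / non-sovereign tagging lends itself to a simple client-side filter. What follows is a minimal sketch, assuming a hypothetical in-memory catalogue: the `Model` record and the `sovereign_only` helper are illustrative names, not part of any published API. The four sample rows are copied from cards on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    """One catalogue card (hypothetical record shape, for illustration only)."""
    name: str
    licence: str
    capabilities: tuple
    sovereign: bool  # True when the card is tagged "Sovereign · AU"


# Sample rows copied from the catalogue on this page.
CATALOGUE = [
    Model("Llama 3.3 70B Instruct", "Llama Community", ("Text generation",), True),
    Model("Qwen3-Coder 480B", "Apache 2.0", ("Code", "Text generation"), True),
    Model("Claude Haiku 4.5", "Proprietary", ("Text generation", "Vision"), False),
    Model("GPT-4o", "Proprietary", ("Text generation", "Vision"), False),
]


def sovereign_only(models, capability=None):
    """Return names of sovereign-hosted models, optionally limited to one capability."""
    return [
        m.name
        for m in models
        if m.sovereign and (capability is None or capability in m.capabilities)
    ]


print(sovereign_only(CATALOGUE))
print(sovereign_only(CATALOGUE, capability="Code"))
```

Keeping the sovereignty flag as an explicit boolean, rather than inferring it from the licence, mirrors the page: several proprietary-licence image models are still listed as sovereign-hosted.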
Text generation (74)
Llama 4 Maverick ★ Flagship Llama 4 MoE. 400B total / 17B active. Vision + 1M context.
400B / 17B active
1M
Llama Community
Text generation, Vision
Sovereign · AU
Llama 4 Scout 10M context window. Built for long-document RAG and summarisation.
109B / 17B active
10M
Llama Community
Text generation, Vision
Sovereign · AU
Llama 3.3 70B Instruct Strong all-rounder Llama 3.3 70B. Function calling, multilingual.
70B
128K
Llama Community
Text generation
Sovereign · AU
Llama 3.2 11B Vision Compact vision-capable Llama. Image + text understanding.
11B
128K
Llama Community
Text generation, Vision
Sovereign · AU
Llama 3.2 90B Vision Large-format vision Llama for high-fidelity image reasoning.
90B
128K
Llama Community
Text generation, Vision
Sovereign · AU
Llama 3.2 1B Instruct Edge-deployable Llama 3.2 for on-prem or constrained workloads.
1B
128K
Llama Community
Text generation
Sovereign · AU
Llama 3.2 3B Instruct 3B-parameter Llama for cost-efficient general inference.
3B
128K
Llama Community
Text generation
Sovereign · AU
Llama 3.1 8B Instruct Workhorse Llama 3.1 8B. Strong instruction-following.
8B
128K
Llama Community
Text generation
Sovereign · AU
Llama 3.1 405B Instruct Dense 405B Llama for heavyweight reasoning workloads.
405B
128K
Llama Community
Text generation, Reasoning
Sovereign · AU
Code Llama 70B Code Llama 70B. Fill-in-middle, repo-scale completion.
70B
100K
Llama Community
Code, Text generation
Sovereign · AU
Llama Guard 3 8B Content-safety classifier from Meta. Pre/post-filter for chat.
8B
128K
Llama Community
Safety / moderation, Text generation
Sovereign · AU
Qwen 3.5 397B ★ Flagship Qwen 3.5 MoE. Tops GPQA Diamond among open weights.
397B / 17B active
256K
Apache 2.0
Text generation, Reasoning
Sovereign · AU
Qwen 3.5 72B Instruct Dense 72B Qwen. Strong reasoning, multilingual.
72B
256K
Apache 2.0
Text generation, Reasoning
Sovereign · AU
Qwen 3.5 32B Instruct Mid-size Qwen 3.5 for balanced cost vs capability.
32B
128K
Apache 2.0
Text generation
Sovereign · AU
Qwen 3.5 14B Instruct Cost-efficient Qwen for high-throughput inference.
14B
128K
Apache 2.0
Text generation
Sovereign · AU
Qwen 3.5 7B Instruct Compact Qwen 7B. Fast, multilingual, easy to fine-tune.
7B
128K
Apache 2.0
Text generation
Sovereign · AU
Qwen3-Coder 480B ★ Best-in-class open coding model. Repo-scale agent workflows.
480B / 35B active
1M
Apache 2.0
Code, Text generation
Sovereign · AU
Qwen2.5-VL 72B Vision-language Qwen. Document, chart, screen understanding.
72B
128K
Apache 2.0
Vision, Text generation
Sovereign · AU
Qwen2.5-VL 7B Compact Qwen-VL for high-throughput vision pipelines.
7B
128K
Apache 2.0
Vision, Text generation
Sovereign · AU
QwQ 32B Reasoning Chain-of-thought reasoning Qwen. Strong on maths + logic.
32B
128K
Apache 2.0
Reasoning, Text generation
Sovereign · AU
Qwen2.5-Math 72B Math-specialist Qwen. Symbolic reasoning, proofs, MATH benchmark.
72B
128K
Apache 2.0
Reasoning, Text generation
Sovereign · AU
DeepSeek V4 Pro ★ Frontier open-weight. 80.6 SWE-Bench Verified, 90.1 GPQA Diamond.
1.6T / 49B active
1M
MIT
Text generation, Reasoning
Sovereign · AU
DeepSeek V4 Flash Low-latency DeepSeek V4. Most of the smarts at a fraction of the cost.
37B / 6B active
256K
MIT
Text generation, Code
Sovereign · AU
DeepSeek V3.5 Production workhorse before V4 Pro. Excellent code generation.
671B / 37B active
128K
MIT
Text generation, Code
Sovereign · AU
DeepSeek R1 ★ Open reasoning model. Visible chain-of-thought traces.
671B / 37B active
128K
MIT
Reasoning, Text generation
Sovereign · AU
DeepSeek R1 Distill Llama 70B R1 reasoning distilled into Llama 70B. Production-friendly latency.
70B
128K
MIT
Reasoning, Text generation
Sovereign · AU
DeepSeek R1 Distill Qwen 32B R1 reasoning distilled into Qwen 32B. Strong cost-performance.
32B
128K
MIT
Reasoning, Text generation
Sovereign · AU
DeepSeek-VL2 DeepSeek vision-language. OCR + chart + screen reasoning.
27B / 4.5B active
128K
MIT
Vision, Text generation
Sovereign · AU
Mistral Large 3 ★ Mistral's flagship open MoE. Strong multilingual, function-calling.
675B / 41B active
256K
Apache 2.0
Text generation, Reasoning
Sovereign · AU
Mistral Medium 3.5 Mid-tier Mistral. Strong cost-to-quality ratio for production.
70B
128K
Apache 2.0
Text generation
Sovereign · AU
Mistral Small 3.1 Small Mistral. Edge-deployable, multilingual, fast.
24B
128K
Apache 2.0
Text generation
Sovereign · AU
Mistral Nemo 12B Joint Mistral × NVIDIA. Strong multilingual, FP8-friendly.
12B
128K
Apache 2.0
Text generation
Sovereign · AU
Mixtral 8x22B Mixtral 8x22B MoE. Cost-efficient inference at scale.
141B / 39B active
64K
Apache 2.0
Text generation
Sovereign · AU
Mixtral 8x7B Original Mixtral MoE. Still strong for general inference.
47B / 13B active
32K
Apache 2.0
Text generation
Sovereign · AU
Pixtral 12B Mistral's open vision-language model. Document + UI reasoning.
12B
128K
Apache 2.0
Vision, Text generation
Sovereign · AU
Pixtral Large 124B Heavyweight Pixtral. State-of-the-art open multimodal.
124B
128K
Apache 2.0
Vision, Text generation
Sovereign · AU
Mistral Saba 24B Arabic-specialist Mistral. Strong MENA-region language performance.
24B
32K
Apache 2.0
Text generation
Sovereign · AU
Gemma 4 31B ★ Largest open Gemma 4. Strong reasoning, function-calling.
31B
256K
Gemma
Text generation, Reasoning
Sovereign · AU
Gemma 4 9B Mid-size Gemma 4. Best 9B-class open model on multilingual tasks.
9B
128K
Gemma
Text generation
Sovereign · AU
Gemma 4 2B Tiny Gemma 4. Edge / on-device deployments.
2B
128K
Gemma
Text generation
Sovereign · AU
Gemma 2 27B Stable Gemma 2 release. Still strong for general production.
27B
8K
Gemma
Text generation
Sovereign · AU
PaliGemma 2 28B Google's open VLM line. Document, screen, chart understanding.
28B
8K
Gemma
Vision, Text generation
Sovereign · AU
Gemini 2.5 Pro Gemini 2.5 Pro via Google API. 2M context, strong vision + reasoning.
n/a
2M
Proprietary
Text generation, Vision
Non-sovereign
Gemini 2.5 Flash Low-latency Gemini. Cost-efficient general inference.
n/a
1M
Proprietary
Text generation, Vision
Non-sovereign
Gemini 3.1 Pro Preview Google's next-gen Gemini in preview. Reasoning + visual leadership.
n/a
2M
Proprietary
Text generation, Vision
Non-sovereign
Claude Opus 4.6 ★ Anthropic's frontier model. Industry-leading on agentic tasks.
n/a
500K
Proprietary
Text generation, Reasoning
Non-sovereign
Claude Sonnet 4.6 Production workhorse Claude. Best quality-per-dollar for most workloads.
n/a
500K
Proprietary
Text generation, Reasoning
Non-sovereign
Claude Haiku 4.5 Fastest Claude. Sub-second latency, vision-capable.
n/a
200K
Proprietary
Text generation, Vision
Non-sovereign
Claude 3.7 Sonnet (Legacy) Previous-gen Claude. Kept available for pinned production lines.
n/a
200K
Proprietary
Text generation, Vision
Non-sovereign
GPT-5 ★ OpenAI's flagship. Top of leaderboards on most general benchmarks.
n/a
400K
Proprietary
Text generation, Reasoning
Non-sovereign
GPT-5 mini Compact GPT-5. Cost-efficient general inference.
n/a
400K
Proprietary
Text generation, Vision
Non-sovereign
GPT-4.1 GPT-4.1 with 1M context. Strong for long-document workflows.
n/a
1M
Proprietary
Text generation, Vision
Non-sovereign
GPT-4o Multimodal GPT-4o. Stable production option for chat + vision.
n/a
128K
Proprietary
Text generation, Vision
Non-sovereign
o4 Reasoning OpenAI's reasoning series. Visible chain-of-thought traces.
n/a
200K
Proprietary
Reasoning, Text generation
Non-sovereign
Kimi K2.6 ★ Kimi K2.6, 90.5% GPQA. Strong agentic and coding capabilities.
1T / 32B active
256K
Apache 2.0
Text generation, Reasoning
Sovereign · AU
Kimi K2 Previous Kimi flagship. Production-stable for coding agents.
1T / 32B active
200K
Apache 2.0
Text generation, Code
Sovereign · AU
Grok 4.1 xAI's frontier model. Real-time data, strong reasoning.
n/a
256K
Proprietary
Text generation, Reasoning
Non-sovereign
Grok 4 Grok 4 production model. Reasoning + tool use.
n/a
256K
Proprietary
Text generation, Reasoning
Non-sovereign
Grok 3 Grok 3 for cost-efficient inference.
n/a
128K
Proprietary
Text generation
Non-sovereign
GLM 5.1 GLM 5.1, strong open coding + agentic performance.
356B / 32B active
128K
Apache 2.0
Text generation, Code
Sovereign · AU
GLM 4.5 Production GLM 4.5. Strong multilingual, function-calling.
32B
128K
Apache 2.0
Text generation
Sovereign · AU
ChatGLM 4 9B ChatGLM 4 9B. Compact, multilingual, fast.
9B
128K
Apache 2.0
Text generation
Sovereign · AU
Yi-Large 405B Yi-Large flagship. Strong multilingual reasoning.
405B
200K
Apache 2.0
Text generation, Reasoning
Sovereign · AU
Yi 34B Chat Yi 34B for production. Strong cost-performance balance.
34B
200K
Apache 2.0
Text generation
Sovereign · AU
Yi-VL 34B Yi's vision-language model. Strong on chart + screen understanding.
34B
128K
Apache 2.0
Vision, Text generation
Sovereign · AU
Phi 4 14B Phi 4, small-model reasoning leader. MIT licence.
14B
128K
MIT
Text generation, Reasoning
Sovereign · AU
Phi 4 mini 3.8B Compact Phi for edge deployment. Strong instruction-following.
3.8B
128K
MIT
Text generation
Sovereign · AU
Phi 3.5 MoE Phi 3.5 MoE. Cost-efficient inference with strong quality.
42B / 6.6B active
128K
MIT
Text generation
Sovereign · AU
Phi 3.5 Vision Compact Phi vision model. Strong document + chart OCR.
4.2B
128K
MIT
Vision, Text generation
Sovereign · AU
Cohere Command A Cohere's flagship Command. Strong enterprise RAG + tool use.
111B
256K
Proprietary
Text generation, Code
Non-sovereign
Cohere Command R+ Command R+ for enterprise RAG. Inline citation generation.
104B
128K
Proprietary
Text generation
Non-sovereign
Cohere Command R Smaller Command R. Cost-efficient RAG-tuned inference.
35B
128K
Proprietary
Text generation
Non-sovereign
Sonar Pro Online Web-grounded Sonar. Real-time citations, fresh information retrieval.
n/a
200K
Proprietary
Text generation
Non-sovereign
Sonar Online Cost-efficient Sonar with live web context.
n/a
128K
Proprietary
Text generation
Non-sovereign
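Several cards above give mixture-of-experts sizes as "total / active" (for example 400B / 17B active). The share of weights used per token follows directly from those two numbers. A minimal sketch, assuming the size strings keep exactly the format shown on this page; `parse_size` and `active_fraction` are illustrative helper names.

```python
def parse_size(text):
    """Convert a size such as '17B', '1.6T' or '809M' to a parameter count."""
    units = {"M": 1e6, "B": 1e9, "T": 1e12}
    text = text.strip()
    return float(text[:-1]) * units[text[-1]]


def active_fraction(params):
    """Fraction of parameters active per token for a 'total / active' string.

    Dense models (no '/') use every parameter, so the fraction is 1.0.
    """
    if "/" not in params:
        return 1.0
    total, active = params.split("/")
    active = active.replace("active", "")
    return parse_size(active) / parse_size(total)


# Size strings copied from cards on this page.
for name, params in [
    ("Llama 4 Maverick", "400B / 17B active"),
    ("DeepSeek V4 Pro", "1.6T / 49B active"),
    ("Llama 3.1 405B Instruct", "405B"),
]:
    print(f"{name}: {active_fraction(params):.1%} of weights active per token")
```

The active count is a rough guide to per-token compute, while the total count still determines how much memory the weights occupy.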
Reasoning (25)
Llama 4 Maverick ★ Flagship Llama 4 MoE. 400B total / 17B active. Vision + 1M context.
400B / 17B active
1M
Llama Community
Text generation, Vision
Sovereign · AU
Llama 3.1 405B Instruct Dense 405B Llama for heavyweight reasoning workloads.
405B
128K
Llama Community
Text generation, Reasoning
Sovereign · AU
Qwen 3.5 397B ★ Flagship Qwen 3.5 MoE. Tops GPQA Diamond among open weights.
397B / 17B active
256K
Apache 2.0
Text generation, Reasoning
Sovereign · AU
Qwen 3.5 72B Instruct Dense 72B Qwen. Strong reasoning, multilingual.
72B
256K
Apache 2.0
Text generation, Reasoning
Sovereign · AU
QwQ 32B Reasoning Chain-of-thought reasoning Qwen. Strong on maths + logic.
32B
128K
Apache 2.0
Reasoning, Text generation
Sovereign · AU
Qwen2.5-Math 72B Math-specialist Qwen. Symbolic reasoning, proofs, MATH benchmark.
72B
128K
Apache 2.0
Reasoning, Text generation
Sovereign · AU
DeepSeek V4 Pro ★ Frontier open-weight. 80.6 SWE-Bench Verified, 90.1 GPQA Diamond.
1.6T / 49B active
1M
MIT
Text generation, Reasoning
Sovereign · AU
DeepSeek R1 ★ Open reasoning model. Visible chain-of-thought traces.
671B / 37B active
128K
MIT
Reasoning, Text generation
Sovereign · AU
DeepSeek R1 Distill Llama 70B R1 reasoning distilled into Llama 70B. Production-friendly latency.
70B
128K
MIT
Reasoning, Text generation
Sovereign · AU
DeepSeek R1 Distill Qwen 32B R1 reasoning distilled into Qwen 32B. Strong cost-performance.
32B
128K
MIT
Reasoning, Text generation
Sovereign · AU
DeepSeek-Math 7B Math-specialist DeepSeek. Compact, fast for symbolic work.
7B
32K
MIT
Reasoning
Sovereign · AU
Mistral Large 3 ★ Mistral's flagship open MoE. Strong multilingual, function-calling.
675B / 41B active
256K
Apache 2.0
Text generation, Reasoning
Sovereign · AU
Gemma 4 31B ★ Largest open Gemma 4. Strong reasoning, function-calling.
31B
256K
Gemma
Text generation, Reasoning
Sovereign · AU
Gemini 2.5 Pro Gemini 2.5 Pro via Google API. 2M context, strong vision + reasoning.
n/a
2M
Proprietary
Text generation, Vision
Non-sovereign
Gemini 3.1 Pro Preview Google's next-gen Gemini in preview. Reasoning + visual leadership.
n/a
2M
Proprietary
Text generation, Vision
Non-sovereign
Claude Opus 4.6 ★ Anthropic's frontier model. Industry-leading on agentic tasks.
n/a
500K
Proprietary
Text generation, Reasoning
Non-sovereign
Claude Sonnet 4.6 Production workhorse Claude. Best quality-per-dollar for most workloads.
n/a
500K
Proprietary
Text generation, Reasoning
Non-sovereign
GPT-5 ★ OpenAI's flagship. Top of leaderboards on most general benchmarks.
n/a
400K
Proprietary
Text generation, Reasoning
Non-sovereign
o4 Reasoning OpenAI's reasoning series. Visible chain-of-thought traces.
n/a
200K
Proprietary
Reasoning, Text generation
Non-sovereign
o4 mini Compact o4 reasoning. Faster, cheaper for routine reasoning workloads.
n/a
200K
Proprietary
Reasoning
Non-sovereign
Kimi K2.6 ★ Kimi K2.6, 90.5% GPQA. Strong agentic and coding capabilities.
1T / 32B active
256K
Apache 2.0
Text generation, Reasoning
Sovereign · AU
Grok 4.1 xAI's frontier model. Real-time data, strong reasoning.
n/a
256K
Proprietary
Text generation, Reasoning
Non-sovereign
Grok 4 Grok 4 production model. Reasoning + tool use.
n/a
256K
Proprietary
Text generation, Reasoning
Non-sovereign
Yi-Large 405B Yi-Large flagship. Strong multilingual reasoning.
405B
200K
Apache 2.0
Text generation, Reasoning
Sovereign · AU
Phi 4 14B Phi 4, small-model reasoning leader. MIT licence.
14B
128K
MIT
Text generation, Reasoning
Sovereign · AU
Code (19)
Code Llama 70B Code Llama 70B. Fill-in-middle, repo-scale completion.
70B
100K
Llama Community
Code, Text generation
Sovereign · AU
Qwen3-Coder 480B ★ Best-in-class open coding model. Repo-scale agent workflows.
480B / 35B active
1M
Apache 2.0
Code, Text generation
Sovereign · AU
Qwen3-Coder 30B Workhorse coding model for IDE-integrated assistants.
30B
256K
Apache 2.0
Code
Sovereign · AU
DeepSeek V4 Pro ★ Frontier open-weight. 80.6 SWE-Bench Verified, 90.1 GPQA Diamond.
1.6T / 49B active
1M
MIT
Text generation, Reasoning
Sovereign · AU
DeepSeek V4 Flash Low-latency DeepSeek V4. Most of the smarts at a fraction of the cost.
37B / 6B active
256K
MIT
Text generation, Code
Sovereign · AU
DeepSeek V3.5 Production workhorse before V4 Pro. Excellent code generation.
671B / 37B active
128K
MIT
Text generation, Code
Sovereign · AU
DeepSeek-Coder V3 Specialist coding DeepSeek. Repo-scale, fill-in-middle.
236B / 21B active
128K
MIT
Code
Sovereign · AU
Codestral 22B Mistral's coding specialist. Multi-language, fill-in-middle.
22B
32K
Apache 2.0
Code
Sovereign · AU
Codestral Mamba 7B Mamba-architecture coding model. Linear-time long-context inference.
7B
256K
Apache 2.0
Code
Sovereign · AU
CodeGemma 7B Code-specialist Gemma. Fast, fill-in-middle, IDE-friendly.
7B
8K
Gemma
Code
Sovereign · AU
Gemini 2.5 Pro Gemini 2.5 Pro via Google API. 2M context, strong vision + reasoning.
n/a
2M
Proprietary
Text generation, Vision
Non-sovereign
Claude Opus 4.6 ★ Anthropic's frontier model. Industry-leading on agentic tasks.
n/a
500K
Proprietary
Text generation, Reasoning
Non-sovereign
Claude Sonnet 4.6 Production workhorse Claude. Best quality-per-dollar for most workloads.
n/a
500K
Proprietary
Text generation, Reasoning
Non-sovereign
GPT-5 ★ OpenAI's flagship. Top of leaderboards on most general benchmarks.
n/a
400K
Proprietary
Text generation, Reasoning
Non-sovereign
GPT-4.1 GPT-4.1 with 1M context. Strong for long-document workflows.
n/a
1M
Proprietary
Text generation, Vision
Non-sovereign
Kimi K2.6 ★ Kimi K2.6, 90.5% GPQA. Strong agentic and coding capabilities.
1T / 32B active
256K
Apache 2.0
Text generation, Reasoning
Sovereign · AU
Kimi K2 Previous Kimi flagship. Production-stable for coding agents.
1T / 32B active
200K
Apache 2.0
Text generation, Code
Sovereign · AU
GLM 5.1 GLM 5.1, strong open coding + agentic performance.
356B / 32B active
128K
Apache 2.0
Text generation, Code
Sovereign · AU
Cohere Command A Cohere's flagship Command. Strong enterprise RAG + tool use.
111B
256K
Proprietary
Text generation, Code
Non-sovereign
Vision (24)
Llama 4 Maverick ★ Flagship Llama 4 MoE. 400B total / 17B active. Vision + 1M context.
400B / 17B active
1M
Llama Community
Text generation, Vision
Sovereign · AU
Llama 4 Scout 10M context window. Built for long-document RAG and summarisation.
109B / 17B active
10M
Llama Community
Text generation, Vision
Sovereign · AU
Llama 3.2 11B Vision Compact vision-capable Llama. Image + text understanding.
11B
128K
Llama Community
Text generation, Vision
Sovereign · AU
Llama 3.2 90B Vision Large-format vision Llama for high-fidelity image reasoning.
90B
128K
Llama Community
Text generation, Vision
Sovereign · AU
Qwen2.5-VL 72B Vision-language Qwen. Document, chart, screen understanding.
72B
128K
Apache 2.0
Vision, Text generation
Sovereign · AU
Qwen2.5-VL 7B Compact Qwen-VL for high-throughput vision pipelines.
7B
128K
Apache 2.0
Vision, Text generation
Sovereign · AU
DeepSeek-VL2 DeepSeek vision-language. OCR + chart + screen reasoning.
27B / 4.5B active
128K
MIT
Vision, Text generation
Sovereign · AU
Pixtral 12B Mistral's open vision-language model. Document + UI reasoning.
12B
128K
Apache 2.0
Vision, Text generation
Sovereign · AU
Pixtral Large 124B Heavyweight Pixtral. State-of-the-art open multimodal.
124B
128K
Apache 2.0
Vision, Text generation
Sovereign · AU
PaliGemma 2 28B Google's open VLM line. Document, screen, chart understanding.
28B
8K
Gemma
Vision, Text generation
Sovereign · AU
Gemini 2.5 Pro Gemini 2.5 Pro via Google API. 2M context, strong vision + reasoning.
n/a
2M
Proprietary
Text generation, Vision
Non-sovereign
Gemini 2.5 Flash Low-latency Gemini. Cost-efficient general inference.
n/a
1M
Proprietary
Text generation, Vision
Non-sovereign
Gemini 3.1 Pro Preview Google's next-gen Gemini in preview. Reasoning + visual leadership.
n/a
2M
Proprietary
Text generation, Vision
Non-sovereign
Claude Opus 4.6 ★ Anthropic's frontier model. Industry-leading on agentic tasks.
n/a
500K
Proprietary
Text generation, Reasoning
Non-sovereign
Claude Sonnet 4.6 Production workhorse Claude. Best quality-per-dollar for most workloads.
n/a
500K
Proprietary
Text generation, Reasoning
Non-sovereign
Claude Haiku 4.5 Fastest Claude. Sub-second latency, vision-capable.
n/a
200K
Proprietary
Text generation, Vision
Non-sovereign
Claude 3.7 Sonnet (Legacy) Previous-gen Claude. Kept available for pinned production lines.
n/a
200K
Proprietary
Text generation, Vision
Non-sovereign
GPT-5 ★ OpenAI's flagship. Top of leaderboards on most general benchmarks.
n/a
400K
Proprietary
Text generation, Reasoning
Non-sovereign
GPT-5 mini Compact GPT-5. Cost-efficient general inference.
n/a
400K
Proprietary
Text generation, Vision
Non-sovereign
GPT-4.1 GPT-4.1 with 1M context. Strong for long-document workflows.
n/a
1M
Proprietary
Text generation, Vision
Non-sovereign
GPT-4o Multimodal GPT-4o. Stable production option for chat + vision.
n/a
128K
Proprietary
Text generation, Vision
Non-sovereign
Grok 4.1 xAI's frontier model. Real-time data, strong reasoning.
n/a
256K
Proprietary
Text generation, Reasoning
Non-sovereign
Yi-VL 34B Yi's vision-language model. Strong on chart + screen understanding.
34B
128K
Apache 2.0
Vision, Text generation
Sovereign · AU
Phi 3.5 Vision Compact Phi vision model. Strong document + chart OCR.
4.2B
128K
MIT
Vision, Text generation
Sovereign · AU
Image generation (6)
Stable Diffusion 3.5 Large SD 3.5 Large. Photoreal output, strong prompt adherence.
8B
n/a
Proprietary
Image generation
Sovereign · AU
Stable Diffusion 3.5 Medium SD 3.5 Medium. Balanced quality vs throughput.
2.5B
n/a
Proprietary
Image generation
Sovereign · AU
SDXL 1.0 SDXL 1.0, stable, ecosystem-rich. ControlNet + LoRA support.
3.5B
n/a
Proprietary
Image generation
Sovereign · AU
FLUX.1 [pro] ★ FLUX.1 [pro], top-of-class image generation.
12B
n/a
Proprietary
Image generation
Sovereign · AU
FLUX.1 [dev] Open-weight FLUX dev. Non-commercial use and fine-tuning permitted.
12B
n/a
FLUX.1 [dev] Non-Commercial License
Image generation
Sovereign · AU
FLUX.1 [schnell] Fast 4-step FLUX. Real-time interactive generation.
12B
n/a
Apache 2.0
Image generation
Sovereign · AU
Video (1)
Stable Video Diffusion 1.1 Image-to-video diffusion. Short clip generation from a still.
1.5B
n/a
Proprietary
Video
Sovereign · AU
Speech-to-text (3)
Whisper Large v3 Whisper Large v3. Multilingual transcription, gold standard.
1.55B
n/a
MIT
Speech-to-text
Sovereign · AU
Whisper Large v3 Turbo Faster Whisper Large. ~5x throughput at near-identical accuracy.
809M
n/a
MIT
Speech-to-text
Sovereign · AU
Distil-Whisper Large v3 Distilled Whisper. 6x faster, English-only.
756M
n/a
MIT
Speech-to-text
Sovereign · AU
Text-to-speech (2)
ElevenLabs v3 ElevenLabs v3 voice synthesis. Multi-speaker, emotion-aware.
n/a
n/a
Proprietary
Text-to-speech
Non-sovereign
ElevenLabs Multilingual v2 29 languages, voice cloning. Production-stable.
n/a
n/a
Proprietary
Text-to-speech
Non-sovereign
Embeddings (8)
BGE Large EN v1.5 BGE Large, top open English embedding model.
335M
512
MIT
Embeddings
Sovereign · AU
BGE M3 Multilingual + multifunctional. Dense, sparse, multi-vector retrieval.
560M
8K
MIT
Embeddings
Sovereign · AU
Snowflake Arctic Embed L Snowflake Arctic Embed, strong on retrieval benchmarks.
335M
512
Apache 2.0
Embeddings
Sovereign · AU
Snowflake Arctic Embed M v2 Cost-efficient Snowflake embed. Long-context variant.
305M
8K
Apache 2.0
Embeddings
Sovereign · AU
Nomic Embed Text v1.5 Nomic Embed, open weights, Matryoshka representation.
137M
8K
Apache 2.0
Embeddings
Sovereign · AU
Nomic Embed Vision v1.5 Multimodal embeddings. Joint text + image space.
92M
n/a
Apache 2.0
Embeddings
Sovereign · AU
Jina Embeddings v3 Jina v3, multilingual, task-specific LoRA adapters.
570M
8K
Apache 2.0
Embeddings
Sovereign · AU
Cohere Embed v4 Cohere Embed v4. Multimodal, multilingual, image + text retrieval.
n/a
8K
Proprietary
Embeddings
Non-sovereign
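The Nomic Embed Text v1.5 card above mentions Matryoshka representation: embeddings trained so that a prefix of the vector is itself a usable, smaller embedding. A minimal sketch of the client-side step, truncate then re-normalise, is shown below. The 8-dimensional vectors are made-up data, and the supported cut-off sizes (and any extra preprocessing a given model documents) are not stated on this page.

```python
import math


def truncate_embedding(vector, dims):
    """Keep the first `dims` components and rescale to unit length."""
    if not 0 < dims <= len(vector):
        raise ValueError("dims must be between 1 and len(vector)")
    head = vector[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalise an all-zero prefix")
    return [x / norm for x in head]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


# Made-up 8-dimensional embeddings, for illustration only.
doc = [0.9, 0.3, 0.2, 0.1, 0.05, 0.02, 0.01, 0.01]
query = [0.8, 0.4, 0.1, 0.2, 0.01, 0.03, 0.02, 0.0]

full = cosine(doc, query)
short = cosine(truncate_embedding(doc, 4), truncate_embedding(query, 4))
print(f"full: {full:.3f}  truncated to 4 dims: {short:.3f}")
```

Shorter vectors cut index size and search cost roughly in proportion to the dimensions dropped, usually at some cost in retrieval quality.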
Rerank (3)
BGE Reranker v2 M3 Multilingual cross-encoder reranker. Drop-in RAG quality boost.
560M
8K
MIT
Rerank
Sovereign · AU
Jina Reranker v2 Jina cross-encoder reranker. Multilingual.
278M
8K
Apache 2.0
Rerank
Sovereign · AU
Cohere Rerank v3.5 Cohere Rerank, enterprise gold standard for RAG quality.
n/a
4K
Proprietary
Rerank
Non-sovereign
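The rerankers above sit between retrieval and generation in a RAG pipeline: a first-stage retriever returns candidates, and a rescoring pass reorders them before they reach the model. The sketch below shows that control flow only. `overlap_score` is a deliberately crude word-overlap stand-in for a real cross-encoder, not how any listed model scores pairs, and the candidate passages are made up.

```python
def overlap_score(query, passage):
    """Stand-in scorer: fraction of query words that appear in the passage."""
    q = set(query.lower().split())
    p = set(passage.lower().split())
    return len(q & p) / len(q) if q else 0.0


def rerank(query, candidates, score=overlap_score, top_k=2):
    """Rescore (query, passage) pairs and return the top_k passages, best first."""
    ranked = sorted(candidates, key=lambda passage: score(query, passage), reverse=True)
    return ranked[:top_k]


# Made-up first-stage retrieval results.
candidates = [
    "Context windows vary from 8K to 10M tokens.",
    "Sovereign models run in Australian regions.",
    "Non-sovereign API providers are tagged in the catalogue.",
]
print(rerank("which models run in australian regions", candidates, top_k=1))
```

Passing the scorer in as a parameter keeps the pipeline unchanged when the stand-in is swapped for a call to an actual reranking model.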
Safety / moderation (1)
Llama Guard 3 8B Content-safety classifier from Meta. Pre/post-filter for chat.
8B
128K
Llama Community
Safety / moderation, Text generation
Sovereign · AU
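The Llama Guard card describes a pre/post-filter for chat: screen the user message before the model sees it, then screen the reply before the user does. A minimal sketch of that wrapper follows. `generate` and `is_unsafe` are placeholder callables standing in for a chat model and a safety classifier, and the blocked-word check is a toy, not Llama Guard's actual behaviour.

```python
REFUSAL = "Sorry, I can't help with that."


def guarded_chat(user_message, generate, is_unsafe):
    """Run one chat turn with a safety check before and after generation."""
    if is_unsafe(user_message):  # pre-filter: block the prompt
        return REFUSAL
    reply = generate(user_message)
    if is_unsafe(reply):  # post-filter: block the completion
        return REFUSAL
    return reply


# Toy stand-ins so the sketch runs without any model.
def toy_is_unsafe(text):
    return "forbidden" in text.lower()


def toy_generate(prompt):
    return f"You said: {prompt}"


print(guarded_chat("hello", toy_generate, toy_is_unsafe))
print(guarded_chat("something forbidden", toy_generate, toy_is_unsafe))
```

Checking both directions matters because a harmless-looking prompt can still produce a reply the classifier would flag.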
Meta (11)
Llama 4 Maverick ★ Flagship Llama 4 MoE. 400B total / 17B active. Vision + 1M context.
400B / 17B active
1M
Llama Community
Text generation, Vision
Sovereign · AU
Llama 4 Scout 10M context window. Built for long-document RAG and summarisation.
109B / 17B active
10M
Llama Community
Text generation, Vision
Sovereign · AU
Llama 3.3 70B Instruct Strong all-rounder Llama 3.3 70B. Function calling, multilingual.
70B
128K
Llama Community
Text generation
Sovereign · AU
Llama 3.2 11B Vision Compact vision-capable Llama. Image + text understanding.
11B
128K
Llama Community
Text generation, Vision
Sovereign · AU
Llama 3.2 90B Vision Large-format vision Llama for high-fidelity image reasoning.
90B
128K
Llama Community
Text generation, Vision
Sovereign · AU
Llama 3.2 1B Instruct Edge-deployable Llama 3.2 for on-prem or constrained workloads.
1B
128K
Llama Community
Text generation
Sovereign · AU
Llama 3.2 3B Instruct 3B-parameter Llama for cost-efficient general inference.
3B
128K
Llama Community
Text generation
Sovereign · AU
Llama 3.1 8B Instruct Workhorse Llama 3.1 8B. Strong instruction-following.
8B
128K
Llama Community
Text generation
Sovereign · AU
Llama 3.1 405B Instruct Dense 405B Llama for heavyweight reasoning workloads.
405B
128K
Llama Community
Text generation, Reasoning
Sovereign · AU
Code Llama 70B Code Llama 70B. Fill-in-middle, repo-scale completion.
70B
100K
Llama Community
Code, Text generation
Sovereign · AU
Llama Guard 3 8B Content-safety classifier from Meta. Pre/post-filter for chat.
8B
128K
Llama Community
Safety / moderation, Text generation
Sovereign · AU
Qwen (11)
Qwen 3.5 397B ★ Flagship Qwen 3.5 MoE. Tops GPQA Diamond among open weights.
397B / 17B active
256K
Apache 2.0
Text generation, Reasoning
Sovereign · AU
Qwen 3.5 72B Instruct Dense 72B Qwen. Strong reasoning, multilingual.
72B
256K
Apache 2.0
Text generation, Reasoning
Sovereign · AU
Qwen 3.5 32B Instruct Mid-size Qwen 3.5 for balanced cost vs capability.
32B
128K
Apache 2.0
Text generation
Sovereign · AU
Qwen 3.5 14B Instruct Cost-efficient Qwen for high-throughput inference.
14B
128K
Apache 2.0
Text generation
Sovereign · AU
Qwen 3.5 7B Instruct Compact Qwen 7B. Fast, multilingual, easy to fine-tune.
7B
128K
Apache 2.0
Text generation
Sovereign · AU
Qwen3-Coder 480B ★ Best-in-class open coding model. Repo-scale agent workflows.
480B / 35B active
1M
Apache 2.0
Code, Text generation
Sovereign · AU
Qwen3-Coder 30B Workhorse coding model for IDE-integrated assistants.
30B
256K
Apache 2.0
Code
Sovereign · AU
Qwen2.5-VL 72B Vision-language Qwen. Document, chart, screen understanding.
72B
128K
Apache 2.0
Vision, Text generation
Sovereign · AU
Qwen2.5-VL 7B Compact Qwen-VL for high-throughput vision pipelines.
7B
128K
Apache 2.0
Vision, Text generation
Sovereign · AU
QwQ 32B Reasoning Chain-of-thought reasoning Qwen. Strong on maths + logic.
32B
128K
Apache 2.0
Reasoning, Text generation
Sovereign · AU
Qwen2.5-Math 72B Math-specialist Qwen. Symbolic reasoning, proofs, MATH benchmark.
72B
128K
Apache 2.0
Reasoning, Text generation
Sovereign · AU
DeepSeek (9)
DeepSeek V4 Pro ★ Frontier open-weight. 80.6 SWE-Bench Verified, 90.1 GPQA Diamond.
1.6T / 49B active
1M
MIT
Text generation, Reasoning
Sovereign · AU
DeepSeek V4 Flash Low-latency DeepSeek V4. Most of the smarts at a fraction of the cost.
37B / 6B active
256K
MIT
Text generation, Code
Sovereign · AU
DeepSeek V3.5 Production workhorse before V4 Pro. Excellent code generation.
671B / 37B active
128K
MIT
Text generation, Code
Sovereign · AU
DeepSeek R1 ★ Open reasoning model. Visible chain-of-thought traces.
671B / 37B active
128K
MIT
Reasoning, Text generation
Sovereign · AU
DeepSeek R1 Distill Llama 70B R1 reasoning distilled into Llama 70B. Production-friendly latency.
70B
128K
MIT
Reasoning, Text generation
Sovereign · AU
DeepSeek R1 Distill Qwen 32B R1 reasoning distilled into Qwen 32B. Strong cost-performance.
32B
128K
MIT
Reasoning, Text generation
Sovereign · AU
DeepSeek-Coder V3 Specialist coding DeepSeek. Repo-scale, fill-in-middle.
236B / 21B active
128K
MIT
Code
Sovereign · AU
DeepSeek-Math 7B Math-specialist DeepSeek. Compact, fast for symbolic work.
7B
32K
MIT
Reasoning
Sovereign · AU
DeepSeek-VL2 DeepSeek vision-language. OCR + chart + screen reasoning.
27B / 4.5B active
128K
MIT
Vision, Text generation
Sovereign · AU
Mistral (11)
Mistral Large 3 ★ Mistral's flagship open MoE. Strong multilingual, function-calling.
675B / 41B active
256K
Apache 2.0
Text generation, Reasoning
Sovereign · AU
Mistral Medium 3.5 Mid-tier Mistral. Strong cost-to-quality ratio for production.
70B
128K
Apache 2.0
Text generation
Sovereign · AU
Mistral Small 3.1 Small Mistral. Edge-deployable, multilingual, fast.
24B
128K
Apache 2.0
Text generation
Sovereign · AU
Mistral Nemo 12B Joint Mistral × NVIDIA. Strong multilingual, FP8-friendly.
12B
128K
Apache 2.0
Text generation
Sovereign · AU
Mixtral 8x22B Mixtral 8x22B MoE. Cost-efficient inference at scale.
141B / 39B active
64K
Apache 2.0
Text generation
Sovereign · AU
Mixtral 8x7B Original Mixtral MoE. Still strong for general inference.
47B / 13B active
32K
Apache 2.0
Text generation
Sovereign · AU
Codestral 22B Mistral's coding specialist. Multi-language, fill-in-middle.
22B
32K
Apache 2.0
Code
Sovereign · AU
Codestral Mamba 7B Mamba-architecture coding model. Linear-time long-context inference.
7B
256K
Apache 2.0
Code
Sovereign · AU
Pixtral 12B Mistral's open vision-language model. Document + UI reasoning.
12B
128K
Apache 2.0
Vision, Text generation
Sovereign · AU
Pixtral Large 124B Heavyweight Pixtral. State-of-the-art open multimodal.
124B
128K
Apache 2.0
Vision, Text generation
Sovereign · AU
Mistral Saba 24B Arabic-specialist Mistral. Strong MENA-region language performance.
24B
32K
Apache 2.0
Text generation
Sovereign · AU
Google (9)
Gemma 4 31B ★ Largest open Gemma 4. Strong reasoning, function-calling.
31B
256K
Gemma
Text generation, Reasoning
Sovereign · AU
Gemma 4 9B Mid-size Gemma 4. Best 9B-class open model on multilingual tasks.
9B
128K
Gemma
Text generation
Sovereign · AU
Gemma 4 2B Tiny Gemma 4. Edge / on-device deployments.
2B
128K
Gemma
Text generation
Sovereign · AU
Gemma 2 27B Stable Gemma 2 release. Still strong for general production.
27B
8K
Gemma
Text generation
Sovereign · AU
CodeGemma 7B Code-specialist Gemma. Fast, fill-in-middle, IDE-friendly.
7B
8K
Gemma
Code
Sovereign · AU
PaliGemma 2 28B Google's open VLM line. Document, screen, chart understanding.
28B
8K
Gemma
Vision, Text generation
Sovereign · AU
Gemini 2.5 Pro Gemini 2.5 Pro via Google API. 2M context, strong vision + reasoning.
n/a
2M
Proprietary
Text generation, Vision
Non-sovereign
Gemini 2.5 Flash Low-latency Gemini. Cost-efficient general inference.
n/a
1M
Proprietary
Text generation, Vision
Non-sovereign
Gemini 3.1 Pro Preview Google's next-gen Gemini in preview. Reasoning + visual leadership.
n/a
2M
Proprietary
Text generation, Vision
Non-sovereign
Anthropic (4)
Claude Opus 4.6 ★ Anthropic's frontier model. Industry-leading on agentic tasks.
n/a
500K
Proprietary
Text generation, Reasoning
Non-sovereign
Claude Sonnet 4.6 Production workhorse Claude. Best quality-per-dollar for most workloads.
n/a
500K
Proprietary
Text generation, Reasoning
Non-sovereign
Claude Haiku 4.5 Fastest Claude. Sub-second latency, vision-capable.
n/a
200K
Proprietary
Text generation, Vision
Non-sovereign
Claude 3.7 Sonnet (Legacy) Previous-gen Claude. Kept available for pinned production lines.
n/a
200K
Proprietary
Text generation, Vision
Non-sovereign
OpenAI (6)
GPT-5 ★ OpenAI's flagship. Top of leaderboards on most general benchmarks.
n/a
400K
Proprietary
Text generation, Reasoning
Non-sovereign
GPT-5 mini Compact GPT-5. Cost-efficient general inference.
n/a
400K
Proprietary
Text generation, Vision
Non-sovereign
GPT-4.1 GPT-4.1 with 1M context. Strong for long-document workflows.
n/a
1M
Proprietary
Text generation, Vision
Non-sovereign
GPT-4o Multimodal GPT-4o. Stable production option for chat + vision.
n/a
128K
Proprietary
Text generation, Vision
Non-sovereign
o4 Reasoning OpenAI's reasoning series. Visible chain-of-thought traces.
n/a
200K
Proprietary
Reasoning, Text generation
Non-sovereign
o4 mini Compact o4 reasoning. Faster, cheaper for routine reasoning workloads.
n/a
200K
Proprietary
Reasoning
Non-sovereign
Kimi (2)
Kimi K2.6 ★ Kimi K2.6, 90.5% GPQA. Strong agentic and coding capabilities.
1T / 32B active
256K
Apache 2.0
Text generation, Reasoning
Sovereign · AU
Kimi K2 Previous Kimi flagship. Production-stable for coding agents.
1T / 32B active
200K
Apache 2.0
Text generation, Code
Sovereign · AU
xAI (3)
Grok 4.1 xAI's frontier model. Real-time data, strong reasoning.
n/a
256K
Proprietary
Text generation, Reasoning
Non-sovereign
Grok 4 Grok 4 production model. Reasoning + tool use.
n/a
256K
Proprietary
Text generation, Reasoning
Non-sovereign
Grok 3 Grok 3 for cost-efficient inference.
n/a
128K
Proprietary
Text generation
Non-sovereign
Zhipu (3)
GLM 5.1 GLM 5.1, strong open coding + agentic performance.
356B / 32B active
128K
Apache 2.0
Text generation, Code
Sovereign · AU
GLM 4.5 Production GLM 4.5. Strong multilingual, function-calling.
32B
128K
Apache 2.0
Text generation
Sovereign · AU
ChatGLM 4 9B ChatGLM 4 9B. Compact, multilingual, fast.
9B
128K
Apache 2.0
Text generation
Sovereign · AU
Yi (3)
Yi-Large 405B Yi-Large flagship. Strong multilingual reasoning.
405B
200K
Apache 2.0
Text generation, Reasoning
Sovereign · AU
Yi 34B Chat Yi 34B for production. Strong cost-performance balance.
34B
200K
Apache 2.0
Text generation
Sovereign · AU
Yi-VL 34B Yi's vision-language model. Strong on chart + screen understanding.
34B
128K
Apache 2.0
VisionText generation
Sovereign · AU
Microsoft · 4 models
Phi 4 14B Phi 4, small-model reasoning leader. MIT licence.
14B
128K
MIT
Text generationReasoning
Sovereign · AU
Phi 4 mini 3.8B Compact Phi for edge deployment. Strong instruction-following.
3.8B
128K
MIT
Text generation
Sovereign · AU
Phi 3.5 MoE Phi 3.5 MoE. Cost-efficient inference with strong quality.
42B / 6.6B active
128K
MIT
Text generation
Sovereign · AU
Phi 3.5 Vision Compact Phi vision model. Strong document + chart OCR.
4.2B
128K
MIT
VisionText generation
Sovereign · AU
Cohere · 5 models
Cohere Command A Cohere's flagship Command. Strong enterprise RAG + tool use.
111B
256K
Proprietary
Text generationCode
Non-sovereign
Cohere Command R+ Command R+ for enterprise RAG. Inline citation generation.
104B
128K
Proprietary
Text generation
Non-sovereign
Cohere Command R Smaller Command R. Cost-efficient RAG-tuned inference.
35B
128K
Proprietary
Text generation
Non-sovereign
Cohere Embed v4 Cohere Embed v4. Multimodal, multilingual, image + text retrieval.
,
8K
Proprietary
Embeddings
Non-sovereign
Cohere Rerank v3.5 Cohere Rerank, enterprise gold standard for RAG quality.
,
4K
Proprietary
Rerank
Non-sovereign
Stability AI · 4 models
Stable Diffusion 3.5 Large SD 3.5 Large. Photoreal output, strong prompt adherence.
8B
,
Proprietary
Image generation
Sovereign · AU
Stable Diffusion 3.5 Medium SD 3.5 Medium. Balanced quality vs throughput.
2.5B
,
Proprietary
Image generation
Sovereign · AU
Stable Video Diffusion 1.1 Image-to-video diffusion. Short clip generation from a still.
1.5B
,
Proprietary
Video
Sovereign · AU
SDXL 1.0 SDXL 1.0, stable, ecosystem-rich. ControlNet + LoRA support.
3.5B
,
Proprietary
Image generation
Sovereign · AU
Black Forest Labs · 3 models
FLUX.1 [pro] ★ FLUX.1 [pro], top-of-class image generation.
12B
,
Proprietary
Image generation
Sovereign · AU
FLUX.1 [dev] Open FLUX dev. Non-commercial fine-tuning permitted.
12B
,
FLUX.1 [dev] Non-Commercial License
Image generation
Sovereign · AU
FLUX.1 [schnell] Fast 4-step FLUX. Real-time interactive generation.
12B
,
Apache 2.0
Image generation
Sovereign · AU
Whisper · 3 models
Whisper Large v3 Whisper Large v3. Multilingual transcription, gold standard.
1.55B
,
MIT
Speech-to-text
Sovereign · AU
Whisper Large v3 Turbo Faster Whisper Large. ~5x throughput at near-identical accuracy.
809M
,
MIT
Speech-to-text
Sovereign · AU
Distil-Whisper Large v3 Distilled Whisper. 6x faster, English-only.
756M
,
MIT
Speech-to-text
Sovereign · AU
ElevenLabs · 2 models
ElevenLabs v3 ElevenLabs v3 voice synthesis. Multi-speaker, emotion-aware.
,
,
Proprietary
Text-to-speech
Non-sovereign
ElevenLabs Multilingual v2 29 languages, voice cloning. Production-stable.
,
,
Proprietary
Text-to-speech
Non-sovereign
BGE · 3 models
BGE Large EN v1.5 BGE Large, top open English embedding model.
335M
512
MIT
Embeddings
Sovereign · AU
BGE M3 Multilingual + multifunctional. Dense, sparse, multi-vector retrieval.
560M
8K
MIT
Embeddings
Sovereign · AU
BGE Reranker v2 M3 Multilingual cross-encoder reranker. Drop-in RAG quality boost.
560M
8K
MIT
Rerank
Sovereign · AU
Snowflake · 2 models
Snowflake Arctic Embed L Snowflake Arctic Embed, strong on retrieval benchmarks.
335M
512
Apache 2.0
Embeddings
Sovereign · AU
Snowflake Arctic Embed M v2 Cost-efficient Snowflake embed. Long-context variant.
305M
8K
Apache 2.0
Embeddings
Sovereign · AU
Nomic · 2 models
Nomic Embed Text v1.5 Nomic Embed, open weights, Matryoshka representation.
137M
8K
Apache 2.0
Embeddings
Sovereign · AU
Nomic Embed Vision v1.5 Multimodal embeddings. Joint text + image space.
92M
,
Apache 2.0
Embeddings
Sovereign · AU
Jina · 2 models
Jina Embeddings v3 Jina v3, multilingual, task-specific LoRA adapters.
570M
8K
Apache 2.0
Embeddings
Sovereign · AU
Jina Reranker v2 Jina cross-encoder reranker. Multilingual.
278M
8K
Apache 2.0
Rerank
Sovereign · AU
Perplexity · 2 models
Sonar Pro Online Web-grounded Sonar. Real-time citations, fresh information retrieval.
,
200K
Proprietary
Text generation
Non-sovereign
Sonar Online Cost-efficient Sonar with live web context.
,
128K
Proprietary
Text generation
Non-sovereign
Chatbots & assistants · 41 models
Llama 4 Maverick ★ Flagship Llama 4 MoE. 400B total / 17B active. Vision + 1M context.
400B / 17B active
1M
Llama Community
Text generationVision
Sovereign · AU
Llama 4 Scout 10M context window. Built for long-document RAG and summarisation.
109B / 17B active
10M
Llama Community
Text generationVision
Sovereign · AU
Llama 3.3 70B Instruct Strong all-rounder Llama 3.3 70B. Function calling, multilingual.
70B
128K
Llama Community
Text generation
Sovereign · AU
Llama 3.2 1B Instruct Edge-deployable Llama 3.2 for on-prem or constrained workloads.
1B
128K
Llama Community
Text generation
Sovereign · AU
Llama 3.2 3B Instruct 3B-parameter Llama for cost-efficient general inference.
3B
128K
Llama Community
Text generation
Sovereign · AU
Llama 3.1 8B Instruct Workhorse Llama 3.1 8B. Strong instruction-following.
8B
128K
Llama Community
Text generation
Sovereign · AU
Qwen 3.5 32B Instruct Mid-size Qwen 3.5 for balanced cost vs capability.
32B
128K
Apache 2.0
Text generation
Sovereign · AU
Qwen 3.5 14B Instruct Cost-efficient Qwen for high-throughput inference.
14B
128K
Apache 2.0
Text generation
Sovereign · AU
Qwen 3.5 7B Instruct Compact Qwen 7B. Fast, multilingual, easy to fine-tune.
7B
128K
Apache 2.0
Text generation
Sovereign · AU
DeepSeek V4 Flash Low-latency DeepSeek V4. Most of the smarts at a fraction of the cost.
37B / 6B active
256K
MIT
Text generationCode
Sovereign · AU
Mistral Large 3 ★ Mistral's flagship open MoE. Strong multilingual, function-calling.
675B / 41B active
256K
Apache 2.0
Text generationReasoning
Sovereign · AU
Mistral Medium 3.5 Mid-tier Mistral. Strong cost-to-quality ratio for production.
70B
128K
Apache 2.0
Text generation
Sovereign · AU
Mistral Small 3.1 Small Mistral. Edge-deployable, multilingual, fast.
24B
128K
Apache 2.0
Text generation
Sovereign · AU
Mistral Nemo 12B Joint Mistral × NVIDIA. Strong multilingual, FP8-friendly.
12B
128K
Apache 2.0
Text generation
Sovereign · AU
Mixtral 8x22B Mixtral 8x22B MoE. Cost-efficient inference at scale.
141B / 39B active
64K
Apache 2.0
Text generation
Sovereign · AU
Mixtral 8x7B Original Mixtral MoE. Still strong for general inference.
47B / 13B active
32K
Apache 2.0
Text generation
Sovereign · AU
Mistral Saba 24B Arabic-specialist Mistral. Strong MENA-region language performance.
24B
32K
Apache 2.0
Text generation
Sovereign · AU
Gemma 4 31B ★ Largest open Gemma 4. Strong reasoning, function-calling.
31B
256K
Gemma
Text generationReasoning
Sovereign · AU
Gemma 4 9B Mid-size Gemma 4. Best 9B-class open model on multilingual tasks.
9B
128K
Gemma
Text generation
Sovereign · AU
Gemma 4 2B Tiny Gemma 4. Edge / on-device deployments.
2B
128K
Gemma
Text generation
Sovereign · AU
Gemma 2 27B Stable Gemma 2 release. Still strong for general production.
27B
8K
Gemma
Text generation
Sovereign · AU
Gemini 2.5 Flash Low-latency Gemini. Cost-efficient general inference.
,
1M
Proprietary
Text generationVision
Non-sovereign
Claude Sonnet 4.6 Production workhorse Claude. Best quality-per-dollar for most workloads.
,
500K
Proprietary
Text generationReasoning
Non-sovereign
Claude Haiku 4.5 Fastest Claude. Sub-second latency, vision-capable.
,
200K
Proprietary
Text generationVision
Non-sovereign
Claude 3.7 Sonnet (Legacy) Previous-gen Claude. Kept available for pinned production lines.
,
200K
Proprietary
Text generationVision
Non-sovereign
GPT-5 mini Compact GPT-5. Cost-efficient general inference.
,
400K
Proprietary
Text generationVision
Non-sovereign
GPT-4.1 GPT-4.1 with 1M context. Strong for long-document workflows.
,
1M
Proprietary
Text generationVision
Non-sovereign
GPT-4o Multimodal GPT-4o. Stable production option for chat + vision.
,
128K
Proprietary
Text generationVision
Non-sovereign
Grok 4.1 xAI's frontier model. Real-time data, strong reasoning.
,
256K
Proprietary
Text generationReasoning
Non-sovereign
Grok 4 Grok 4 production model. Reasoning + tool use.
,
256K
Proprietary
Text generationReasoning
Non-sovereign
Grok 3 Grok 3 for cost-efficient inference.
,
128K
Proprietary
Text generation
Non-sovereign
GLM 4.5 Production GLM 4.5. Strong multilingual, function-calling.
32B
128K
Apache 2.0
Text generation
Sovereign · AU
ChatGLM 4 9B ChatGLM 4 9B. Compact, multilingual, fast.
9B
128K
Apache 2.0
Text generation
Sovereign · AU
Yi-Large 405B Yi-Large flagship. Strong multilingual reasoning.
405B
200K
Apache 2.0
Text generationReasoning
Sovereign · AU
Yi 34B Chat Yi 34B for production. Strong cost-performance balance.
34B
200K
Apache 2.0
Text generation
Sovereign · AU
Phi 4 14B Phi 4, small-model reasoning leader. MIT licence.
14B
128K
MIT
Text generationReasoning
Sovereign · AU
Phi 4 mini 3.8B Compact Phi for edge deployment. Strong instruction-following.
3.8B
128K
MIT
Text generation
Sovereign · AU
Phi 3.5 MoE Phi 3.5 MoE. Cost-efficient inference with strong quality.
42B / 6.6B active
128K
MIT
Text generation
Sovereign · AU
Cohere Command A Cohere's flagship Command. Strong enterprise RAG + tool use.
111B
256K
Proprietary
Text generationCode
Non-sovereign
Sonar Pro Online Web-grounded Sonar. Real-time citations, fresh information retrieval.
,
200K
Proprietary
Text generation
Non-sovereign
Sonar Online Cost-efficient Sonar with live web context.
,
128K
Proprietary
Text generation
Non-sovereign
Agents · 27 models
Llama 4 Maverick ★ Flagship Llama 4 MoE. 400B total / 17B active. Vision + 1M context.
400B / 17B active
1M
Llama Community
Text generationVision
Sovereign · AU
Llama 3.3 70B Instruct Strong all-rounder Llama 3.3 70B. Function calling, multilingual.
70B
128K
Llama Community
Text generation
Sovereign · AU
Llama 3.2 3B Instruct 3B-parameter Llama for cost-efficient general inference.
3B
128K
Llama Community
Text generation
Sovereign · AU
Llama 3.1 8B Instruct Workhorse Llama 3.1 8B. Strong instruction-following.
8B
128K
Llama Community
Text generation
Sovereign · AU
Llama 3.1 405B Instruct Dense 405B Llama for heavyweight reasoning workloads.
405B
128K
Llama Community
Text generationReasoning
Sovereign · AU
Qwen 3.5 397B ★ Flagship Qwen 3.5 MoE. Tops GPQA Diamond among open weights.
397B / 17B active
256K
Apache 2.0
Text generationReasoning
Sovereign · AU
Qwen 3.5 72B Instruct Dense 72B Qwen. Strong reasoning, multilingual.
72B
256K
Apache 2.0
Text generationReasoning
Sovereign · AU
Qwen 3.5 32B Instruct Mid-size Qwen 3.5 for balanced cost vs capability.
32B
128K
Apache 2.0
Text generation
Sovereign · AU
Qwen3-Coder 480B ★ Best-in-class open coding model. Repo-scale agent workflows.
480B / 35B active
1M
Apache 2.0
CodeText generation
Sovereign · AU
DeepSeek V4 Pro ★ Frontier open-weight. 80.6 SWE-Bench Verified, 90.1 GPQA Diamond.
1.6T / 49B active
1M
MIT
Text generationReasoning
Sovereign · AU
DeepSeek V4 Flash Low-latency DeepSeek V4. Most of the smarts at a fraction of the cost.
37B / 6B active
256K
MIT
Text generationCode
Sovereign · AU
DeepSeek V3.5 Production workhorse before V4 Pro. Excellent code generation.
671B / 37B active
128K
MIT
Text generationCode
Sovereign · AU
Mistral Large 3 ★ Mistral's flagship open MoE. Strong multilingual, function-calling.
675B / 41B active
256K
Apache 2.0
Text generationReasoning
Sovereign · AU
Mistral Medium 3.5 Mid-tier Mistral. Strong cost-to-quality ratio for production.
70B
128K
Apache 2.0
Text generation
Sovereign · AU
Mixtral 8x22B Mixtral 8x22B MoE. Cost-efficient inference at scale.
141B / 39B active
64K
Apache 2.0
Text generation
Sovereign · AU
Gemini 2.5 Pro Gemini 2.5 Pro via Google API. 2M context, strong vision + reasoning.
,
2M
Proprietary
Text generationVision
Non-sovereign
Gemini 2.5 Flash Low-latency Gemini. Cost-efficient general inference.
,
1M
Proprietary
Text generationVision
Non-sovereign
Gemini 3.1 Pro Preview Google's next-gen Gemini in preview. Reasoning + visual leadership.
,
2M
Proprietary
Text generationVision
Non-sovereign
Claude Opus 4.6 ★ Anthropic's frontier model. Industry-leading on agentic tasks.
,
500K
Proprietary
Text generationReasoning
Non-sovereign
Claude Sonnet 4.6 Production workhorse Claude. Best quality-per-dollar for most workloads.
,
500K
Proprietary
Text generationReasoning
Non-sovereign
GPT-5 ★ OpenAI's flagship. Top of leaderboards on most general benchmarks.
,
400K
Proprietary
Text generationReasoning
Non-sovereign
GPT-5 mini Compact GPT-5. Cost-efficient general inference.
,
400K
Proprietary
Text generationVision
Non-sovereign
Kimi K2.6 ★ Kimi K2.6, 90.5% GPQA. Strong agentic and coding capabilities.
1T / 32B active
256K
Apache 2.0
Text generationReasoning
Sovereign · AU
Kimi K2 Previous Kimi flagship. Production-stable for coding agents.
1T / 32B active
200K
Apache 2.0
Text generationCode
Sovereign · AU
GLM 5.1 GLM 5.1, strong open coding + agentic performance.
356B / 32B active
128K
Apache 2.0
Text generationCode
Sovereign · AU
Cohere Command A Cohere's flagship Command. Strong enterprise RAG + tool use.
111B
256K
Proprietary
Text generationCode
Non-sovereign
Cohere Command R+ Command R+ for enterprise RAG. Inline citation generation.
104B
128K
Proprietary
Text generation
Non-sovereign
Coding · 19 models
Code Llama 70B Code Llama 70B. Fill-in-middle, repo-scale completion.
70B
100K
Llama Community
CodeText generation
Sovereign · AU
Qwen 3.5 397B ★ Flagship Qwen 3.5 MoE. Tops GPQA Diamond among open weights.
397B / 17B active
256K
Apache 2.0
Text generationReasoning
Sovereign · AU
Qwen3-Coder 480B ★ Best-in-class open coding model. Repo-scale agent workflows.
480B / 35B active
1M
Apache 2.0
CodeText generation
Sovereign · AU
Qwen3-Coder 30B Workhorse coding model for IDE-integrated assistants.
30B
256K
Apache 2.0
Code
Sovereign · AU
DeepSeek V4 Pro ★ Frontier open-weight. 80.6 SWE-Bench Verified, 90.1 GPQA Diamond.
1.6T / 49B active
1M
MIT
Text generationReasoning
Sovereign · AU
DeepSeek V4 Flash Low-latency DeepSeek V4. Most of the smarts at a fraction of the cost.
37B / 6B active
256K
MIT
Text generationCode
Sovereign · AU
DeepSeek V3.5 Production workhorse before V4 Pro. Excellent code generation.
671B / 37B active
128K
MIT
Text generationCode
Sovereign · AU
DeepSeek-Coder V3 Specialist coding DeepSeek. Repo-scale, fill-in-middle.
236B / 21B active
128K
MIT
Code
Sovereign · AU
Codestral 22B Mistral's coding specialist. Multi-language, fill-in-middle.
22B
32K
Mistral Non-Production License
Code
Sovereign · AU
Codestral Mamba 7B Mamba-architecture coding model. Linear-time long-context inference.
7B
256K
Apache 2.0
Code
Sovereign · AU
CodeGemma 7B Code-specialist Gemma. Fast, fill-in-middle, IDE-friendly.
7B
8K
Gemma
Code
Sovereign · AU
Gemini 2.5 Pro Gemini 2.5 Pro via Google API. 2M context, strong vision + reasoning.
,
2M
Proprietary
Text generationVision
Non-sovereign
Claude Opus 4.6 ★ Anthropic's frontier model. Industry-leading on agentic tasks.
,
500K
Proprietary
Text generationReasoning
Non-sovereign
Claude Sonnet 4.6 Production workhorse Claude. Best quality-per-dollar for most workloads.
,
500K
Proprietary
Text generationReasoning
Non-sovereign
GPT-5 ★ OpenAI's flagship. Top of leaderboards on most general benchmarks.
,
400K
Proprietary
Text generationReasoning
Non-sovereign
GPT-4.1 GPT-4.1 with 1M context. Strong for long-document workflows.
,
1M
Proprietary
Text generationVision
Non-sovereign
Kimi K2.6 ★ Kimi K2.6, 90.5% GPQA. Strong agentic and coding capabilities.
1T / 32B active
256K
Apache 2.0
Text generationReasoning
Sovereign · AU
Kimi K2 Previous Kimi flagship. Production-stable for coding agents.
1T / 32B active
200K
Apache 2.0
Text generationCode
Sovereign · AU
GLM 5.1 GLM 5.1, strong open coding + agentic performance.
356B / 32B active
128K
Apache 2.0
Text generationCode
Sovereign · AU
Reasoning · 24 models
Llama 4 Maverick ★ Flagship Llama 4 MoE. 400B total / 17B active. Vision + 1M context.
400B / 17B active
1M
Llama Community
Text generationVision
Sovereign · AU
Llama 3.1 405B Instruct Dense 405B Llama for heavyweight reasoning workloads.
405B
128K
Llama Community
Text generationReasoning
Sovereign · AU
Qwen 3.5 397B ★ Flagship Qwen 3.5 MoE. Tops GPQA Diamond among open weights.
397B / 17B active
256K
Apache 2.0
Text generationReasoning
Sovereign · AU
Qwen 3.5 72B Instruct Dense 72B Qwen. Strong reasoning, multilingual.
72B
256K
Apache 2.0
Text generationReasoning
Sovereign · AU
QwQ 32B Reasoning Chain-of-thought reasoning Qwen. Strong on maths + logic.
32B
128K
Apache 2.0
ReasoningText generation
Sovereign · AU
Qwen2.5-Math 72B Math-specialist Qwen. Symbolic reasoning, proofs, MATH benchmark.
72B
128K
Apache 2.0
ReasoningText generation
Sovereign · AU
DeepSeek V4 Pro ★ Frontier open-weight. 80.6 SWE-Bench Verified, 90.1 GPQA Diamond.
1.6T / 49B active
1M
MIT
Text generationReasoning
Sovereign · AU
DeepSeek R1 ★ Open reasoning model. Visible chain-of-thought traces.
671B / 37B active
128K
MIT
ReasoningText generation
Sovereign · AU
DeepSeek R1 Distill Llama 70B R1 reasoning distilled into Llama 70B. Production-friendly latency.
70B
128K
MIT
ReasoningText generation
Sovereign · AU
DeepSeek R1 Distill Qwen 32B R1 reasoning distilled into Qwen 32B. Strong cost-performance.
32B
128K
MIT
ReasoningText generation
Sovereign · AU
DeepSeek-Math 7B Math-specialist DeepSeek. Compact, fast for symbolic work.
7B
32K
MIT
Reasoning
Sovereign · AU
Mistral Large 3 ★ Mistral's flagship open MoE. Strong multilingual, function-calling.
675B / 41B active
256K
Apache 2.0
Text generationReasoning
Sovereign · AU
Gemma 4 31B ★ Largest open Gemma 4. Strong reasoning, function-calling.
31B
256K
Gemma
Text generationReasoning
Sovereign · AU
Gemini 2.5 Pro Gemini 2.5 Pro via Google API. 2M context, strong vision + reasoning.
,
2M
Proprietary
Text generationVision
Non-sovereign
Gemini 3.1 Pro Preview Google's next-gen Gemini in preview. Reasoning + visual leadership.
,
2M
Proprietary
Text generationVision
Non-sovereign
Claude Opus 4.6 ★ Anthropic's frontier model. Industry-leading on agentic tasks.
,
500K
Proprietary
Text generationReasoning
Non-sovereign
GPT-5 ★ OpenAI's flagship. Top of leaderboards on most general benchmarks.
,
400K
Proprietary
Text generationReasoning
Non-sovereign
o4 Reasoning OpenAI's reasoning series. Visible chain-of-thought traces.
,
200K
Proprietary
ReasoningText generation
Non-sovereign
o4 mini Compact o4 reasoning. Faster, cheaper for routine reasoning workloads.
,
200K
Proprietary
Reasoning
Non-sovereign
Kimi K2.6 ★ Kimi K2.6, 90.5% GPQA. Strong agentic and coding capabilities.
1T / 32B active
256K
Apache 2.0
Text generationReasoning
Sovereign · AU
Grok 4.1 xAI's frontier model. Real-time data, strong reasoning.
,
256K
Proprietary
Text generationReasoning
Non-sovereign
Grok 4 Grok 4 production model. Reasoning + tool use.
,
256K
Proprietary
Text generationReasoning
Non-sovereign
Yi-Large 405B Yi-Large flagship. Strong multilingual reasoning.
405B
200K
Apache 2.0
Text generationReasoning
Sovereign · AU
Phi 4 14B Phi 4, small-model reasoning leader. MIT licence.
14B
128K
MIT
Text generationReasoning
Sovereign · AU
Document analysis · 12 models
Llama 4 Maverick ★ Flagship Llama 4 MoE. 400B total / 17B active. Vision + 1M context.
400B / 17B active
1M
Llama Community
Text generationVision
Sovereign · AU
Llama 4 Scout 10M context window. Built for long-document RAG and summarisation.
109B / 17B active
10M
Llama Community
Text generationVision
Sovereign · AU
Llama 3.2 11B Vision Compact vision-capable Llama. Image + text understanding.
11B
128K
Llama Community
Text generationVision
Sovereign · AU
Llama 3.2 90B Vision Large-format vision Llama for high-fidelity image reasoning.
90B
128K
Llama Community
Text generationVision
Sovereign · AU
Llama 3.1 405B Instruct Dense 405B Llama for heavyweight reasoning workloads.
405B
128K
Llama Community
Text generationReasoning
Sovereign · AU
Qwen2.5-VL 72B Vision-language Qwen. Document, chart, screen understanding.
72B
128K
Apache 2.0
VisionText generation
Sovereign · AU
DeepSeek-VL2 DeepSeek vision-language. OCR + chart + screen reasoning.
27B / 4.5B active
128K
MIT
VisionText generation
Sovereign · AU
Pixtral 12B Mistral's open vision-language model. Document + UI reasoning.
12B
128K
Apache 2.0
VisionText generation
Sovereign · AU
Pixtral Large 124B Heavyweight Pixtral. State-of-the-art open multimodal.
124B
128K
Apache 2.0
VisionText generation
Sovereign · AU
Gemini 2.5 Pro Gemini 2.5 Pro via Google API. 2M context, strong vision + reasoning.
,
2M
Proprietary
Text generationVision
Non-sovereign
Claude Opus 4.6 ★ Anthropic's frontier model. Industry-leading on agentic tasks.
,
500K
Proprietary
Text generationReasoning
Non-sovereign
GPT-4.1 GPT-4.1 with 1M context. Strong for long-document workflows.
,
1M
Proprietary
Text generationVision
Non-sovereign
RAG & retrieval · 16 models
Llama 4 Scout 10M context window. Built for long-document RAG and summarisation.
109B / 17B active
10M
Llama Community
Text generationVision
Sovereign · AU
Cohere Command R+ Command R+ for enterprise RAG. Inline citation generation.
104B
128K
Proprietary
Text generation
Non-sovereign
Cohere Command R Smaller Command R. Cost-efficient RAG-tuned inference.
35B
128K
Proprietary
Text generation
Non-sovereign
BGE Large EN v1.5 BGE Large, top open English embedding model.
335M
512
MIT
Embeddings
Sovereign · AU
BGE M3 Multilingual + multifunctional. Dense, sparse, multi-vector retrieval.
560M
8K
MIT
Embeddings
Sovereign · AU
BGE Reranker v2 M3 Multilingual cross-encoder reranker. Drop-in RAG quality boost.
560M
8K
MIT
Rerank
Sovereign · AU
Snowflake Arctic Embed L Snowflake Arctic Embed, strong on retrieval benchmarks.
335M
512
Apache 2.0
Embeddings
Sovereign · AU
Snowflake Arctic Embed M v2 Cost-efficient Snowflake embed. Long-context variant.
305M
8K
Apache 2.0
Embeddings
Sovereign · AU
Nomic Embed Text v1.5 Nomic Embed, open weights, Matryoshka representation.
137M
8K
Apache 2.0
Embeddings
Sovereign · AU
Nomic Embed Vision v1.5 Multimodal embeddings. Joint text + image space.
92M
,
Apache 2.0
Embeddings
Sovereign · AU
Jina Embeddings v3 Jina v3, multilingual, task-specific LoRA adapters.
570M
8K
Apache 2.0
Embeddings
Sovereign · AU
Jina Reranker v2 Jina cross-encoder reranker. Multilingual.
278M
8K
Apache 2.0
Rerank
Sovereign · AU
Cohere Embed v4 Cohere Embed v4. Multimodal, multilingual, image + text retrieval.
,
8K
Proprietary
Embeddings
Non-sovereign
Cohere Rerank v3.5 Cohere Rerank, enterprise gold standard for RAG quality.
,
4K
Proprietary
Rerank
Non-sovereign
Sonar Pro Online Web-grounded Sonar. Real-time citations, fresh information retrieval.
,
200K
Proprietary
Text generation
Non-sovereign
Sonar Online Cost-efficient Sonar with live web context.
,
128K
Proprietary
Text generation
Non-sovereign
Image work · 17 models
Llama 3.2 11B Vision Compact vision-capable Llama. Image + text understanding.
11B
128K
Llama Community
Text generationVision
Sovereign · AU
Llama 3.2 90B Vision Large-format vision Llama for high-fidelity image reasoning.
90B
128K
Llama Community
Text generationVision
Sovereign · AU
Qwen2.5-VL 72B Vision-language Qwen. Document, chart, screen understanding.
72B
128K
Apache 2.0
VisionText generation
Sovereign · AU
Qwen2.5-VL 7B Compact Qwen-VL for high-throughput vision pipelines.
7B
128K
Apache 2.0
VisionText generation
Sovereign · AU
DeepSeek-VL2 DeepSeek vision-language. OCR + chart + screen reasoning.
27B / 4.5B active
128K
MIT
VisionText generation
Sovereign · AU
Pixtral 12B Mistral's open vision-language model. Document + UI reasoning.
12B
128K
Apache 2.0
VisionText generation
Sovereign · AU
Pixtral Large 124B Heavyweight Pixtral. State-of-the-art open multimodal.
124B
128K
Apache 2.0
VisionText generation
Sovereign · AU
PaliGemma 2 28B Google's open VLM line. Document, screen, chart understanding.
28B
8K
Gemma
VisionText generation
Sovereign · AU
Yi-VL 34B Yi's vision-language model. Strong on chart + screen understanding.
34B
128K
Apache 2.0
VisionText generation
Sovereign · AU
Phi 3.5 Vision Compact Phi vision model. Strong document + chart OCR.
4.2B
128K
MIT
VisionText generation
Sovereign · AU
Stable Diffusion 3.5 Large SD 3.5 Large. Photoreal output, strong prompt adherence.
8B
,
Proprietary
Image generation
Sovereign · AU
Stable Diffusion 3.5 Medium SD 3.5 Medium. Balanced quality vs throughput.
2.5B
,
Proprietary
Image generation
Sovereign · AU
SDXL 1.0 SDXL 1.0, stable, ecosystem-rich. ControlNet + LoRA support.
3.5B
,
Proprietary
Image generation
Sovereign · AU
FLUX.1 [pro] ★ FLUX.1 [pro], top-of-class image generation.
12B
,
Proprietary
Image generation
Sovereign · AU
FLUX.1 [dev] Open FLUX dev. Non-commercial fine-tuning permitted.
12B
,
FLUX.1 [dev] Non-Commercial License
Image generation
Sovereign · AU
FLUX.1 [schnell] Fast 4-step FLUX. Real-time interactive generation.
12B
,
Apache 2.0
Image generation
Sovereign · AU
Nomic Embed Vision v1.5 Multimodal embeddings. Joint text + image space.
92M
,
Apache 2.0
Embeddings
Sovereign · AU
Video work · 1 model
Stable Video Diffusion 1.1 Image-to-video diffusion. Short clip generation from a still.
1.5B
,
Proprietary
Video
Sovereign · AU
Voice · 5 models
Whisper Large v3 Whisper Large v3. Multilingual transcription, gold standard.
1.55B
,
MIT
Speech-to-text
Sovereign · AU
Whisper Large v3 Turbo Faster Whisper Large. ~5x throughput at near-identical accuracy.
809M
,
MIT
Speech-to-text
Sovereign · AU
Distil-Whisper Large v3 Distilled Whisper. 6x faster, English-only.
756M
,
MIT
Speech-to-text
Sovereign · AU
ElevenLabs v3 ElevenLabs v3 voice synthesis. Multi-speaker, emotion-aware.
,
,
Proprietary
Text-to-speech
Non-sovereign
ElevenLabs Multilingual v2 29 languages, voice cloning. Production-stable.
,
,
Proprietary
Text-to-speech
Non-sovereign
Safety · 1 model
Llama Guard 3 8B Content-safety classifier from Meta. Pre/post-filter for chat.
8B
128K
Llama Community
Safety / moderationText generation
Sovereign · AU
Frequently asked questions
What does 'sovereign' mean on these model cards?
Sovereign models run on Amaze GPU pools in Sydney. Weights, prompts, responses and embeddings stay in Australia. Non-sovereign models are provider-hosted catalogue entries (for example Anthropic Claude, OpenAI GPT and Google Gemini) accessed through the Amaze control plane. They use the same authentication and the same billing entity as sovereign models, but inference traffic transits the provider's region. The Non-sovereign tag is there so you can opt in or out per workload.
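As a rough illustration of opting in or out per workload, the sketch below filters a small, hand-copied subset of the cards by their sovereignty tag. The `ModelCard` class, the `allowed_models` helper and the `CATALOGUE` list are hypothetical names made up for this example; they are not part of any Amaze API. Only the model names, licences and tags come from the cards on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelCard:
    """Hypothetical record for one catalogue card (not an Amaze API type)."""
    name: str
    licence: str
    sovereign: bool  # True for cards tagged "Sovereign · AU"


# Illustrative subset copied by hand from the cards on this page.
CATALOGUE = [
    ModelCard("Kimi K2.6", "Apache 2.0", True),
    ModelCard("Phi 4 14B", "MIT", True),
    ModelCard("GPT-5", "Proprietary", False),
    ModelCard("Grok 4.1", "Proprietary", False),
]


def allowed_models(cards, require_sovereign):
    """Return the model names a workload may use under its residency rule."""
    return [c.name for c in cards if c.sovereign or not require_sovereign]


# A workload that must keep prompts and responses in Australia.
print(allowed_models(CATALOGUE, require_sovereign=True))

# A workload that has opted in to provider-hosted models as well.
print(allowed_models(CATALOGUE, require_sovereign=False))
```

Keeping the residency rule as a per-workload flag, rather than a global setting, mirrors the idea that the tag is a choice made workload by workload.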
Do you actually have over 100 models?
Yes. As of today the catalogue lists 104 models. The split is roughly three-quarters sovereign open-weight (79 models: Llama, Qwen, DeepSeek, Mistral, Gemma, Kimi, Phi, Yi, GLM, Whisper, FLUX, Stable Diffusion, BGE / Jina / Nomic / Snowflake embeddings) and one-quarter non-sovereign provider-hosted models (Claude, GPT, Gemini, Grok, Perplexity Sonar, Cohere, ElevenLabs). The list updates as model providers ship.
How do I pick a model?
Three ways. By capability if you know the task: chat, code, reasoning, vision, embedding and so on. By family if you've already picked a provider, which is useful for migrations from another inference platform. By use case if you're scoping a workload such as 'chatbot', 'RAG' or 'voice agent'. The same model usually appears in multiple groupings.
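The three groupings can be pictured as three indexes over the same set of cards. The sketch below is a minimal illustration under that assumption. The `CARDS` tuples and the `group_by` helper are hypothetical and hand-built from a few cards on this page; they do not describe how the catalogue is implemented.

```python
from collections import defaultdict

# Hypothetical subset: (name, family, capability tags, use-case sections).
CARDS = [
    ("Kimi K2.6", "Kimi", {"Text generation", "Reasoning"},
     {"Agents", "Coding", "Reasoning"}),
    ("Phi 4 14B", "Microsoft", {"Text generation", "Reasoning"},
     {"Chatbots & assistants", "Reasoning"}),
    ("BGE M3", "BGE", {"Embeddings"}, {"RAG & retrieval"}),
]


def group_by(cards, key):
    """Index card names by 'family', 'capability' or 'use_case'."""
    groups = defaultdict(list)
    for name, family, capabilities, use_cases in cards:
        if key == "family":
            labels = [family]
        elif key == "capability":
            labels = sorted(capabilities)
        elif key == "use_case":
            labels = sorted(use_cases)
        else:
            raise ValueError(f"unknown grouping: {key}")
        for label in labels:
            groups[label].append(name)
    return dict(groups)


by_family = group_by(CARDS, "family")
by_capability = group_by(CARDS, "capability")
by_use_case = group_by(CARDS, "use_case")

# The same model shows up under several use cases.
print(by_use_case["Reasoning"])
print(by_use_case["Coding"])
```

Because a card carries several capability tags and use cases, one model naturally lands in more than one group, which is why the same entries repeat across sections of the page.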
Can I fine-tune any of these models?
All sovereign open-weight models can be fine-tuned on Amaze GPU pools. Non-sovereign frontier models follow the provider's fine-tuning policy. Talk to sales about specific weight access and deployment topology; many customers run their fine-tunes on dedicated AU GPU pools.
Bring your workload.
We'll find the model.
Talk to a solution architect about which models fit your latency, accuracy, sovereignty and cost envelope.