104+ MODELS · 100% AUSTRALIAN COMPUTE


Q: What does 'sovereign' mean on these model cards?

Sovereign models run on Amaze GPU pools in Sydney. Weights, prompts, responses and embeddings stay in Australia. Non-sovereign models are provider-hosted catalogue entries (Anthropic Claude, OpenAI GPT, Google Gemini) accessed through the Amaze control plane. They share the same authentication and the same billing entity, but inference traffic transits the provider's region. The Non-sovereign tag lets you opt in or out per workload.
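As a rough illustration of choosing per workload, the sketch below filters a handful of cards from this page by their sovereignty tag. The `ModelCard` type, the `allowed_models` helper and the two workload names are hypothetical and only model the tag itself; they are not part of any Amaze API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelCard:
    """Hypothetical record for one catalogue card."""
    name: str
    sovereign: bool  # True for "Sovereign · AU", False for "Non-sovereign"


# A few cards copied from this page, with their sovereignty tags.
CATALOGUE = [
    ModelCard("Llama 3.3 70B Instruct", sovereign=True),
    ModelCard("DeepSeek R1", sovereign=True),
    ModelCard("Claude Sonnet 4.6", sovereign=False),
    ModelCard("GPT-5", sovereign=False),
]


def allowed_models(catalogue, require_sovereign):
    """Return the cards a workload may use under its residency policy."""
    return [c for c in catalogue if c.sovereign or not require_sovereign]


# Hypothetical workloads: one must stay in Australia, one opts in to
# provider-hosted models as well.
patient_notes = allowed_models(CATALOGUE, require_sovereign=True)
marketing_copy = allowed_models(CATALOGUE, require_sovereign=False)

print([c.name for c in patient_notes])
print([c.name for c in marketing_copy])
```

A workload that requires sovereignty sees only the Sovereign · AU cards; one that does not sees the full list.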

Q: How do I pick a model?

Three ways. By capability if you know the task: chat, code, reasoning, vision, embedding and so on. By family if you've already picked a provider, which is useful for migrations from another inference platform. By use case if you're scoping a workload such as 'chatbot', 'RAG' or 'voice agent'. The same model usually appears in multiple groupings.
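To show why one model turns up in several groupings, here is a minimal sketch that indexes a few cards from this page by capability tag and by family. The tuple layout and the `build_index` helper are assumptions made for illustration, and use-case groupings are left out because the cards do not list them.

```python
from collections import defaultdict

# (name, family, capability tags) copied from cards on this page.
CARDS = [
    ("Qwen3-Coder 480B", "Qwen", ["Code", "Text generation"]),
    ("DeepSeek R1", "DeepSeek", ["Reasoning", "Text generation"]),
    ("Pixtral 12B", "Mistral", ["Vision", "Text generation"]),
    ("Claude Haiku 4.5", "Anthropic", ["Text generation", "Vision"]),
]


def build_index(cards):
    """Group model names by capability tag and by family."""
    by_capability = defaultdict(list)
    by_family = defaultdict(list)
    for name, family, capabilities in cards:
        by_family[family].append(name)
        # A card with two tags is listed under both capabilities.
        for capability in capabilities:
            by_capability[capability].append(name)
    return by_capability, by_family


by_capability, by_family = build_index(CARDS)
print(by_capability["Vision"])
print(by_family["Qwen"])
```

Every card here carries the Text generation tag, so all four appear under that capability as well as under their own specialist tag and family.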

Open-weight families served on Australian GPU pools. Frontier provider catalogue available through the same sovereign control plane. Tagged so you choose with eyes open.


100+ MODELS · ONE SOVEREIGN STACK

Every model. One Australian API.

Browse by capability, family, or use case. Sovereign-hosted weights run in Australian regions. Non-sovereign API providers are tagged so you choose with eyes open.

104+ models available

79 sovereign-hosted in AU

22 model families


Text generation

Llama 4 Maverick ★ Flagship Llama 4 MoE. 400B total / 17B active. Vision + 1M context.

400B / 17B active

Llama Community

Text generation · Vision

Sovereign · AU

Llama 4 Scout 10M context window. Built for long-document RAG and summarisation.

109B / 17B active

10M

Llama Community

Text generation · Vision

Sovereign · AU

Llama 3.3 70B Instruct Strong all-rounder Llama 3.3 70B. Function calling, multilingual.

70B

128K

Llama Community

Text generation

Sovereign · AU

Llama 3.2 11B Vision Compact vision-capable Llama. Image + text understanding.

11B

128K

Llama Community

Text generation · Vision

Sovereign · AU

Llama 3.2 90B Vision Large-format vision Llama for high-fidelity image reasoning.

90B

128K

Llama Community

Text generation · Vision

Sovereign · AU

Llama 3.2 1B Instruct Edge-deployable Llama 3.2 for on-prem or constrained workloads.

128K

Llama Community

Text generation

Sovereign · AU

Llama 3.2 3B Instruct 3B-parameter Llama for cost-efficient general inference.

128K

Llama Community

Text generation

Sovereign · AU

Llama 3.1 8B Instruct Workhorse Llama 3.1 8B. Strong instruction-following.

128K

Llama Community

Text generation

Sovereign · AU

Llama 3.1 405B Instruct Dense 405B Llama for heavyweight reasoning workloads.

405B

128K

Llama Community

Text generation · Reasoning

Sovereign · AU

Code Llama 70B Code Llama 70B. Fill-in-middle, repo-scale completion.

70B

100K

Llama Community

Code · Text generation

Sovereign · AU

Llama Guard 3 8B Content-safety classifier from Meta. Pre/post-filter for chat.

128K

Llama Community

Safety / moderation · Text generation

Sovereign · AU

Qwen 3.5 397B ★ Flagship Qwen 3.5 MoE. Tops GPQA Diamond among open weights.

397B / 17B active

256K

Apache 2.0

Text generation · Reasoning

Sovereign · AU

Qwen 3.5 72B Instruct Dense 72B Qwen. Strong reasoning, multilingual.

72B

256K

Apache 2.0

Text generation · Reasoning

Sovereign · AU

Qwen 3.5 32B Instruct Mid-size Qwen 3.5 for balanced cost vs capability.

32B

128K

Apache 2.0

Text generation

Sovereign · AU

Qwen 3.5 14B Instruct Cost-efficient Qwen for high-throughput inference.

14B

128K

Apache 2.0

Text generation

Sovereign · AU

Qwen 3.5 7B Instruct Compact Qwen 7B. Fast, multilingual, easy to fine-tune.

128K

Apache 2.0

Text generation

Sovereign · AU

Qwen3-Coder 480B ★ Best-in-class open coding model. Repo-scale agent workflows.

480B / 35B active

Apache 2.0

Code · Text generation

Sovereign · AU

Qwen2.5-VL 72B Vision-language Qwen. Document, chart, screen understanding.

72B

128K

Apache 2.0

Vision · Text generation

Sovereign · AU

Qwen2.5-VL 7B Compact Qwen-VL for high-throughput vision pipelines.

128K

Apache 2.0

Vision · Text generation

Sovereign · AU

QwQ 32B Reasoning Chain-of-thought reasoning Qwen. Strong on maths + logic.

32B

128K

Apache 2.0

Reasoning · Text generation

Sovereign · AU

Qwen2.5-Math 72B Math-specialist Qwen. Symbolic reasoning, proofs, MATH benchmark.

72B

128K

Apache 2.0

Reasoning · Text generation

Sovereign · AU

DeepSeek V4 Pro ★ Frontier open-weight. 80.6 SWE-Bench Verified, 90.1 GPQA Diamond.

1.6T / 49B active

MIT

Text generation · Reasoning

Sovereign · AU

DeepSeek V4 Flash Low-latency DeepSeek V4. Most of the smarts at a fraction of the cost.

37B / 6B active

256K

MIT

Text generation · Code

Sovereign · AU

DeepSeek V3.5 Production workhorse before V4 Pro. Excellent code generation.

671B / 37B active

128K

MIT

Text generation · Code

Sovereign · AU

DeepSeek R1 ★ Open reasoning model. Visible chain-of-thought traces.

671B / 37B active

128K

MIT

Reasoning · Text generation

Sovereign · AU

DeepSeek R1 Distill Llama 70B R1 reasoning distilled into Llama 70B. Production-friendly latency.

70B

128K

MIT

Reasoning · Text generation

Sovereign · AU

DeepSeek R1 Distill Qwen 32B R1 reasoning distilled into Qwen 32B. Strong cost-performance.

32B

128K

MIT

Reasoning · Text generation

Sovereign · AU

DeepSeek-VL2 DeepSeek vision-language. OCR + chart + screen reasoning.

27B / 4.5B active

128K

MIT

Vision · Text generation

Sovereign · AU

Mistral Large 3 ★ Mistral's flagship open MoE. Strong multilingual, function-calling.

675B / 41B active

256K

Apache 2.0

Text generation · Reasoning

Sovereign · AU

Mistral Medium 3.5 Mid-tier Mistral. Strong cost-to-quality ratio for production.

70B

128K

Apache 2.0

Text generation

Sovereign · AU

Mistral Small 3.1 Small Mistral. Edge-deployable, multilingual, fast.

24B

128K

Apache 2.0

Text generation

Sovereign · AU

Mistral Nemo 12B Joint Mistral × NVIDIA. Strong multilingual, FP8-friendly.

12B

128K

Apache 2.0

Text generation

Sovereign · AU

Mixtral 8x22B Mixtral 8x22B MoE. Cost-efficient inference at scale.

141B / 39B active

64K

Apache 2.0

Text generation

Sovereign · AU

Mixtral 8x7B Original Mixtral MoE. Still strong for general inference.

47B / 13B active

32K

Apache 2.0

Text generation

Sovereign · AU

Pixtral 12B Mistral's open vision-language model. Document + UI reasoning.

12B

128K

Apache 2.0

Vision · Text generation

Sovereign · AU

Pixtral Large 124B Heavyweight Pixtral. State-of-the-art open multimodal.

124B

128K

Apache 2.0

Vision · Text generation

Sovereign · AU

Mistral Saba 24B Arabic-specialist Mistral. Strong MENA-region language performance.

24B

32K

Apache 2.0

Text generation

Sovereign · AU

Gemma 4 31B ★ Largest open Gemma 4. Strong reasoning, function-calling.

31B

256K

Gemma

Text generation · Reasoning

Sovereign · AU

Gemma 4 9B Mid-size Gemma 4. Best 9B-class open model on multilingual tasks.

128K

Gemma

Text generation

Sovereign · AU

Gemma 4 2B Tiny Gemma 4. Edge / on-device deployments.

128K

Gemma

Text generation

Sovereign · AU

Gemma 2 27B Stable Gemma 2 release. Still strong for general production.

27B

Gemma

Text generation

Sovereign · AU

PaliGemma 2 28B Google's open VLM line. Document, screen, chart understanding.

28B

Gemma

Vision · Text generation

Sovereign · AU

Gemini 2.5 Pro Gemini 2.5 Pro via Google API. 2M context, strong vision + reasoning.

Proprietary

Text generation · Vision

Non-sovereign

Gemini 2.5 Flash Low-latency Gemini. Cost-efficient general inference.

Proprietary

Text generation · Vision

Non-sovereign

Gemini 3.1 Pro Preview Google's next-gen Gemini in preview. Reasoning + visual leadership.

Proprietary

Text generation · Vision

Non-sovereign

Claude Opus 4.6 ★ Anthropic's frontier model. Industry-leading on agentic tasks.

500K

Proprietary

Text generation · Reasoning

Non-sovereign

Claude Sonnet 4.6 Production workhorse Claude. Best quality-per-dollar for most workloads.

500K

Proprietary

Text generation · Reasoning

Non-sovereign

Claude Haiku 4.5 Fastest Claude. Sub-second latency, vision-capable.

200K

Proprietary

Text generation · Vision

Non-sovereign

Claude 3.7 Sonnet (Legacy) Previous-gen Claude. Kept available for pinned production lines.

200K

Proprietary

Text generation · Vision

Non-sovereign

GPT-5 ★ OpenAI's flagship. Top of leaderboards on most general benchmarks.

400K

Proprietary

Text generation · Reasoning

Non-sovereign

GPT-5 mini Compact GPT-5. Cost-efficient general inference.

400K

Proprietary

Text generation · Vision

Non-sovereign

GPT-4.1 GPT-4.1 with 1M context. Strong for long-document workflows.

Proprietary

Text generation · Vision

Non-sovereign

GPT-4o Multimodal GPT-4o. Stable production option for chat + vision.

128K

Proprietary

Text generation · Vision

Non-sovereign

o4 Reasoning OpenAI's reasoning series. Visible chain-of-thought traces.

200K

Proprietary

Reasoning · Text generation

Non-sovereign

Kimi K2.6 ★ Kimi K2.6, 90.5% GPQA. Strong agentic and coding capabilities.

1T / 32B active

256K

Apache 2.0

Text generation · Reasoning

Sovereign · AU

Kimi K2 Previous Kimi flagship. Production-stable for coding agents.

1T / 32B active

200K

Apache 2.0

Text generation · Code

Sovereign · AU

Grok 4.1 xAI's frontier model. Real-time data, strong reasoning.

256K

Proprietary

Text generation · Reasoning

Non-sovereign

Grok 4 Grok 4 production model. Reasoning + tool use.

256K

Proprietary

Text generation · Reasoning

Non-sovereign

Grok 3 Grok 3 for cost-efficient inference. Open-source weights available.

128K

Proprietary

Text generation

Non-sovereign

GLM 5.1 GLM 5.1, strong open coding + agentic performance.

356B / 32B active

128K

Apache 2.0

Text generation · Code

Sovereign · AU

GLM 4.5 Production GLM 4.5. Strong multilingual, function-calling.

32B

128K

Apache 2.0

Text generation

Sovereign · AU

ChatGLM 4 9B ChatGLM 4 9B. Compact, multilingual, fast.

128K

Apache 2.0

Text generation

Sovereign · AU

Yi-Large 405B Yi-Large flagship. Strong multilingual reasoning.

405B

200K

Apache 2.0

Text generation · Reasoning

Sovereign · AU

Yi 34B Chat Yi 34B for production. Strong cost-performance balance.

34B

200K

Apache 2.0

Text generation

Sovereign · AU

Yi-VL 34B Yi's vision-language model. Strong on chart + screen understanding.

34B

128K

Apache 2.0

Vision · Text generation

Sovereign · AU

Phi 4 14B Phi 4, small-model reasoning leader. MIT licence.

14B

128K

MIT

Text generation · Reasoning

Sovereign · AU

Phi 4 mini 3.8B Compact Phi for edge deployment. Strong instruction-following.

3.8B

128K

MIT

Text generation

Sovereign · AU

Phi 3.5 MoE Phi 3.5 MoE. Cost-efficient inference with strong quality.

42B / 6.6B active

128K

MIT

Text generation

Sovereign · AU

Phi 3.5 Vision Compact Phi vision model. Strong document + chart OCR.

4.2B

128K

MIT

Vision · Text generation

Sovereign · AU

Cohere Command A Cohere's flagship Command. Strong enterprise RAG + tool use.

111B

256K

Proprietary

Text generation · Code

Non-sovereign

Cohere Command R+ Command R+ for enterprise RAG. Inline citation generation.

104B

128K

Proprietary

Text generation

Non-sovereign

Cohere Command R Smaller Command R. Cost-efficient RAG-tuned inference.

35B

128K

Proprietary

Text generation

Non-sovereign

Sonar Pro Online Web-grounded Sonar. Real-time citations, fresh information retrieval.

200K

Proprietary

Text generation

Non-sovereign

Sonar Online Cost-efficient Sonar with live web context.

128K

Proprietary

Text generation

Non-sovereign

Reasoning

Llama 4 Maverick ★ Flagship Llama 4 MoE. 400B total / 17B active. Vision + 1M context.

400B / 17B active

Llama Community

Text generation · Vision

Sovereign · AU

Llama 3.1 405B Instruct Dense 405B Llama for heavyweight reasoning workloads.

405B

128K

Llama Community

Text generation · Reasoning

Sovereign · AU

Qwen 3.5 397B ★ Flagship Qwen 3.5 MoE. Tops GPQA Diamond among open weights.

397B / 17B active

256K

Apache 2.0

Text generation · Reasoning

Sovereign · AU

Qwen 3.5 72B Instruct Dense 72B Qwen. Strong reasoning, multilingual.

72B

256K

Apache 2.0

Text generation · Reasoning

Sovereign · AU

QwQ 32B Reasoning Chain-of-thought reasoning Qwen. Strong on maths + logic.

32B

128K

Apache 2.0

Reasoning · Text generation

Sovereign · AU

Qwen2.5-Math 72B Math-specialist Qwen. Symbolic reasoning, proofs, MATH benchmark.

72B

128K

Apache 2.0

Reasoning · Text generation

Sovereign · AU

DeepSeek V4 Pro ★ Frontier open-weight. 80.6 SWE-Bench Verified, 90.1 GPQA Diamond.

1.6T / 49B active

MIT

Text generation · Reasoning

Sovereign · AU

DeepSeek R1 ★ Open reasoning model. Visible chain-of-thought traces.

671B / 37B active

128K

MIT

Reasoning · Text generation

Sovereign · AU

DeepSeek R1 Distill Llama 70B R1 reasoning distilled into Llama 70B. Production-friendly latency.

70B

128K

MIT

Reasoning · Text generation

Sovereign · AU

DeepSeek R1 Distill Qwen 32B R1 reasoning distilled into Qwen 32B. Strong cost-performance.

32B

128K

MIT

Reasoning · Text generation

Sovereign · AU

DeepSeek-Math 7B Math-specialist DeepSeek. Compact, fast for symbolic work.

32K

MIT

Reasoning

Sovereign · AU

Mistral Large 3 ★ Mistral's flagship open MoE. Strong multilingual, function-calling.

675B / 41B active

256K

Apache 2.0

Text generation · Reasoning

Sovereign · AU

Gemma 4 31B ★ Largest open Gemma 4. Strong reasoning, function-calling.

31B

256K

Gemma

Text generation · Reasoning

Sovereign · AU

Gemini 2.5 Pro Gemini 2.5 Pro via Google API. 2M context, strong vision + reasoning.

Proprietary

Text generation · Vision

Non-sovereign

Gemini 3.1 Pro Preview Google's next-gen Gemini in preview. Reasoning + visual leadership.

Proprietary

Text generation · Vision

Non-sovereign

Claude Opus 4.6 ★ Anthropic's frontier model. Industry-leading on agentic tasks.

500K

Proprietary

Text generation · Reasoning

Non-sovereign

Claude Sonnet 4.6 Production workhorse Claude. Best quality-per-dollar for most workloads.

500K

Proprietary

Text generation · Reasoning

Non-sovereign

GPT-5 ★ OpenAI's flagship. Top of leaderboards on most general benchmarks.

400K

Proprietary

Text generation · Reasoning

Non-sovereign

o4 Reasoning OpenAI's reasoning series. Visible chain-of-thought traces.

200K

Proprietary

Reasoning · Text generation

Non-sovereign

o4 mini Compact o4 reasoning. Faster, cheaper for routine reasoning workloads.

200K

Proprietary

Reasoning

Non-sovereign

Kimi K2.6 ★ Kimi K2.6, 90.5% GPQA. Strong agentic and coding capabilities.

1T / 32B active

256K

Apache 2.0

Text generation · Reasoning

Sovereign · AU

Grok 4.1 xAI's frontier model. Real-time data, strong reasoning.

256K

Proprietary

Text generation · Reasoning

Non-sovereign

Grok 4 Grok 4 production model. Reasoning + tool use.

256K

Proprietary

Text generation · Reasoning

Non-sovereign

Yi-Large 405B Yi-Large flagship. Strong multilingual reasoning.

405B

200K

Apache 2.0

Text generation · Reasoning

Sovereign · AU

Phi 4 14B Phi 4, small-model reasoning leader. MIT licence.

14B

128K

MIT

Text generation · Reasoning

Sovereign · AU

Code

Code Llama 70B Code Llama 70B. Fill-in-middle, repo-scale completion.

70B

100K

Llama Community

Code · Text generation

Sovereign · AU

Qwen3-Coder 480B ★ Best-in-class open coding model. Repo-scale agent workflows.

480B / 35B active

Apache 2.0

Code · Text generation

Sovereign · AU

Qwen3-Coder 30B Workhorse coding model for IDE-integrated assistants.

30B

256K

Apache 2.0

Code

Sovereign · AU

DeepSeek V4 Pro ★ Frontier open-weight. 80.6 SWE-Bench Verified, 90.1 GPQA Diamond.

1.6T / 49B active

MIT

Text generation · Reasoning

Sovereign · AU

DeepSeek V4 Flash Low-latency DeepSeek V4. Most of the smarts at a fraction of the cost.

37B / 6B active

256K

MIT

Text generation · Code

Sovereign · AU

DeepSeek V3.5 Production workhorse before V4 Pro. Excellent code generation.

671B / 37B active

128K

MIT

Text generation · Code

Sovereign · AU

DeepSeek-Coder V3 Specialist coding DeepSeek. Repo-scale, fill-in-middle.

236B / 21B active

128K

MIT

Code

Sovereign · AU

Codestral 22B Mistral's coding specialist. Multi-language, fill-in-middle.

22B

32K

Apache 2.0

Code

Sovereign · AU

Codestral Mamba 7B Mamba-architecture coding model. Linear-time long-context inference.

256K

Apache 2.0

Code

Sovereign · AU

CodeGemma 7B Code-specialist Gemma. Fast, fill-in-middle, IDE-friendly.

Gemma

Code

Sovereign · AU

Gemini 2.5 Pro Gemini 2.5 Pro via Google API. 2M context, strong vision + reasoning.

Proprietary

Text generation · Vision

Non-sovereign

Claude Opus 4.6 ★ Anthropic's frontier model. Industry-leading on agentic tasks.

500K

Proprietary

Text generation · Reasoning

Non-sovereign

Claude Sonnet 4.6 Production workhorse Claude. Best quality-per-dollar for most workloads.

500K

Proprietary

Text generation · Reasoning

Non-sovereign

GPT-5 ★ OpenAI's flagship. Top of leaderboards on most general benchmarks.

400K

Proprietary

Text generation · Reasoning

Non-sovereign

GPT-4.1 GPT-4.1 with 1M context. Strong for long-document workflows.

Proprietary

Text generation · Vision

Non-sovereign

Kimi K2.6 ★ Kimi K2.6, 90.5% GPQA. Strong agentic and coding capabilities.

1T / 32B active

256K

Apache 2.0

Text generation · Reasoning

Sovereign · AU

Kimi K2 Previous Kimi flagship. Production-stable for coding agents.

1T / 32B active

200K

Apache 2.0

Text generation · Code

Sovereign · AU

GLM 5.1 GLM 5.1, strong open coding + agentic performance.

356B / 32B active

128K

Apache 2.0

Text generation · Code

Sovereign · AU

Cohere Command A Cohere's flagship Command. Strong enterprise RAG + tool use.

111B

256K

Proprietary

Text generation · Code

Non-sovereign

Vision

Llama 4 Maverick ★ Flagship Llama 4 MoE. 400B total / 17B active. Vision + 1M context.

400B / 17B active

Llama Community

Text generation · Vision

Sovereign · AU

Llama 4 Scout 10M context window. Built for long-document RAG and summarisation.

109B / 17B active

10M

Llama Community

Text generation · Vision

Sovereign · AU

Llama 3.2 11B Vision Compact vision-capable Llama. Image + text understanding.

11B

128K

Llama Community

Text generation · Vision

Sovereign · AU

Llama 3.2 90B Vision Large-format vision Llama for high-fidelity image reasoning.

90B

128K

Llama Community

Text generation · Vision

Sovereign · AU

Qwen2.5-VL 72B Vision-language Qwen. Document, chart, screen understanding.

72B

128K

Apache 2.0

Vision · Text generation

Sovereign · AU

Qwen2.5-VL 7B Compact Qwen-VL for high-throughput vision pipelines.

128K

Apache 2.0

Vision · Text generation

Sovereign · AU

DeepSeek-VL2 DeepSeek vision-language. OCR + chart + screen reasoning.

27B / 4.5B active

128K

MIT

Vision · Text generation

Sovereign · AU

Pixtral 12B Mistral's open vision-language model. Document + UI reasoning.

12B

128K

Apache 2.0

Vision · Text generation

Sovereign · AU

Pixtral Large 124B Heavyweight Pixtral. State-of-the-art open multimodal.

124B

128K

Apache 2.0

Vision · Text generation

Sovereign · AU

PaliGemma 2 28B Google's open VLM line. Document, screen, chart understanding.

28B

Gemma

Vision · Text generation

Sovereign · AU

Gemini 2.5 Pro Gemini 2.5 Pro via Google API. 2M context, strong vision + reasoning.

Proprietary

Text generation · Vision

Non-sovereign

Gemini 2.5 Flash Low-latency Gemini. Cost-efficient general inference.

Proprietary

Text generation · Vision

Non-sovereign

Gemini 3.1 Pro Preview Google's next-gen Gemini in preview. Reasoning + visual leadership.

Proprietary

Text generation · Vision

Non-sovereign

Claude Opus 4.6 ★ Anthropic's frontier model. Industry-leading on agentic tasks.

500K

Proprietary

Text generation · Reasoning

Non-sovereign

Claude Sonnet 4.6 Production workhorse Claude. Best quality-per-dollar for most workloads.

500K

Proprietary

Text generation · Reasoning

Non-sovereign

Claude Haiku 4.5 Fastest Claude. Sub-second latency, vision-capable.

200K

Proprietary

Text generation · Vision

Non-sovereign

Claude 3.7 Sonnet (Legacy) Previous-gen Claude. Kept available for pinned production lines.

200K

Proprietary

Text generation · Vision

Non-sovereign

GPT-5 ★ OpenAI's flagship. Top of leaderboards on most general benchmarks.

400K

Proprietary

Text generation · Reasoning

Non-sovereign

GPT-5 mini Compact GPT-5. Cost-efficient general inference.

400K

Proprietary

Text generation · Vision

Non-sovereign

GPT-4.1 GPT-4.1 with 1M context. Strong for long-document workflows.

Proprietary

Text generation · Vision

Non-sovereign

GPT-4o Multimodal GPT-4o. Stable production option for chat + vision.

128K

Proprietary

Text generation · Vision

Non-sovereign

Grok 4.1 xAI's frontier model. Real-time data, strong reasoning.

256K

Proprietary

Text generation · Reasoning

Non-sovereign

Yi-VL 34B Yi's vision-language model. Strong on chart + screen understanding.

34B

128K

Apache 2.0

Vision · Text generation

Sovereign · AU

Phi 3.5 Vision Compact Phi vision model. Strong document + chart OCR.

4.2B

128K

MIT

Vision · Text generation

Sovereign · AU

Image generation

Stable Diffusion 3.5 Large SD 3.5 Large. Photoreal output, strong prompt adherence.

Proprietary

Image generation

Sovereign · AU

Stable Diffusion 3.5 Medium SD 3.5 Medium. Balanced quality vs throughput.

2.5B

Proprietary

Image generation

Sovereign · AU

SDXL 1.0 SDXL 1.0, stable, ecosystem-rich. ControlNet + LoRA support.

3.5B

Proprietary

Image generation

Sovereign · AU

FLUX.1 [pro] ★ FLUX.1 [pro], top-of-class image generation.

12B

Proprietary

Image generation

Sovereign · AU

FLUX.1 [dev] Open FLUX dev. Non-commercial fine-tuning permitted.

12B

FLUX.1 [dev] Non-Commercial License

Image generation

Sovereign · AU

FLUX.1 [schnell] Fast 4-step FLUX. Real-time interactive generation.

12B

Apache 2.0

Image generation

Sovereign · AU

Video

Stable Video Diffusion 1.1 Image-to-video diffusion. Short clip generation from a still.

1.5B

Proprietary

Video

Sovereign · AU

Speech-to-text

Whisper Large v3 Whisper Large v3. Multilingual transcription, gold standard.

1.55B

MIT

Speech-to-text

Sovereign · AU

Whisper Large v3 Turbo Faster Whisper Large. ~5x throughput at near-identical accuracy.

809M

MIT

Speech-to-text

Sovereign · AU

Distil-Whisper Large v3 Distilled Whisper. 6x faster, English-only.

756M

MIT

Speech-to-text

Sovereign · AU

Text-to-speech

ElevenLabs v3 ElevenLabs v3 voice synthesis. Multi-speaker, emotion-aware.

Proprietary

Text-to-speech

Non-sovereign

ElevenLabs Multilingual v2 29 languages, voice cloning. Production-stable.

Proprietary

Text-to-speech

Non-sovereign

Embeddings

BGE Large EN v1.5 BGE Large, top open English embedding model.

335M

512

MIT

Embeddings

Sovereign · AU

BGE M3 Multilingual + multifunctional. Dense, sparse, multi-vector retrieval.

560M

MIT

Embeddings

Sovereign · AU

Snowflake Arctic Embed L Snowflake Arctic Embed, strong on retrieval benchmarks.

335M

512

Apache 2.0

Embeddings

Sovereign · AU

Snowflake Arctic Embed M v2 Cost-efficient Snowflake embed. Long-context variant.

305M

Apache 2.0

Embeddings

Sovereign · AU

Nomic Embed Text v1.5 Nomic Embed, open weights, Matryoshka representation.

137M

Apache 2.0

Embeddings

Sovereign · AU

Nomic Embed Vision v1.5 Multimodal embeddings. Joint text + image space.

92M

Apache 2.0

Embeddings

Sovereign · AU

Jina Embeddings v3 Jina v3, multilingual, task-specific LoRA adapters.

570M

Apache 2.0

Embeddings

Sovereign · AU

Cohere Embed v4 Cohere Embed v4. Multimodal, multilingual, image + text retrieval.

Proprietary

Embeddings

Non-sovereign

Rerank

BGE Reranker v2 M3 Multilingual cross-encoder reranker. Drop-in RAG quality boost.

560M

MIT

Rerank

Sovereign · AU

Jina Reranker v2 Jina cross-encoder reranker. Multilingual.

278M

Apache 2.0

Rerank

Sovereign · AU

Cohere Rerank v3.5 Cohere Rerank, enterprise gold standard for RAG quality.

Proprietary

Rerank

Non-sovereign

Safety / moderation

Llama Guard 3 8B Content-safety classifier from Meta. Pre/post-filter for chat.

128K

Llama Community

Safety / moderation · Text generation

Sovereign · AU

Qwen

Qwen 3.5 397B ★ Flagship Qwen 3.5 MoE. Tops GPQA Diamond among open weights.

397B / 17B active

256K

Apache 2.0

Text generation · Reasoning

Sovereign · AU

Qwen 3.5 72B Instruct Dense 72B Qwen. Strong reasoning, multilingual.

72B

256K

Apache 2.0

Text generation · Reasoning

Sovereign · AU

Qwen 3.5 32B Instruct Mid-size Qwen 3.5 for balanced cost vs capability.

32B

128K

Apache 2.0

Text generation

Sovereign · AU

Qwen 3.5 14B Instruct Cost-efficient Qwen for high-throughput inference.

14B

128K

Apache 2.0

Text generation

Sovereign · AU

Qwen 3.5 7B Instruct Compact Qwen 7B. Fast, multilingual, easy to fine-tune.

128K

Apache 2.0

Text generation

Sovereign · AU

Qwen3-Coder 480B ★ Best-in-class open coding model. Repo-scale agent workflows.

480B / 35B active

Apache 2.0

Code · Text generation

Sovereign · AU

Qwen3-Coder 30B Workhorse coding model for IDE-integrated assistants.

30B

256K

Apache 2.0

Code

Sovereign · AU

Qwen2.5-VL 72B Vision-language Qwen. Document, chart, screen understanding.

72B

128K

Apache 2.0

Vision · Text generation

Sovereign · AU

Qwen2.5-VL 7B Compact Qwen-VL for high-throughput vision pipelines.

128K

Apache 2.0

Vision · Text generation

Sovereign · AU

QwQ 32B Reasoning Chain-of-thought reasoning Qwen. Strong on maths + logic.

32B

128K

Apache 2.0

Reasoning · Text generation

Sovereign · AU

Qwen2.5-Math 72B Math-specialist Qwen. Symbolic reasoning, proofs, MATH benchmark.

72B

128K

Apache 2.0

Reasoning · Text generation

Sovereign · AU

DeepSeek

DeepSeek V4 Pro ★ Frontier open-weight. 80.6 SWE-Bench Verified, 90.1 GPQA Diamond.

1.6T / 49B active

MIT

Text generation · Reasoning

Sovereign · AU

DeepSeek V4 Flash Low-latency DeepSeek V4. Most of the smarts at a fraction of the cost.

37B / 6B active

256K

MIT

Text generation · Code

Sovereign · AU

DeepSeek V3.5 Production workhorse before V4 Pro. Excellent code generation.

671B / 37B active

128K

MIT

Text generation · Code

Sovereign · AU

DeepSeek R1 ★ Open reasoning model. Visible chain-of-thought traces.

671B / 37B active

128K

MIT

Reasoning · Text generation

Sovereign · AU

DeepSeek R1 Distill Llama 70B R1 reasoning distilled into Llama 70B. Production-friendly latency.

70B

128K

MIT

Reasoning · Text generation

Sovereign · AU

DeepSeek R1 Distill Qwen 32B R1 reasoning distilled into Qwen 32B. Strong cost-performance.

32B

128K

MIT

Reasoning · Text generation

Sovereign · AU

DeepSeek-Coder V3 Specialist coding DeepSeek. Repo-scale, fill-in-middle.

236B / 21B active

128K

MIT

Code

Sovereign · AU

DeepSeek-Math 7B Math-specialist DeepSeek. Compact, fast for symbolic work.

32K

MIT

Reasoning

Sovereign · AU

DeepSeek-VL2 DeepSeek vision-language. OCR + chart + screen reasoning.

27B / 4.5B active

128K

MIT

Vision · Text generation

Sovereign · AU

Mistral

Mistral Large 3 ★ Mistral's flagship open MoE. Strong multilingual, function-calling.

675B / 41B active

256K

Apache 2.0

Text generation · Reasoning

Sovereign · AU

Mistral Medium 3.5 Mid-tier Mistral. Strong cost-to-quality ratio for production.

70B

128K

Apache 2.0

Text generation

Sovereign · AU

Mistral Small 3.1 Small Mistral. Edge-deployable, multilingual, fast.

24B

128K

Apache 2.0

Text generation

Sovereign · AU

Mistral Nemo 12B Joint Mistral × NVIDIA. Strong multilingual, FP8-friendly.

12B

128K

Apache 2.0

Text generation

Sovereign · AU

Mixtral 8x22B Mixtral 8x22B MoE. Cost-efficient inference at scale.

141B / 39B active

64K

Apache 2.0

Text generation

Sovereign · AU

Mixtral 8x7B Original Mixtral MoE. Still strong for general inference.

47B / 13B active

32K

Apache 2.0

Text generation

Sovereign · AU

Codestral 22B Mistral's coding specialist. Multi-language, fill-in-middle.

22B

32K

Apache 2.0

Code

Sovereign · AU

Codestral Mamba 7B Mamba-architecture coding model. Linear-time long-context inference.

256K

Apache 2.0

Code

Sovereign · AU

Pixtral 12B Mistral's open vision-language model. Document + UI reasoning.

12B

128K

Apache 2.0

Vision · Text generation

Sovereign · AU

Pixtral Large 124B Heavyweight Pixtral. State-of-the-art open multimodal.

124B

128K

Apache 2.0

Vision · Text generation

Sovereign · AU

Mistral Saba 24B Arabic-specialist Mistral. Strong MENA-region language performance.

24B

32K

Apache 2.0

Text generation

Sovereign · AU

Google

Gemma 4 31B ★ Largest open Gemma 4. Strong reasoning, function-calling.

31B

256K

Gemma

Text generation · Reasoning

Sovereign · AU

Gemma 4 9B Mid-size Gemma 4. Best 9B-class open model on multilingual tasks.

128K

Gemma

Text generation

Sovereign · AU

Gemma 4 2B Tiny Gemma 4. Edge / on-device deployments.

128K

Gemma

Text generation

Sovereign · AU

Gemma 2 27B Stable Gemma 2 release. Still strong for general production.

27B

Gemma

Text generation

Sovereign · AU

CodeGemma 7B Code-specialist Gemma. Fast, fill-in-middle, IDE-friendly.

Gemma

Code

Sovereign · AU

PaliGemma 2 28B Google's open VLM line. Document, screen, chart understanding.

28B

Gemma

Vision · Text generation

Sovereign · AU

Gemini 2.5 Pro Gemini 2.5 Pro via Google API. 2M context, strong vision + reasoning.

Proprietary

Text generation · Vision

Non-sovereign

Gemini 2.5 Flash Low-latency Gemini. Cost-efficient general inference.

Proprietary

Text generation · Vision

Non-sovereign

Gemini 3.1 Pro Preview Google's next-gen Gemini in preview. Reasoning + visual leadership.

Proprietary

Text generation · Vision

Non-sovereign

Anthropic

Claude Opus 4.6 ★ Anthropic's frontier model. Industry-leading on agentic tasks.

500K

Proprietary

Text generation · Reasoning

Non-sovereign

Claude Sonnet 4.6 Production workhorse Claude. Best quality-per-dollar for most workloads.

500K

Proprietary

Text generation · Reasoning

Non-sovereign

Claude Haiku 4.5 Fastest Claude. Sub-second latency, vision-capable.

200K

Proprietary

Text generation · Vision

Non-sovereign

Claude 3.7 Sonnet (Legacy) Previous-gen Claude. Kept available for pinned production lines.

200K

Proprietary

Text generation · Vision

Non-sovereign

OpenAI

GPT-5 ★ OpenAI's flagship. Top of leaderboards on most general benchmarks.

400K

Proprietary

Text generation · Reasoning

Non-sovereign

GPT-5 mini Compact GPT-5. Cost-efficient general inference.

400K

Proprietary

Text generation · Vision

Non-sovereign

GPT-4.1 GPT-4.1 with 1M context. Strong for long-document workflows.

Proprietary

Text generation · Vision

Non-sovereign

GPT-4o Multimodal GPT-4o. Stable production option for chat + vision.

128K

Proprietary

Text generation · Vision

Non-sovereign

o4 Reasoning OpenAI's reasoning series. Visible chain-of-thought traces.

200K

Proprietary

Reasoning · Text generation

Non-sovereign

o4 mini Compact o4 reasoning. Faster, cheaper for routine reasoning workloads.

200K

Proprietary

Reasoning

Non-sovereign

Kimi

Kimi K2.6 ★ Kimi K2.6, 90.5% GPQA. Strong agentic and coding capabilities.

1T / 32B active

256K

Apache 2.0

Text generation · Reasoning

Sovereign · AU

Kimi K2 Previous Kimi flagship. Production-stable for coding agents.

1T / 32B active

200K

Apache 2.0

Text generation · Code

Sovereign · AU

xAI

Grok 4.1 xAI's frontier model. Real-time data, strong reasoning.

256K

Proprietary

Text generation · Reasoning

Non-sovereign

Grok 4 Grok 4 production model. Reasoning + tool use.

256K

Proprietary

Text generation · Reasoning

Non-sovereign

Grok 3 Grok 3 for cost-efficient inference. Open-source weights available.

128K

Proprietary

Text generation

Non-sovereign

Zhipu

GLM 5.1 GLM 5.1, strong open coding + agentic performance.

356B / 32B active

128K

Apache 2.0

Text generation · Code

Sovereign · AU

GLM 4.5 Production GLM 4.5. Strong multilingual, function-calling.

32B

128K

Apache 2.0

Text generation

Sovereign · AU

ChatGLM 4 9B ChatGLM 4 9B. Compact, multilingual, fast.

128K

Apache 2.0

Text generation

Sovereign · AU

Yi

Yi-Large 405B Yi-Large flagship. Strong multilingual reasoning.

405B

200K

Apache 2.0

Text generation · Reasoning

Sovereign · AU

Yi 34B Chat Yi 34B for production. Strong cost-performance balance.

34B

200K

Apache 2.0

Text generation

Sovereign · AU

Yi-VL 34B Yi's vision-language model. Strong on chart + screen understanding.

34B

128K

Apache 2.0

Vision · Text generation

Sovereign · AU

Microsoft

Phi 4 14B Phi 4, small-model reasoning leader. MIT licence.

14B

128K

MIT

Text generation · Reasoning

Sovereign · AU

Phi 4 mini 3.8B Compact Phi for edge deployment. Strong instruction-following.

3.8B

128K

MIT

Text generation

Sovereign · AU

Phi 3.5 MoE Phi 3.5 MoE. Cost-efficient inference with strong quality.

42B / 6.6B active

128K

MIT

Text generation

Sovereign · AU

Phi 3.5 Vision Compact Phi vision model. Strong document + chart OCR.

4.2B

128K

MIT

Vision · Text generation

Sovereign · AU

Cohere

Cohere Command A Cohere's flagship Command. Strong enterprise RAG + tool use.

111B

256K

Proprietary

Text generation · Code

Non-sovereign

Cohere Command R+ Command R+ for enterprise RAG. Inline citation generation.

104B

128K

Proprietary

Text generation

Non-sovereign

Cohere Command R Smaller Command R. Cost-efficient RAG-tuned inference.

35B

128K

Proprietary

Text generation

Non-sovereign

Cohere Embed v4 Cohere Embed v4. Multimodal, multilingual, image + text retrieval.

Proprietary

Embeddings

Non-sovereign

Cohere Rerank v3.5 Cohere Rerank, enterprise gold standard for RAG quality.

Proprietary

Rerank

Non-sovereign

Stability AI

Stable Diffusion 3.5 Large SD 3.5 Large. Photoreal output, strong prompt adherence.

Proprietary

Image generation

Sovereign · AU

Stable Diffusion 3.5 Medium SD 3.5 Medium. Balanced quality vs throughput.

2.5B

Proprietary

Image generation

Sovereign · AU

Stable Video Diffusion 1.1 Image-to-video diffusion. Short clip generation from a still.

1.5B

Proprietary

Video

Sovereign · AU

SDXL 1.0 SDXL 1.0, stable, ecosystem-rich. ControlNet + LoRA support.

3.5B

Proprietary

Image generation

Sovereign · AU

Black Forest Labs

FLUX.1 [pro] ★ FLUX.1 [pro], top-of-class image generation.

12B

Proprietary

Image generation

Sovereign · AU

FLUX.1 [dev] Open FLUX dev. Non-commercial fine-tuning permitted.

12B

FLUX.1 [dev] Non-Commercial License

Image generation

Sovereign · AU

FLUX.1 [schnell] Fast 4-step FLUX. Real-time interactive generation.

12B

Apache 2.0

Image generation

Sovereign · AU

Whisper

Whisper Large v3 Whisper Large v3. Multilingual transcription, gold standard.

1.55B

MIT

Speech-to-text

Sovereign · AU

Whisper Large v3 Turbo Faster Whisper Large. ~5x throughput at near-identical accuracy.

809M

MIT

Speech-to-text

Sovereign · AU

Distil-Whisper Large v3 Distilled Whisper. 6x faster, English-only.

756M

MIT

Speech-to-text

Sovereign · AU

ElevenLabs

ElevenLabs v3 ElevenLabs v3 voice synthesis. Multi-speaker, emotion-aware.

Proprietary

Text-to-speech

Non-sovereign

ElevenLabs Multilingual v2 29 languages, voice cloning. Production-stable.

Proprietary

Text-to-speech

Non-sovereign

BGE

BGE Large EN v1.5 BGE Large, top open English embedding model.

335M

512

MIT

Embeddings

Sovereign · AU

BGE M3 Multilingual + multifunctional. Dense, sparse, multi-vector retrieval.

560M

MIT

Embeddings

Sovereign · AU

BGE Reranker v2 M3 Multilingual cross-encoder reranker. Drop-in RAG quality boost.

560M

MIT

Rerank

Sovereign · AU

Snowflake

Snowflake Arctic Embed L Snowflake Arctic Embed, strong on retrieval benchmarks.

335M

512

Apache 2.0

Embeddings

Sovereign · AU

Snowflake Arctic Embed M v2 Cost-efficient Snowflake embed. Long-context variant.

305M

Apache 2.0

Embeddings

Sovereign · AU

Nomic

Nomic Embed Text v1.5 Nomic Embed, open weights, Matryoshka representation.

137M

Apache 2.0

Embeddings

Sovereign · AU

Nomic Embed Vision v1.5 Multimodal embeddings. Joint text + image space.

92M

Apache 2.0

Embeddings

Sovereign · AU

Jina

Jina Embeddings v3 Jina v3, multilingual, task-specific LoRA adapters.

570M

Apache 2.0

Embeddings

Sovereign · AU

Jina Reranker v2 Jina cross-encoder reranker. Multilingual.

278M

Apache 2.0

Rerank

Sovereign · AU

Perplexity

Sonar Pro Online Web-grounded Sonar. Real-time citations, fresh information retrieval.

200K

Proprietary

Text generation

Non-sovereign

Sonar Online Cost-efficient Sonar with live web context.

128K

Proprietary

Text generation

Non-sovereign

Chatbots & assistants

Llama 4 Maverick ★ Flagship Llama 4 MoE. 400B total / 17B active. Vision + 1M context.

400B / 17B active

Llama Community

Text generation · Vision

Sovereign · AU

Llama 4 Scout 10M context window. Built for long-document RAG and summarisation.

109B / 17B active

10M

Llama Community

Text generation · Vision

Sovereign · AU

Llama 3.3 70B Instruct Strong all-rounder Llama 3.3 70B. Function calling, multilingual.

70B

128K

Llama Community

Text generation

Sovereign · AU

Llama 3.2 1B Instruct Edge-deployable Llama 3.2 for on-prem or constrained workloads.

128K

Llama Community

Text generation

Sovereign · AU

Llama 3.2 3B Instruct 3B-parameter Llama for cost-efficient general inference.

128K

Llama Community

Text generation

Sovereign · AU

Llama 3.1 8B Instruct Workhorse Llama 3.1 8B. Strong instruction-following.

128K

Llama Community

Text generation

Sovereign · AU

Qwen 3.5 32B Instruct Mid-size Qwen 3.5 for balanced cost vs capability.

32B

128K

Apache 2.0

Text generation

Sovereign · AU

Qwen 3.5 14B Instruct Cost-efficient Qwen for high-throughput inference.

14B

128K

Apache 2.0

Text generation

Sovereign · AU

Qwen 3.5 7B Instruct Compact Qwen 7B. Fast, multilingual, easy to fine-tune.

128K

Apache 2.0

Text generation

Sovereign · AU

DeepSeek V4 Flash Low-latency DeepSeek V4. Most of the smarts at a fraction of the cost.

37B / 6B active

256K

MIT

Text generation · Code

Sovereign · AU

Mistral Large 3 ★ Mistral's flagship open MoE. Strong multilingual, function-calling.

675B / 41B active

256K

Apache 2.0

Text generation · Reasoning

Sovereign · AU

Mistral Medium 3.5 Mid-tier Mistral. Strong cost-to-quality ratio for production.

70B

128K

Apache 2.0

Text generation

Sovereign · AU

Mistral Small 3.1 Small Mistral. Edge-deployable, multilingual, fast.

24B

128K

Apache 2.0

Text generation

Sovereign · AU

Mistral Nemo 12B Joint Mistral × NVIDIA. Strong multilingual, FP8-friendly.

12B

128K

Apache 2.0

Text generation

Sovereign · AU

Mixtral 8x22B Mixtral 8x22B MoE. Cost-efficient inference at scale.

141B / 39B active

64K

Apache 2.0

Text generation

Sovereign · AU

Mixtral 8x7B Original Mixtral MoE. Still strong for general inference.

47B / 13B active

32K

Apache 2.0

Text generation

Sovereign · AU

Mistral Saba 24B Arabic-specialist Mistral. Strong MENA-region language performance.

24B

32K

Apache 2.0

Text generation

Sovereign · AU

Gemma 4 31B ★ Largest open Gemma 4. Strong reasoning, function-calling.

31B

256K

Gemma

Text generation · Reasoning

Sovereign · AU

Gemma 4 9B Mid-size Gemma 4. Best 9B-class open model on multilingual tasks.

128K

Gemma

Text generation

Sovereign · AU

Gemma 4 2B Tiny Gemma 4. Edge / on-device deployments.

128K

Gemma

Text generation

Sovereign · AU

Gemma 2 27B Stable Gemma 2 release. Still strong for general production.

27B

Gemma

Text generation

Sovereign · AU

Gemini 2.5 Flash Low-latency Gemini. Cost-efficient general inference.

Proprietary

Text generation · Vision

Non-sovereign

Claude Sonnet 4.6 Production workhorse Claude. Best quality-per-dollar for most workloads.

500K

Proprietary

Text generation · Reasoning

Non-sovereign

Claude Haiku 4.5 Fastest Claude. Sub-second latency, vision-capable.

200K

Proprietary

Text generation · Vision

Non-sovereign

Claude 3.7 Sonnet (Legacy) Previous-gen Claude. Kept available for pinned production lines.

200K

Proprietary

Text generation · Vision

Non-sovereign

GPT-5 mini Compact GPT-5. Cost-efficient general inference.

400K

Proprietary

Text generation · Vision

Non-sovereign

GPT-4.1 GPT-4.1 with 1M context. Strong for long-document workflows.

Proprietary

Text generation · Vision

Non-sovereign

GPT-4o Multimodal GPT-4o. Stable production option for chat + vision.

128K

Proprietary

Text generation · Vision

Non-sovereign

Grok 4.1 xAI's frontier model. Real-time data, strong reasoning.

256K

Proprietary

Text generation · Reasoning

Non-sovereign

Grok 4 Grok 4 production model. Reasoning + tool use.

256K

Proprietary

Text generation · Reasoning

Non-sovereign

Grok 3 Grok 3 for cost-efficient inference. Open-source weights available.

128K

Proprietary

Text generation

Non-sovereign

GLM 4.5 Production GLM 4.5. Strong multilingual, function-calling.

32B

128K

Apache 2.0

Text generation

Sovereign · AU

ChatGLM 4 9B ChatGLM 4 9B. Compact, multilingual, fast.

128K

Apache 2.0

Text generation

Sovereign · AU

Yi-Large 405B Yi-Large flagship. Strong multilingual reasoning.

405B

200K

Apache 2.0

Text generation · Reasoning

Sovereign · AU

Yi 34B Chat Yi 34B for production. Strong cost-performance balance.

34B

200K

Apache 2.0

Text generation

Sovereign · AU

Phi 4 14B Phi 4, small-model reasoning leader. MIT licence.

14B

128K

MIT

Text generation · Reasoning

Sovereign · AU

Phi 4 mini 3.8B Compact Phi for edge deployment. Strong instruction-following.

3.8B

128K

MIT

Text generation

Sovereign · AU

Phi 3.5 MoE Phi 3.5 MoE. Cost-efficient inference with strong quality.

42B / 6.6B active

128K

MIT

Text generation

Sovereign · AU

Cohere Command A Cohere's flagship Command. Strong enterprise RAG + tool use.

111B

256K

Proprietary

Text generation · Code

Non-sovereign

Sonar Pro Online Web-grounded Sonar. Real-time citations, fresh information retrieval.

200K

Proprietary

Text generation

Non-sovereign

Sonar Online Cost-efficient Sonar with live web context.

128K

Proprietary

Text generation

Non-sovereign

Agents

Llama 4 Maverick ★ Flagship Llama 4 MoE. 400B total / 17B active. Vision + 1M context.

400B / 17B active

Llama Community

Text generation · Vision

Sovereign · AU

Llama 3.3 70B Instruct Strong all-rounder Llama 3.3 70B. Function calling, multilingual.

70B

128K

Llama Community

Text generation

Sovereign · AU

Llama 3.2 3B Instruct 3B-parameter Llama for cost-efficient general inference.

128K

Llama Community

Text generation

Sovereign · AU

Llama 3.1 8B Instruct Workhorse Llama 3.1 8B. Strong instruction-following.

128K

Llama Community

Text generation

Sovereign · AU

Llama 3.1 405B Instruct Dense 405B Llama for heavyweight reasoning workloads.

405B

128K

Llama Community

Text generation · Reasoning

Sovereign · AU

Qwen 3.5 397B ★ Flagship Qwen 3.5 MoE. Tops GPQA Diamond among open weights.

397B / 17B active

256K

Apache 2.0

Text generation · Reasoning

Sovereign · AU

Qwen 3.5 72B Instruct Dense 72B Qwen. Strong reasoning, multilingual.

72B

256K

Apache 2.0

Text generation · Reasoning

Sovereign · AU

Qwen 3.5 32B Instruct Mid-size Qwen 3.5 for balanced cost vs capability.

32B

128K

Apache 2.0

Text generation

Sovereign · AU

Qwen3-Coder 480B ★ Best-in-class open coding model. Repo-scale agent workflows.

480B / 35B active

Apache 2.0

Code · Text generation

Sovereign · AU

DeepSeek V4 Pro ★ Frontier open-weight model. 80.6% on SWE-Bench Verified, 90.1% on GPQA Diamond.

1.6T / 49B active

MIT

Text generation · Reasoning

Sovereign · AU

DeepSeek V4 Flash Low-latency DeepSeek V4. Most of the smarts at a fraction of the cost.

37B / 6B active

256K

MIT

Text generation · Code

Sovereign · AU

DeepSeek V3.5 Production workhorse before V4 Pro. Excellent code generation.

671B / 37B active

128K

MIT

Text generation · Code

Sovereign · AU

Mistral Large 3 ★ Mistral's flagship open MoE. Strong multilingual, function-calling.

675B / 41B active

256K

Apache 2.0

Text generation · Reasoning

Sovereign · AU

Mistral Medium 3.5 Mid-tier Mistral. Strong cost-to-quality ratio for production.

70B

128K

Apache 2.0

Text generation

Sovereign · AU

Mixtral 8x22B Mixtral 8x22B MoE. Cost-efficient inference at scale.

141B / 39B active

64K

Apache 2.0

Text generation

Sovereign · AU

Gemini 2.5 Pro Gemini 2.5 Pro via Google API. 2M context, strong vision + reasoning.

Proprietary

Text generation · Vision

Non-sovereign

Gemini 2.5 Flash Low-latency Gemini. Cost-efficient general inference.

Proprietary

Text generation · Vision

Non-sovereign

Gemini 3.1 Pro Preview Google's next-gen Gemini in preview. Reasoning + visual leadership.

Proprietary

Text generation · Vision

Non-sovereign

Claude Opus 4.6 ★ Anthropic's frontier model. Industry-leading on agentic tasks.

500K

Proprietary

Text generation · Reasoning

Non-sovereign

Claude Sonnet 4.6 Production workhorse Claude. Best quality-per-dollar for most workloads.

500K

Proprietary

Text generation · Reasoning

Non-sovereign

GPT-5 ★ OpenAI's flagship. Top of leaderboards on most general benchmarks.

400K

Proprietary

Text generation · Reasoning

Non-sovereign

GPT-5 mini Compact GPT-5. Cost-efficient general inference.

400K

Proprietary

Text generation · Vision

Non-sovereign

Kimi K2.6 ★ Kimi K2.6, 90.5% GPQA. Strong agentic and coding capabilities.

1T / 32B active

256K

Apache 2.0

Text generation · Reasoning

Sovereign · AU

Kimi K2 Previous Kimi flagship. Production-stable for coding agents.

1T / 32B active

200K

Apache 2.0

Text generation · Code

Sovereign · AU

GLM 5.1 GLM 5.1, strong open coding + agentic performance.

356B / 32B active

128K

Apache 2.0

Text generation · Code

Sovereign · AU

Cohere Command A Cohere's flagship Command. Strong enterprise RAG + tool use.

111B

256K

Proprietary

Text generation · Code

Non-sovereign

Cohere Command R+ Command R+ for enterprise RAG. Inline citation generation.

104B

128K

Proprietary

Text generation

Non-sovereign

Coding

Code Llama 70B Code Llama 70B. Fill-in-middle, repo-scale completion.

70B

100K

Llama Community

Code · Text generation

Sovereign · AU

Qwen 3.5 397B ★ Flagship Qwen 3.5 MoE. Tops GPQA Diamond among open weights.

397B / 17B active

256K

Apache 2.0

Text generation · Reasoning

Sovereign · AU

Qwen3-Coder 480B ★ Best-in-class open coding model. Repo-scale agent workflows.

480B / 35B active

Apache 2.0

Code · Text generation

Sovereign · AU

Qwen3-Coder 30B Workhorse coding model for IDE-integrated assistants.

30B

256K

Apache 2.0

Code

Sovereign · AU

DeepSeek V4 Pro ★ Frontier open-weight model. 80.6% on SWE-Bench Verified, 90.1% on GPQA Diamond.

1.6T / 49B active

MIT

Text generation · Reasoning

Sovereign · AU

DeepSeek V4 Flash Low-latency DeepSeek V4. Most of the smarts at a fraction of the cost.

37B / 6B active

256K

MIT

Text generation · Code

Sovereign · AU

DeepSeek V3.5 Production workhorse before V4 Pro. Excellent code generation.

671B / 37B active

128K

MIT

Text generation · Code

Sovereign · AU

DeepSeek-Coder V3 Specialist coding DeepSeek. Repo-scale, fill-in-middle.

236B / 21B active

128K

MIT

Code

Sovereign · AU

Codestral 22B Mistral's coding specialist. Multi-language, fill-in-middle.

22B

32K

Apache 2.0

Code

Sovereign · AU

Codestral Mamba 7B Mamba-architecture coding model. Linear-time long-context inference.

256K

Apache 2.0

Code

Sovereign · AU

CodeGemma 7B Code-specialist Gemma. Fast, fill-in-middle, IDE-friendly.

Gemma

Code

Sovereign · AU

Gemini 2.5 Pro Gemini 2.5 Pro via Google API. 2M context, strong vision + reasoning.

Proprietary

Text generation · Vision

Non-sovereign

Claude Opus 4.6 ★ Anthropic's frontier model. Industry-leading on agentic tasks.

500K

Proprietary

Text generation · Reasoning

Non-sovereign

Claude Sonnet 4.6 Production workhorse Claude. Best quality-per-dollar for most workloads.

500K

Proprietary

Text generation · Reasoning

Non-sovereign

GPT-5 ★ OpenAI's flagship. Top of leaderboards on most general benchmarks.

400K

Proprietary

Text generation · Reasoning

Non-sovereign

GPT-4.1 GPT-4.1 with 1M context. Strong for long-document workflows.

Proprietary

Text generation · Vision

Non-sovereign

Kimi K2.6 ★ Kimi K2.6, 90.5% GPQA. Strong agentic and coding capabilities.

1T / 32B active

256K

Apache 2.0

Text generation · Reasoning

Sovereign · AU

Kimi K2 Previous Kimi flagship. Production-stable for coding agents.

1T / 32B active

200K

Apache 2.0

Text generation · Code

Sovereign · AU

GLM 5.1 GLM 5.1, strong open coding + agentic performance.

356B / 32B active

128K

Apache 2.0

Text generation · Code

Sovereign · AU

Reasoning

Llama 4 Maverick ★ Flagship Llama 4 MoE. 400B total / 17B active. Vision + 1M context.

400B / 17B active

Llama Community

Text generation · Vision

Sovereign · AU

Llama 3.1 405B Instruct Dense 405B Llama for heavyweight reasoning workloads.

405B

128K

Llama Community

Text generation · Reasoning

Sovereign · AU

Qwen 3.5 397B ★ Flagship Qwen 3.5 MoE. Tops GPQA Diamond among open weights.

397B / 17B active

256K

Apache 2.0

Text generation · Reasoning

Sovereign · AU

Qwen 3.5 72B Instruct Dense 72B Qwen. Strong reasoning, multilingual.

72B

256K

Apache 2.0

Text generation · Reasoning

Sovereign · AU

QwQ 32B Reasoning Chain-of-thought reasoning Qwen. Strong on maths + logic.

32B

128K

Apache 2.0

Reasoning · Text generation

Sovereign · AU

Qwen2.5-Math 72B Math-specialist Qwen. Symbolic reasoning, proofs, MATH benchmark.

72B

128K

Apache 2.0

Reasoning · Text generation

Sovereign · AU

DeepSeek V4 Pro ★ Frontier open-weight model. 80.6% on SWE-Bench Verified, 90.1% on GPQA Diamond.

1.6T / 49B active

MIT

Text generation · Reasoning

Sovereign · AU

DeepSeek R1 ★ Open reasoning model. Visible chain-of-thought traces.

671B / 37B active

128K

MIT

Reasoning · Text generation

Sovereign · AU

DeepSeek R1 Distill Llama 70B R1 reasoning distilled into Llama 70B. Production-friendly latency.

70B

128K

MIT

Reasoning · Text generation

Sovereign · AU

DeepSeek R1 Distill Qwen 32B R1 reasoning distilled into Qwen 32B. Strong cost-performance.

32B

128K

MIT

Reasoning · Text generation

Sovereign · AU

DeepSeek-Math 7B Math-specialist DeepSeek. Compact, fast for symbolic work.

32K

MIT

Reasoning

Sovereign · AU

Mistral Large 3 ★ Mistral's flagship open MoE. Strong multilingual, function-calling.

675B / 41B active

256K

Apache 2.0

Text generation · Reasoning

Sovereign · AU

Gemma 4 31B ★ Largest open Gemma 4. Strong reasoning, function-calling.

31B

256K

Gemma

Text generation · Reasoning

Sovereign · AU

Gemini 2.5 Pro Gemini 2.5 Pro via Google API. 2M context, strong vision + reasoning.

Proprietary

Text generation · Vision

Non-sovereign

Gemini 3.1 Pro Preview Google's next-gen Gemini in preview. Reasoning + visual leadership.

Proprietary

Text generation · Vision

Non-sovereign

Claude Opus 4.6 ★ Anthropic's frontier model. Industry-leading on agentic tasks.

500K

Proprietary

Text generation · Reasoning

Non-sovereign

GPT-5 ★ OpenAI's flagship. Top of leaderboards on most general benchmarks.

400K

Proprietary

Text generation · Reasoning

Non-sovereign

o4 Reasoning OpenAI's reasoning series. Visible chain-of-thought traces.

200K

Proprietary

Reasoning · Text generation

Non-sovereign

o4 mini Compact o4 reasoning. Faster, cheaper for routine reasoning workloads.

200K

Proprietary

Reasoning

Non-sovereign

Kimi K2.6 ★ Kimi K2.6, 90.5% GPQA. Strong agentic and coding capabilities.

1T / 32B active

256K

Apache 2.0

Text generation · Reasoning

Sovereign · AU

Grok 4.1 xAI's frontier model. Real-time data, strong reasoning.

256K

Proprietary

Text generation · Reasoning

Non-sovereign

Grok 4 Grok 4 production model. Reasoning + tool use.

256K

Proprietary

Text generation · Reasoning

Non-sovereign

Yi-Large 405B Yi-Large flagship. Strong multilingual reasoning.

405B

200K

Apache 2.0

Text generation · Reasoning

Sovereign · AU

Phi 4 14B Phi 4, small-model reasoning leader. MIT licence.

14B

128K

MIT

Text generation · Reasoning

Sovereign · AU

Document analysis

Llama 4 Maverick ★ Flagship Llama 4 MoE. 400B total / 17B active. Vision + 1M context.

400B / 17B active

Llama Community

Text generation · Vision

Sovereign · AU

Llama 4 Scout 10M context window. Built for long-document RAG and summarisation.

109B / 17B active

10M

Llama Community

Text generation · Vision

Sovereign · AU

Llama 3.2 11B Vision Compact vision-capable Llama. Image + text understanding.

11B

128K

Llama Community

Text generation · Vision

Sovereign · AU

Llama 3.2 90B Vision Large-format vision Llama for high-fidelity image reasoning.

90B

128K

Llama Community

Text generation · Vision

Sovereign · AU

Llama 3.1 405B Instruct Dense 405B Llama for heavyweight reasoning workloads.

405B

128K

Llama Community

Text generation · Reasoning

Sovereign · AU

Qwen2.5-VL 72B Vision-language Qwen. Document, chart, screen understanding.

72B

128K

Apache 2.0

Vision · Text generation

Sovereign · AU

DeepSeek-VL2 DeepSeek vision-language. OCR + chart + screen reasoning.

27B / 4.5B active

128K

MIT

Vision · Text generation

Sovereign · AU

Pixtral 12B Mistral's open vision-language model. Document + UI reasoning.

12B

128K

Apache 2.0

Vision · Text generation

Sovereign · AU

Pixtral Large 124B Heavyweight Pixtral. State-of-the-art open multimodal.

124B

128K

Apache 2.0

Vision · Text generation

Sovereign · AU

Gemini 2.5 Pro Gemini 2.5 Pro via Google API. 2M context, strong vision + reasoning.

Proprietary

Text generation · Vision

Non-sovereign

Claude Opus 4.6 ★ Anthropic's frontier model. Industry-leading on agentic tasks.

500K

Proprietary

Text generation · Reasoning

Non-sovereign

GPT-4.1 GPT-4.1 with 1M context. Strong for long-document workflows.

Proprietary

Text generation · Vision

Non-sovereign

RAG & retrieval

Llama 4 Scout 10M context window. Built for long-document RAG and summarisation.

109B / 17B active

10M

Llama Community

Text generation · Vision

Sovereign · AU

Cohere Command R+ Command R+ for enterprise RAG. Inline citation generation.

104B

128K

Proprietary

Text generation

Non-sovereign

Cohere Command R Smaller Command R. Cost-efficient RAG-tuned inference.

35B

128K

Proprietary

Text generation

Non-sovereign

BGE Large EN v1.5 BGE Large, top open English embedding model.

335M

512

MIT

Embeddings

Sovereign · AU

BGE M3 Multilingual + multifunctional. Dense, sparse, multi-vector retrieval.

560M

MIT

Embeddings

Sovereign · AU

BGE Reranker v2 M3 Multilingual cross-encoder reranker. Drop-in RAG quality boost.

560M

MIT

Rerank

Sovereign · AU

Snowflake Arctic Embed L Snowflake Arctic Embed, strong on retrieval benchmarks.

335M

512

Apache 2.0

Embeddings

Sovereign · AU

Snowflake Arctic Embed M v2 Cost-efficient Snowflake embed. Long-context variant.

305M

Apache 2.0

Embeddings

Sovereign · AU

Nomic Embed Text v1.5 Nomic Embed, open weights, Matryoshka representation.

137M

Apache 2.0

Embeddings

Sovereign · AU

Nomic Embed Vision v1.5 Multimodal embeddings. Joint text + image space.

92M

Apache 2.0

Embeddings

Sovereign · AU

Jina Embeddings v3 Jina v3, multilingual, task-specific LoRA adapters.

570M

Apache 2.0

Embeddings

Sovereign · AU

Jina Reranker v2 Jina cross-encoder reranker. Multilingual.

278M

Apache 2.0

Rerank

Sovereign · AU

Cohere Embed v4 Cohere Embed v4. Multimodal, multilingual, image + text retrieval.

Proprietary

Embeddings

Non-sovereign

Cohere Rerank v3.5 Cohere Rerank, enterprise gold standard for RAG quality.

Proprietary

Rerank

Non-sovereign

Sonar Pro Online Web-grounded Sonar. Real-time citations, fresh information retrieval.

200K

Proprietary

Text generation

Non-sovereign

Sonar Online Cost-efficient Sonar with live web context.

128K

Proprietary

Text generation

Non-sovereign

Image work

Llama 3.2 11B Vision Compact vision-capable Llama. Image + text understanding.

11B

128K

Llama Community

Text generation · Vision

Sovereign · AU

Llama 3.2 90B Vision Large-format vision Llama for high-fidelity image reasoning.

90B

128K

Llama Community

Text generation · Vision

Sovereign · AU

Qwen2.5-VL 72B Vision-language Qwen. Document, chart, screen understanding.

72B

128K

Apache 2.0

Vision · Text generation

Sovereign · AU

Qwen2.5-VL 7B Compact Qwen-VL for high-throughput vision pipelines.

128K

Apache 2.0

Vision · Text generation

Sovereign · AU

DeepSeek-VL2 DeepSeek vision-language. OCR + chart + screen reasoning.

27B / 4.5B active

128K

MIT

Vision · Text generation

Sovereign · AU

Pixtral 12B Mistral's open vision-language model. Document + UI reasoning.

12B

128K

Apache 2.0

Vision · Text generation

Sovereign · AU

Pixtral Large 124B Heavyweight Pixtral. State-of-the-art open multimodal.

124B

128K

Apache 2.0

Vision · Text generation

Sovereign · AU

PaliGemma 2 28B Google's open VLM line. Document, screen, chart understanding.

28B

Gemma

Vision · Text generation

Sovereign · AU

Yi-VL 34B Yi's vision-language model. Strong on chart + screen understanding.

34B

128K

Apache 2.0

Vision · Text generation

Sovereign · AU

Phi 3.5 Vision Compact Phi vision model. Strong document + chart OCR.

4.2B

128K

MIT

Vision · Text generation

Sovereign · AU

Stable Diffusion 3.5 Large SD 3.5 Large. Photoreal output, strong prompt adherence.

Proprietary

Image generation

Sovereign · AU

Stable Diffusion 3.5 Medium SD 3.5 Medium. Balanced quality vs throughput.

2.5B

Proprietary

Image generation

Sovereign · AU

SDXL 1.0 SDXL 1.0, stable, ecosystem-rich. ControlNet + LoRA support.

3.5B

Proprietary

Image generation

Sovereign · AU

FLUX.1 [pro] ★ FLUX.1 [pro], top-of-class image generation.

12B

Proprietary

Image generation

Sovereign · AU

FLUX.1 [dev] Open FLUX dev. Non-commercial fine-tuning permitted.

12B

FLUX.1 [dev] Non-Commercial License

Image generation

Sovereign · AU

FLUX.1 [schnell] Fast 4-step FLUX. Real-time interactive generation.

12B

Apache 2.0

Image generation

Sovereign · AU

Nomic Embed Vision v1.5 Multimodal embeddings. Joint text + image space.

92M

Apache 2.0

Embeddings

Sovereign · AU

Video work

Stable Video Diffusion 1.1 Image-to-video diffusion. Short clip generation from a still.

1.5B

Proprietary

Video

Sovereign · AU

Voice

Whisper Large v3 Whisper Large v3. Multilingual transcription, gold standard.

1.55B

MIT

Speech-to-text

Sovereign · AU

Whisper Large v3 Turbo Faster Whisper Large. ~5x throughput at near-identical accuracy.

809M

MIT

Speech-to-text

Sovereign · AU

Distil-Whisper Large v3 Distilled Whisper. 6x faster, English-only.

756M

MIT

Speech-to-text

Sovereign · AU

ElevenLabs v3 ElevenLabs v3 voice synthesis. Multi-speaker, emotion-aware.

Proprietary

Text-to-speech

Non-sovereign

ElevenLabs Multilingual v2 29 languages, voice cloning. Production-stable.

Proprietary

Text-to-speech

Non-sovereign

Safety

Llama Guard 3 8B Content-safety classifier from Meta. Pre/post-filter for chat.

128K

Llama Community

Safety / moderation · Text generation

Sovereign · AU

Frequently asked questions

What does 'sovereign' mean on these model cards?

Sovereign models run on Amaze GPU pools in Sydney. Weights, prompts, responses and embeddings stay in Australia. Non-sovereign models are provider-hosted catalogue entries (for example Anthropic Claude, OpenAI GPT and Google Gemini) accessed through the Amaze control plane. They use the same authentication and the same billing entity, but inference traffic transits the provider's region. The Non-sovereign tag lets you opt in or out per workload.
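
As a rough illustration of the per-workload opt-in, the sketch below filters a handful of cards by their sovereignty tag. The card names and tags are copied from this catalogue, but the `ModelCard` structure and the `allowed_models` helper are hypothetical and are not part of any Amaze API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelCard:
    name: str
    sovereign: bool  # True for cards tagged "Sovereign · AU"


# A few cards from the catalogue; the data structure is an assumption.
CATALOGUE = [
    ModelCard("Llama 3.3 70B Instruct", sovereign=True),
    ModelCard("Kimi K2.6", sovereign=True),
    ModelCard("GPT-5", sovereign=False),
    ModelCard("Claude 3.7 Sonnet (Legacy)", sovereign=False),
]


def allowed_models(catalogue, allow_non_sovereign):
    """Return the cards a workload may use under its sovereignty policy."""
    return [m for m in catalogue if m.sovereign or allow_non_sovereign]


# A workload that must keep data in Australia opts out of non-sovereign cards.
strict = [m.name for m in allowed_models(CATALOGUE, allow_non_sovereign=False)]
# A workload without that constraint opts in and sees every card.
relaxed = [m.name for m in allowed_models(CATALOGUE, allow_non_sovereign=True)]

print(strict)
print(relaxed)
```

Keeping the policy as a single boolean per workload mirrors the tag on the cards: a workload either accepts provider-region traffic or it does not.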

Do you actually have over 100 models?

Yes. As of today the catalogue lists 104 models. The split is roughly three-quarters sovereign-hosted (79 models, including Llama, Qwen, DeepSeek, Mistral, Gemma, Kimi, Phi, Yi, GLM, Whisper, FLUX, Stable Diffusion, and the BGE / Jina / Nomic / Snowflake embeddings) and one-quarter non-sovereign provider-hosted models (Claude, GPT, Gemini, Grok, Perplexity Sonar, Cohere, ElevenLabs). The list updates as model providers ship.

How do I pick a model?

There are three ways. By capability, if you know the task (chat, code, reasoning, vision, embedding and so on). By family, if you've already picked a provider, which is useful for migrations from another inference platform. By use case, if you're scoping a workload such as 'chatbot', 'RAG' or 'voice agent'. The same model usually appears in multiple groupings.
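
To make the three groupings concrete, here is a minimal sketch that indexes the same cards by capability, family and use case. The card data is taken from this page, while the dictionary layout and the `build_index` helper are assumptions made for illustration only.

```python
from collections import defaultdict

# Three cards from the catalogue; field names are hypothetical.
CARDS = [
    {"name": "Kimi K2.6", "family": "Kimi",
     "capabilities": ["Text generation", "Reasoning"],
     "use_cases": ["Agents", "Coding", "Reasoning"]},
    {"name": "Whisper Large v3", "family": "Whisper",
     "capabilities": ["Speech-to-text"],
     "use_cases": ["Voice"]},
    {"name": "BGE M3", "family": "BGE",
     "capabilities": ["Embeddings"],
     "use_cases": ["RAG & retrieval"]},
]


def build_index(cards, key):
    """Group card names under every value found in card[key]."""
    index = defaultdict(list)
    for card in cards:
        value = card[key]
        # A card has one family but may carry several capabilities or use cases.
        values = value if isinstance(value, list) else [value]
        for v in values:
            index[v].append(card["name"])
    return dict(index)


by_capability = build_index(CARDS, "capabilities")
by_family = build_index(CARDS, "family")
by_use_case = build_index(CARDS, "use_cases")

# The same card can appear under several use-case groupings.
kimi_groupings = [uc for uc, names in by_use_case.items() if "Kimi K2.6" in names]
print(kimi_groupings)
```

Because each card is listed under every tag it carries, a model such as Kimi K2.6 shows up under Agents, Coding and Reasoning at the same time.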

Can I fine-tune any of these models?

All sovereign open-weight models can be fine-tuned on Amaze GPU pools. Non-sovereign frontier models follow the provider's fine-tuning policy. Talk to sales about specific weight access and deployment topology; many customers run their fine-tunes on dedicated AU GPU pools.

Bring your workload.
We'll find the model.

Talk to a solution architect about which models fit your latency, accuracy, sovereignty and cost envelope.

Contact Sales →
