200+ generative AI models

Build using open-source and specialized multimodal models for chat, images, code, and more.

Seamlessly migrate from closed-source solutions with OpenAI-compatible APIs.
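
Because the endpoints are OpenAI-compatible, migrating usually means pointing an existing OpenAI-style client at a different base URL and sending the same `/chat/completions` request body. A minimal stdlib sketch of that request shape; the base URL, the `PROVIDER_API_KEY` variable name, and the model identifier are placeholders, not confirmed values:

```python
import json
import os
import urllib.request

# Placeholder base URL -- substitute the real endpoint from your provider's docs.
API_BASE = "https://api.example.com/v1"

def build_chat_request(prompt: str,
                       model: str = "meta-llama/Llama-3.3-70B-Instruct-Turbo"):
    """Build an OpenAI-style /chat/completions request for a single-turn prompt."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            # PROVIDER_API_KEY is a placeholder environment-variable name.
            "Authorization": f"Bearer {os.environ.get('PROVIDER_API_KEY', '')}",
        },
    )

# With a real base URL and key, sending it is one call:
# reply = json.load(urllib.request.urlopen(build_chat_request("Hello!")))
```

Because the payload matches the OpenAI schema, official OpenAI client libraries typically work unchanged once their base URL is overridden.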

LLaMA-2 Chat (7B)

Chat-optimized LLM leveraging public datasets and 1M+ human annotations.

Try this model
LLaMA-2 Chat (13B)

Chat-optimized LLM leveraging public datasets and 1M+ human annotations.

Try this model
Llama 3 8B Instruct Reference

Auto-regressive LLM with optimized transformers, SFT, and RLHF for alignment with helpfulness and safety preferences.

Try this model
Llama 3 8B Instruct Lite

Auto-regressive LLM with optimized transformers, SFT, and RLHF for alignment with helpfulness and safety preferences.

Try this model
Llama 3 70B Instruct Turbo

Auto-regressive LLM with optimized transformers, SFT, and RLHF for alignment with helpfulness and safety preferences.

Try this model
Llama 3 70B Instruct Reference

Auto-regressive LLM with optimized transformers, SFT, and RLHF for alignment with helpfulness and safety preferences.

Try this model
Gryphe MythoMax L2 Lite (13B)

Experimental merge of MythoLogic-L2 and Huginn using tensor intermingling for enhanced front and end tensor integration.

Try this model
MythoMax-L2

Experimental merge of MythoLogic-L2 and Huginn using tensor intermingling for enhanced front and end tensor integration.

Try this model
Typhoon 2 8B Instruct

Instruction-tuned Thai large language model with 8 billion parameters, based on Llama3.1-8B.

Try this model
Typhoon 2 70B Instruct

Instruction-tuned Thai large language model with 70 billion parameters, based on Llama3.1-70B.

Try this model
Qwen 2

Transformer-based decoder-only LLM, pretrained on extensive data, offering improvements over the previous Qwen model.

Try this model
Nous Hermes 2 - Mixtral 8x7B-DPO

Flagship Nous Research MoE model trained on 1M+ GPT-4 and high-quality open dataset entries, excelling across diverse tasks.

Try this model
Mixtral-8x22B Instruct v0.1

Instruct fine-tuned version of Mixtral-8x22B-v0.1.

Try this model
Llama 3.1 70B

Multilingual LLM pre-trained and instruction-tuned, surpassing open and closed models on key benchmarks.

Try this model
Mistral (7B) Instruct v0.3

Instruct fine-tuned version of Mistral-7B-v0.3.

Try this model
Mistral (7B) Instruct v0.2

Improved instruct fine-tuned version of Mistral-7B-Instruct-v0.1.

Try this model
Mixtral 8x7B Instruct v0.1

Pretrained generative Sparse Mixture of Experts.

Try this model
Mistral Instruct

Instruct fine-tuned version of Mistral-7B-v0.1.

Try this model
Qwen2.5 7B Instruct Turbo

Instruction-tuned 7.61B Qwen2.5 causal LLM with 131K context, RoPE, SwiGLU, RMSNorm, and advanced attention mechanisms.

Try this model
Gemma-2 Instruct (9B)

Lightweight, SOTA open models from Google, leveraging research and tech behind the Gemini models.

Try this model
Gemma Instruct (2B)

2B instruct Gemma model by Google: lightweight, open, text-to-text LLM for QA, summarization, reasoning, and resource-efficient deployment.

Try this model
Llama 3.2 3B Instruct Turbo

Lightweight multilingual text-only LLM, instruction-tuned for dialogue, summarization, and agentic tasks.

Try this model
DBRX-Instruct

MoE LLM trained from scratch and specialized in few-turn interactions for enhanced performance.

Try this model
Gemma 3 12B

Lightweight Gemma 3 model with 128K context, vision-language input, and multilingual support for on-device AI.

Try this model
Gemma 3 4B

Lightweight Gemma 3 model (4B) with 128K context, vision-language input, and multilingual support for on-device AI.

Try this model
Gemma 3 1B

Most lightweight Gemma 3 model, with 128K context, vision-language input, and multilingual support for on-device AI.

Try this model
DeepSeek R1 Distilled Qwen 1.5B

Small Qwen 1.5B distilled with reasoning capabilities from Deepseek R1. Beats GPT-4o on MATH-500 whilst being a fraction of the size.

Try this model
DeepSeek R1 Distilled Qwen 14B

Qwen 14B distilled with reasoning capabilities from Deepseek R1. Outperforms GPT-4o in math & matches o1-mini on coding.

Try this model
DeepSeek R1 Distilled Llama 70B

Llama 70B distilled with reasoning capabilities from Deepseek R1. Surpasses GPT-4o with 94.5% on MATH-500 & matches o1-mini on coding.

Try this model
Llama 3.1 8B

Multilingual LLM pre-trained and instruction-tuned, surpassing open and closed models on key benchmarks.

Try this model
Gemma-2 Instruct (27B)

Lightweight, SOTA open models from Google, leveraging research and tech behind the Gemini models.

Try this model
Llama 3.1 405B

Multilingual LLM pre-trained and instruction-tuned, surpassing open and closed models on key benchmarks.

Try this model
Llama 3.3 70B

70B multilingual LLM, pretrained and instruction-tuned, excels in dialogue use cases, surpassing open and closed models.

Try this model
Cogito V1 Preview Llama 3B

Best-in-class open-source LLM trained with IDA for alignment, reasoning, and self-reflective, agentic applications.

Try this model
Llama 3.1 Nemotron 70B Instruct

Custom NVIDIA LLM optimized to enhance the helpfulness and relevance of generated responses to user queries.

Try this model
Cogito V1 Preview Llama 8B

Best-in-class open-source LLM trained with IDA for alignment, reasoning, and self-reflective, agentic applications.

Try this model
Mistral Small 3

24B model rivaling GPT-4o mini and larger models like Llama 3.3 70B. Ideal for chat use cases such as customer support, translation, and summarization.

Try this model
Cogito V1 Preview Qwen 14B

Best-in-class open-source LLM trained with IDA for alignment, reasoning, and self-reflective, agentic applications.

Try this model
Qwen2.5 72B

Decoder-only model built for advanced language processing tasks.

Try this model
Cogito V1 Preview Qwen 32B

Best-in-class open-source LLM trained with IDA for alignment, reasoning, and self-reflective, agentic applications.

Try this model
Qwen QwQ-32B

Qwen series reasoning model excelling in complex tasks, outperforming conventional instruction-tuned models on hard problems.

Try this model
DeepSeek-V3-0324

Mixture-of-Experts model challenging top AI models at much lower cost. Updated on March 24th, 2025.

Try this model
Cogito V1 Preview Llama 70B

Best-in-class open-source LLM trained with IDA for alignment, reasoning, and self-reflective, agentic applications.

Try this model
Qwen2.5-VL 72B Instruct

Vision-language model with advanced visual reasoning, video understanding, structured outputs, and agentic capabilities.

Try this model
Llama 3.3 70B Instruct Turbo Free

Free endpoint to try this 70B multilingual LLM optimized for dialogue, excelling in benchmarks and surpassing many chat models.

Try this model
Gemma 3 27B

Lightweight model with vision-language input, multilingual support, visual reasoning, and top-tier performance per size.

Try this model
DeepSeek R1 Distilled Llama 70B Free

Free endpoint to experience the power of reasoning models. This distilled model beats GPT-4o on math & matches o1-mini on coding.

Try this model
DeepSeek R1

Open-source reasoning model rivaling OpenAI-o1, excelling in math, code, reasoning, and cost efficiency.

Try this model
Llama 4 Scout

SOTA 109B model with 17B active params & large context, excelling at multi-document analysis, codebase reasoning, and personalized tasks.

Try this model
Llama 4 Maverick

SOTA 128-expert MoE powerhouse for multilingual image/text understanding, creative writing, and enterprise-scale applications.

Try this model
FLUX.1 Schnell [fixedres]

12 billion parameter rectified flow transformer capable of generating images from text descriptions.

Try this model
FLUX.1 [schnell]

Fastest available endpoint for the SOTA open-source image generation model by Black Forest Labs.

Try this model
FLUX.1 Redux [dev]

Adapter for FLUX.1 models enabling image variation, refining input images, and integrating into advanced restyling workflows.

Try this model
FLUX.1 Depth [dev]

12 billion parameter rectified flow transformer that generates an image from a text description while following the depth map of a given input image.

Try this model
FLUX.1 Canny [dev]

12 billion parameter rectified flow transformer that generates an image from a text description while following Canny edges extracted from a given input image.

Try this model
FLUX.1 [pro]

First generation premium image generation model by Black Forest Labs.

Try this model
FLUX.1 [dev]

12 billion parameter rectified flow transformer capable of generating images from text descriptions.

Try this model
FLUX1.1 [pro]

Advanced image generation model with FLUX.1-dev architecture, offering high-quality outputs for artistic and commercial use.

Try this model
FLUX.1 [schnell] Free

Free endpoint for the SOTA open-source image generation model by Black Forest Labs.

Try this model
Llama 3.2 11B

Multimodal LLM optimized for visual recognition, image reasoning, captioning, and answering image-related questions.

Try this model
Llama 3.2 90B

Multimodal LLM optimized for visual recognition, image reasoning, captioning, and answering image-related questions.

Try this model
Gemma 3 12B

Lightweight Gemma 3 model with 128K context, vision-language input, and multilingual support for on-device AI.

Try this model
Gemma 3 4B

Lightweight Gemma 3 model (4B) with 128K context, vision-language input, and multilingual support for on-device AI.

Try this model
Gemma 3 1B

Most lightweight Gemma 3 model, with 128K context, vision-language input, and multilingual support for on-device AI.

Try this model
Qwen2-VL-72B-Instruct

OSS vision model merging advanced vision with instruction-tuned language understanding for visual reasoning.

Try this model
Llama 3.2 11B Free

Free endpoint to test this multimodal LLM optimized for visual recognition, image reasoning, captioning, and answering image-related questions.

Try this model
Qwen2.5-VL 72B Instruct

Vision-language model with advanced visual reasoning, video understanding, structured outputs, and agentic capabilities.

Try this model
Gemma 3 27B

Lightweight model with vision-language input, multilingual support, visual reasoning, and top-tier performance per size.

Try this model
Llama 4 Scout

SOTA 109B model with 17B active params & large context, excelling at multi-document analysis, codebase reasoning, and personalized tasks.

Try this model
Llama 4 Maverick

SOTA 128-expert MoE powerhouse for multilingual image/text understanding, creative writing, and enterprise-scale applications.

Try this model
Cartesia Sonic-2

Low-latency, ultra-realistic voice model, served in partnership with Cartesia.

Try this model
LLaMA-2

LLM trained on 2T tokens with double Llama 1's context length, available in 7B, 13B, and 70B parameter sizes.

Try this model
Llama Guard 3 8B

8B Llama 3.1 model fine-tuned for content safety, moderating prompts and responses in 8 languages with MLCommons alignment.

Try this model
Llama Guard 3 11B Vision Turbo

11B Llama 3.2 model fine-tuned for content safety, detecting harmful multimodal prompts and text in image reasoning use cases.

Try this model
Llama Guard 2 8B

8B Llama 3-based safeguard model for classifying LLM inputs and outputs, detecting unsafe content and policy violations.

Try this model
Llama Guard (7B)

7B Llama 2-based safeguard model for classifying LLM inputs and outputs, detecting unsafe content and policy violations.

Try this model
Mixtral 8x7B v0.1

Pretrained generative Sparse Mixture of Experts.

Try this model
Mistral

7.3B model surpassing Llama 2 13B, nearing CodeLlama 7B on code, with GQA for speed and SWA for efficient long-sequence handling.

Try this model
Gemma 3 12B

Lightweight Gemma 3 model with 128K context, vision-language input, and multilingual support for on-device AI.

Try this model
Gemma 3 4B

Lightweight Gemma 3 model (4B) with 128K context, vision-language input, and multilingual support for on-device AI.

Try this model
Gemma 3 1B

Most lightweight Gemma 3 model, with 128K context, vision-language input, and multilingual support for on-device AI.

Try this model
Qwen 2.5 Coder 32B Instruct

SOTA code LLM with advanced code generation, reasoning, fixing, and support for up to 128K tokens.

Try this model
Cogito V1 Preview Llama 3B

Best-in-class open-source LLM trained with IDA for alignment, reasoning, and self-reflective, agentic applications.

Try this model
Cogito V1 Preview Llama 8B

Best-in-class open-source LLM trained with IDA for alignment, reasoning, and self-reflective, agentic applications.

Try this model
Cogito V1 Preview Qwen 14B

Best-in-class open-source LLM trained with IDA for alignment, reasoning, and self-reflective, agentic applications.

Try this model
Cogito V1 Preview Qwen 32B

Best-in-class open-source LLM trained with IDA for alignment, reasoning, and self-reflective, agentic applications.

Try this model
Cogito V1 Preview Llama 70B

Best-in-class open-source LLM trained with IDA for alignment, reasoning, and self-reflective, agentic applications.

Try this model
Gemma 3 27B

Lightweight model with vision-language input, multilingual support, visual reasoning, and top-tier performance per size.

Try this model
M2-BERT 80M 32K Retrieval

80M-parameter checkpoint of M2-BERT, pretrained with a sequence length of 32,768 and fine-tuned for long-context retrieval.

Try this model
BGE-Base-EN v1.5

This model maps any text to a low-dimensional dense vector using FlagEmbedding.

Try this model
M2-BERT 80M 8K Retrieval

80M-parameter checkpoint of M2-BERT, pretrained with a sequence length of 8,192 and fine-tuned for long-context retrieval.

Try this model
M2-BERT 80M 2K Retrieval

80M-parameter checkpoint of M2-BERT, pretrained with a sequence length of 2,048 and fine-tuned for long-context retrieval.

Try this model
UAE-Large v1

Universal English sentence embedding model by WhereIsAI with 1024-dim embeddings and 512 context length support.

Try this model
BGE-Large-EN v1.5

BAAI v1.5 large maps text to dense vectors for retrieval, classification, clustering, semantic search, and LLM databases.

Try this model
Salesforce LlamaRank

Salesforce Research's proprietary fine-tuned rerank model with 8K context, outperforming Cohere Rerank for superior document retrieval.

Try this model
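
Retrieval with the embedding models above follows one pattern: embed a query and candidate documents, then rank documents by cosine similarity to the query. A minimal sketch of the ranking step; the vectors shown are toy placeholders standing in for real embeddings returned by an embeddings endpoint:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank(query_vec, doc_vecs):
    """Return document indices sorted by similarity to the query, best first."""
    scores = [cosine_similarity(query_vec, v) for v in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)

# Toy 3-dimensional vectors; real models produce hundreds of dimensions.
query = [0.9, 0.1, 0.0]
docs = [[0.1, 0.9, 0.0], [0.8, 0.2, 0.1]]
print(rank(query, docs))  # the second document is closer to the query
```

A rerank model such as LlamaRank replaces this geometric scoring with a model call over (query, document) pairs, which typically improves ordering at higher cost per document.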
