A comprehensive reference of all major AI models across text, image, video, coding, and music — with specs, context windows, providers, and key strengths. Updated for 2026.
| Model | Provider | Context Window | Release | Key Strength |
|---|---|---|---|---|
| GPT-4o | OpenAI | 128K | May 2024 | Best multimodal model — vision, voice, reasoning |
| GPT-4o mini | OpenAI | 128K | Jul 2024 | Fast, affordable, strong quality-to-cost |
| o1 | OpenAI | 200K | Sep 2024 | Advanced multi-step reasoning and math |
| o3-mini | OpenAI | 200K | Feb 2025 | Efficient reasoning at low cost |
| Claude 3.5 Sonnet | Anthropic | 200K | Jun 2024 | Best coding and long-document analysis |
| Claude 3.5 Haiku | Anthropic | 200K | Nov 2024 | Fastest Anthropic model, low cost |
| Claude 3 Opus | Anthropic | 200K | Mar 2024 | Most powerful Claude 3 for complex tasks |
| Gemini 1.5 Pro | Google | 2M tokens | Apr 2024 | Largest context window, fits entire codebases |
| Gemini 2.0 Flash | Google | 1M tokens | Feb 2025 | Fast, affordable, free tier available |
| Gemini 2.5 Pro | Google | 1M tokens | May 2025 | Google's flagship reasoning model |
| Llama 3.1 405B | Meta (open) | 128K | Jul 2024 | Largest open-weight model, self-hostable |
| Mistral Large 2 | Mistral AI | 128K | Jul 2024 | Top European LLM, strong multilingual |
| Grok 2 | xAI | 131K | Aug 2024 | Real-time X/Twitter data, uncensored |

| Model | Provider | Type | Free Tier | Best For |
|---|---|---|---|---|
| DALL-E 3 | OpenAI | Diffusion | No (ChatGPT Plus) | Accurate prompt following, complex scenes |
| Midjourney v6 | Midjourney | Diffusion | No ($10+/mo) | Photorealistic & artistic quality |
| Stable Diffusion 3 | Stability AI | Diffusion | Yes (self-hosted) | Fine-tuning, local generation, custom styles |
| Flux 1.1 Pro | Black Forest Labs | Flow Matching | No (API) | Open-weight quality, photorealism |
| Ideogram 2.0 | Ideogram | Diffusion | Limited free tier | Text in images, typography |
| Adobe Firefly | Adobe | Diffusion | Limited credits | Commercially safe, stock-trained images |
| Gemini Imagen | Google | Diffusion | Via Gemini app | Integrated Google Workspace generation |
| Leonardo AI | Leonardo.ai | Diffusion | 150 credits/day | Game assets, consistent character design |

Veo 2: State-of-the-art realism, cinematic motion, clips up to 1 minute. Access via VideoFX.
Runway Gen-3 Alpha: Professional video editor integration, fine motion control, 10-second clips.
Kling AI 2.0: Videos up to 3 minutes, strong adoption in the Chinese market, competitive quality.
Pika 2.0: Easy to use, strong for short social-media clips. Includes effects and scene modifications.
Sora: High temporal consistency, clips up to 1 minute. Available to ChatGPT Pro subscribers.
Agentic terminal tool using Claude 3.5 Sonnet. Full codebase editing, shell commands, git.
VS Code fork with Claude/GPT-4o integration. Best in-editor AI coding experience.
Deeply integrated in VS Code, JetBrains, Neovim. $10/mo for individuals.
Cloud-based agentic coding tasks. Runs in sandboxed environment, handles multi-file edits.
Text-to-song with vocals, instruments, and mastering. Best for complete song generation.
High-fidelity music generation, strong genre diversity, stems download available.
Open-source music generation and audio synthesis. Free to run locally.
GPT-4o and Claude 3.5 Sonnet lead for general use. GPT-4o excels at multimodal tasks; Claude 3.5 Sonnet is best for coding and long documents with 200K context.
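Midjourney v6 produces the highest-quality images. DALL-E 3 is best for accurate prompt interpretation. Flux 1.1 Pro is a strong photorealistic option, but it is API-only; the open-weight models from Black Forest Labs are FLUX.1 [dev] and FLUX.1 [schnell].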
Gemini 2.0 Flash ($0.0001/1K input tokens, free tier) and GPT-4o mini ($0.00015/1K input tokens) offer the best quality-to-cost ratio. Llama 3.1 can be self-hosted with no per-token API fees, though you pay for the hardware it runs on.
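
To make the per-1K input prices quoted above easier to compare, here is a minimal Python sketch that converts them into a cost for a given number of input tokens. The dictionary and function names are illustrative, not part of any provider SDK, and the figures cover input tokens only (output tokens are billed separately and are not listed here). Prices change often, so treat the numbers as a snapshot.

```python
# Input prices in USD per 1K tokens, copied from the figures quoted above.
# Output-token prices are not covered by this sketch.
INPUT_PRICE_PER_1K = {
    "Gemini 2.0 Flash": 0.0001,
    "GPT-4o mini": 0.00015,
}


def input_cost_usd(model: str, input_tokens: int) -> float:
    """Return the input-token cost in USD for the given token count."""
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    return INPUT_PRICE_PER_1K[model] * input_tokens / 1000


if __name__ == "__main__":
    # Compare what one million input tokens would cost on each model.
    for model in INPUT_PRICE_PER_1K:
        cost = input_cost_usd(model, 1_000_000)
        print(f"{model}: ${cost:.2f} per 1M input tokens")
```

At these rates, one million input tokens works out to roughly $0.10 on Gemini 2.0 Flash and $0.15 on GPT-4o mini.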
Veo 2, Runway Gen-3 Alpha, Kling AI 2.0, Pika 2.0, and Sora are the leading video generation models in 2026.