Model Selection
Choose the right AI model for your agent based on capabilities, speed, and cost.
Supported Providers
ArcanFlows supports multiple AI providers, giving you flexibility to choose the best model for your use case.
OpenAI
| Model | Context | Best For | Speed | Cost |
|---|---|---|---|---|
| GPT-4 Turbo | 128K | Complex reasoning, nuanced responses | Medium | $$$$ |
| GPT-4 | 8K | High-quality responses, function calling | Medium | $$$$ |
| GPT-3.5 Turbo | 16K | General purpose, fast responses | Fast | $$ |
Anthropic
| Model | Context | Best For | Speed | Cost |
|---|---|---|---|---|
| Claude 3 Opus | 200K | Complex analysis, long documents | Slow | $$$$$ |
| Claude 3 Sonnet | 200K | Balanced performance | Medium | $$$ |
| Claude 3 Haiku | 200K | Fast responses, simple tasks | Fast | $ |
Google
| Model | Context | Best For | Speed | Cost |
|---|---|---|---|---|
| Gemini Ultra | 32K | Multimodal, complex tasks | Medium | $$$$ |
| Gemini Pro | 32K | General purpose | Fast | $$ |
Ollama (Local)
| Model | Context | Best For | Speed | Cost |
|---|---|---|---|---|
| Llama 3 70B | 8K | High quality, privacy | Varies | Free |
| Llama 3 8B | 8K | Fast, resource efficient | Fast | Free |
| Mistral 7B | 8K | Lightweight, efficient | Fast | Free |
| Mixtral 8x7B | 32K | Good balance | Medium | Free |
Choosing the Right Model
Decision Framework
| Priority | Recommended Models |
|---|---|
| Speed | GPT-3.5 Turbo, Claude 3 Haiku, Gemini Pro |
| Quality | GPT-4 Turbo, Claude 3 Opus |
| Privacy | Ollama (local models) |
By Use Case
| Use Case | Recommended Model | Why |
|---|---|---|
| Customer Support | GPT-3.5 Turbo | Fast, cost-effective, good quality |
| Legal/Medical | GPT-4 Turbo | Accuracy is critical |
| Document Analysis | Claude 3 Opus | 200K context for long docs |
| Chatbot | Claude 3 Haiku | Fast, conversational |
| Internal Tools | Ollama | Privacy, no API costs |
| Code Assistance | GPT-4 Turbo | Best code understanding |
By Budget
| Budget Level | Recommended Models |
|---|---|
| Minimal | GPT-3.5 Turbo, Claude Haiku, Ollama |
| Moderate | GPT-4, Claude Sonnet, Gemini Pro |
| Enterprise | GPT-4 Turbo, Claude Opus |
Model Parameters
Temperature
Controls randomness in responses:
| Value | Behavior | Use When |
|---|---|---|
| 0.0 | Deterministic, consistent | Facts, data retrieval |
| 0.3 | Mostly consistent | Customer support |
| 0.7 | Balanced (default) | General conversation |
| 1.0 | Creative, varied | Brainstorming, content |
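As a rough sketch, temperature is typically set alongside the model in the agent's configuration; the field names below are illustrative and may differ in your ArcanFlows setup:

```json
{
  "model": "gpt-3.5-turbo",
  "temperature": 0.3 // mostly consistent answers, e.g. customer support
}
```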
Max Tokens
Limits response length:
```javascript
// Short responses (quick answers)
max_tokens: 256

// Medium responses (explanations)
max_tokens: 1024

// Long responses (detailed analysis)
max_tokens: 4096
```
Top P (Nucleus Sampling)
Controls diversity:
| Value | Effect |
|---|---|
| 0.1 | Very focused, predictable |
| 0.5 | Moderate diversity |
| 0.9 | High diversity |
| 1.0 | Consider all options (default) |
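Taken together, a hypothetical parameter block for a support-style agent might look like the following; the field names are assumptions for illustration, not documented ArcanFlows settings:

```json
{
  "temperature": 0.3, // mostly consistent replies
  "max_tokens": 1024, // medium-length explanations
  "top_p": 0.9        // keep some diversity in wording
}
```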
Provider Configuration
OpenAI Setup
- Get your API key from platform.openai.com
- Go to Settings > AI Providers
- Click Add Provider > OpenAI
- Enter your API key
json{ "provider": "openai", "api_key": "sk-...", "organization_id": "org-..." // optional }
Anthropic Setup
- Get your API key from console.anthropic.com
- Go to Settings > AI Providers
- Click Add Provider > Anthropic
- Enter your API key
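Mirroring the OpenAI example above, the resulting provider configuration would look something like this sketch (the key value is a placeholder):

```json
{
  "provider": "anthropic",
  "api_key": "sk-ant-..." // placeholder key
}
```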
Ollama Setup
For local models:
- Install Ollama on your server
- Pull models:

```bash
ollama pull llama3
```

- Configure base URL in ArcanFlows
json{ "provider": "ollama", "base_url": "http://your-server:11434", "model": "llama3:70b" }
Cost Optimization
Tips to Reduce Costs
- Start with cheaper models
  - Use GPT-3.5 for simple tasks
  - Upgrade only when needed
- Optimize prompts
  - Shorter prompts = lower costs
  - Remove unnecessary instructions
- Limit response length
  - Set an appropriate max_tokens
  - Ask for concise responses in the prompt
- Cache responses (see the sketch after this list)
  - Enable caching for repeated queries
  - Reduces API calls
- Use local models
  - Ollama for development
  - Ollama for privacy-sensitive data
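If response caching is exposed as an agent setting in your plan, it might be configured along these lines; the `cache` block and both field names are hypothetical, shown only to illustrate the idea:

```json
{
  "cache": {
    "enabled": true,    // hypothetical field: reuse answers to repeated queries
    "ttl_seconds": 3600 // hypothetical field: how long cached responses remain valid
  }
}
```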
Cost Comparison (approximate USD per 1M tokens)
| Model | Input | Output |
|---|---|---|
| GPT-3.5 Turbo | $0.50 | $1.50 |
| GPT-4 Turbo | $10 | $30 |
| Claude 3 Haiku | $0.25 | $1.25 |
| Claude 3 Sonnet | $3 | $15 |
| Claude 3 Opus | $15 | $75 |
| Ollama | Free | Free |
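For example, an agent that handles 2M input tokens and 1M output tokens per month would cost roughly 2 × $0.50 + 1 × $1.50 = $2.50 on GPT-3.5 Turbo, versus 2 × $10 + 1 × $30 = $50 on GPT-4 Turbo.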
Switching Models
You can switch models at any time:
- Go to your agent's settings
- Select Model tab
- Choose new provider and model
- Test the agent
- Publish changes
Note: Different models may respond differently to the same prompt. Always test after switching.
Next Steps
- Knowledge Base - Add context to your agent
- Tools - Enable agent capabilities
- Deployment - Go live with your agent