General Questions

What is AppleRouter?
AppleRouter is a unified AI API gateway that provides access to multiple AI model providers (OpenAI, Anthropic Claude, Google Gemini, and others) through a single API interface. You only need one API key to access all supported models.

Which models does AppleRouter support?
AppleRouter supports models from:
  • OpenAI: GPT-4o, GPT-4 Turbo, GPT-3.5, DALL-E 3, Whisper, TTS, Sora
  • Anthropic: Claude Opus 4.5, Claude Sonnet 4, Claude 3.5 Sonnet, Claude 3 Haiku
  • Google: Gemini 2.0 Flash, Gemini 1.5 Pro, Imagen, Veo
  • Others: Kling, Jimeng, and more
See the Models page for the complete list.

Is AppleRouter compatible with the OpenAI SDK?
Yes! AppleRouter is fully compatible with the OpenAI API format. Just change the base_url to https://api.applerouter.ai/v1 and use your AppleRouter API key.
from openai import OpenAI
client = OpenAI(
    api_key="YOUR_APPLEROUTER_KEY",
    base_url="https://api.applerouter.ai/v1"
)
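The client can then be used the same way it would be against OpenAI directly, for example:
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)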

Can I use the Anthropic or Gemini API formats instead?
Yes, AppleRouter auto-detects the API format based on request headers:
  • x-api-key + anthropic-version → Anthropic Claude
  • x-goog-api-key or key query param → Google Gemini
  • Default (no provider-specific headers) → OpenAI
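As a sketch, the official Anthropic Python SDK already sends the x-api-key and anthropic-version headers, so pointing it at AppleRouter should be detected as Claude-format traffic. The exact base_url to use for Anthropic-format requests is an assumption here; check the API Reference:
import anthropic

# Assumption: Anthropic-format requests are accepted at the same host (/v1/messages)
client = anthropic.Anthropic(
    api_key="YOUR_APPLEROUTER_KEY",          # sent as the x-api-key header
    base_url="https://api.applerouter.ai",   # assumed base URL for Anthropic-format requests
)

message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=256,
    messages=[{"role": "user", "content": "Hello"}],
)
print(message.content[0].text)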

How does pricing work?
AppleRouter uses pay-as-you-go pricing based on token usage. Visit the Console to view current pricing for each model.
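Since billing is token-based, it can help to log the token counts returned with each response; with the OpenAI-compatible format these are normally reported in the response's usage field:
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)

# Token counts reported for this request (pricing is based on token usage)
usage = response.usage
print(f"prompt: {usage.prompt_tokens}, completion: {usage.completion_tokens}, total: {usage.total_tokens}")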

Error Handling

401 Unauthorized
Cause: The API key is missing, invalid, or expired.
Solutions:
  1. Verify your API key is correct
  2. Check the key hasn’t been deleted in the Console
  3. Ensure the Authorization header format is correct: Bearer YOUR_API_KEY
# Correct format
curl https://api.applerouter.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-xxx..."

429 Too Many Requests
Cause: Too many requests in a short period.
Solutions:
  1. Implement exponential backoff retry logic
  2. Reduce request frequency
  3. Contact support for higher rate limits
import time
from openai import RateLimitError

def call_with_retry(func, max_retries=5):
    """Retry func with exponential backoff when the gateway returns 429."""
    for i in range(max_retries):
        try:
            return func()
        except RateLimitError:
            # Back off 1s, 2s, 4s, 8s, 16s before the next attempt
            time.sleep(2 ** i)
    raise Exception("Max retries exceeded")
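For example, wrapping a chat completion call with it (client as configured earlier):
response = call_with_retry(
    lambda: client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello"}],
    )
)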

400 Bad Request
Cause: Request parameters are incorrect or missing.
Common issues:
  • Missing required fields (e.g., model, messages)
  • Invalid model name
  • Incorrect message format
  • Unsupported parameters for the selected model
Solution: Check your request against the API Reference.
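A minimal valid request only needs the two required fields:
response = client.chat.completions.create(
    model="gpt-4o",                                    # must be a model listed on the Models page
    messages=[{"role": "user", "content": "Hello"}],   # list of {role, content} messages
)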

5xx Server Errors
Cause: Temporary server or upstream provider issue.
Solutions:
  1. Retry after a short delay
  2. If the problem persists, check the status page for outages
  3. Consider implementing model fallback
# Try each model in order until one succeeds
models = ["gpt-4o", "claude-sonnet-4-20250514", "gemini-2.0-flash"]
for model in models:
    try:
        response = client.chat.completions.create(model=model, messages=messages)
        break
    except Exception:
        # Fall back to the next model on any error
        continue

Request Timeout
Cause: The request took too long to complete.
Solutions:
  1. Reduce max_tokens to get shorter responses
  2. Set appropriate timeout values in your client
  3. Use streaming for long responses
  4. For video generation, use async polling instead of waiting
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.applerouter.ai/v1",
    timeout=120.0  # 2 minutes
)

Usage Tips

How can I reduce token usage and cost?
  1. Be concise: Write clear, focused prompts
  2. Limit history: Only include relevant conversation context
  3. Use system prompts: Set expectations upfront to reduce back-and-forth
  4. Set max_tokens: Limit response length when appropriate
  5. Choose the right model: Use smaller models (GPT-3.5, Haiku) for simple tasks, as in the sketch after this list
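For example, a single request that combines several of these tips (a short system prompt, a capped response length, and a smaller model; the exact model ID is illustrative):
response = client.chat.completions.create(
    model="gpt-4o-mini",                      # smaller, cheaper model for a simple task
    max_tokens=150,                           # cap the response length
    messages=[
        {"role": "system", "content": "Answer in one short paragraph."},
        {"role": "user", "content": "Summarize what an API gateway does."},
    ],
)
print(response.choices[0].message.content)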

Which model should I use?
  • Complex reasoning: GPT-4o, Claude Opus 4.5, Gemini 1.5 Pro
  • Fast responses: GPT-4o-mini, Claude 3 Haiku, Gemini 2.0 Flash
  • Code generation: Claude Sonnet 4, GPT-4o
  • Creative writing: Claude Opus 4.5, GPT-4o
  • Cost-sensitive: GPT-3.5 Turbo, Claude 3 Haiku
  • Image generation: DALL-E 3, Gemini Imagen
  • Video generation: Sora, Kling, Veo

Should I use streaming?
Streaming improves perceived latency by showing responses as they’re generated.
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True
)

for chunk in stream:
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)
Some features like response_format (JSON mode) may not work with streaming on all models.

How do I manage long conversations?
  1. Summarize history: Periodically summarize older messages
  2. Sliding window: Keep only the last N messages
  3. Selective context: Only include messages relevant to the current topic
def trim_messages(messages, max_messages=10):
    """Sliding window: keep the system prompt plus the most recent messages."""
    system = [m for m in messages if m["role"] == "system"]
    others = [m for m in messages if m["role"] != "system"]
    return system + others[-max_messages:]
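Applied before each request (conversation here stands for your full message history):
response = client.chat.completions.create(
    model="gpt-4o",
    messages=trim_messages(conversation),
)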

Account & Billing

How do I check my balance and usage?
Log in to the AppleRouter Console to view:
  • Current balance
  • Usage history by model
  • API request logs

How do I add credits?
Visit the billing section in the Console to add credits to your account.

What happens if my balance runs out?
API requests will return a 402 Payment Required error. Add funds to continue using the service.
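If you want to catch this condition in code, one option is to check the status code on the SDK's generic HTTP error (a sketch assuming the OpenAI Python SDK, which surfaces less common status codes such as 402 as APIStatusError):
import openai

try:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello"}],
    )
except openai.APIStatusError as e:
    if e.status_code == 402:
        print("Balance exhausted - add funds in the Console")
    else:
        raise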

Still Need Help?