Rate Limits & Credits #
Understanding Nellie API rate limits and the credit system.
Rate Limits #
The Nellie API enforces rate limits to ensure fair usage and system stability.
Limits Overview #
| Limit Type | Value | Scope |
|---|---|---|
| Burst limit | 1 request per 6 seconds | Per API key |
| Daily limit | 15 requests per day | Per API key |
| Concurrency | Unlimited queued jobs | Per account |
Burst Rate Limit #
You can make at most 1 request every 6 seconds (no more than 10 requests per minute).
Requests that arrive faster will receive a 429 Too Many Requests response:
{
  "success": false,
  "error": "Rate limit exceeded: Please wait a few seconds between requests",
  "errorCode": "RATE_LIMIT_EXCEEDED"
}
Daily Rate Limit #
Each API key is limited to 15 requests per day. The counter resets 24 hours after your first request of the day.
When exceeded:
{
  "success": false,
  "error": "Daily rate limit exceeded (15 requests/day)",
  "errorCode": "RATE_LIMIT_EXCEEDED"
}
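Both responses carry the same errorCode, so a client that wants to react differently (pause a few seconds versus stop until the daily reset) has to inspect the message text. The following is a minimal sketch, assuming the message wording shown above stays stable; `classify_rate_limit` is a hypothetical helper, not part of the SDK.

```python
def classify_rate_limit(body: dict) -> str:
    """Return 'daily', 'burst', or 'other' for a parsed error response body."""
    if body.get("errorCode") != "RATE_LIMIT_EXCEEDED":
        return "other"
    # Assumption: the daily-limit message starts with "Daily", as in the example above.
    message = body.get("error", "")
    return "daily" if message.lower().startswith("daily") else "burst"


burst = {
    "success": False,
    "error": "Rate limit exceeded: Please wait a few seconds between requests",
    "errorCode": "RATE_LIMIT_EXCEEDED",
}
daily = {
    "success": False,
    "error": "Daily rate limit exceeded (15 requests/day)",
    "errorCode": "RATE_LIMIT_EXCEEDED",
}
print(classify_rate_limit(burst))  # burst
print(classify_rate_limit(daily))  # daily
```

Retrying makes sense only for the burst case; a daily-limit response will keep failing until the counter resets.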
What Counts Against Limits #
| Action | Counts? |
|---|---|
| POST /v1/book | ✅ Yes |
| GET /v1/status/{id} | ❌ No |
| GET /v1/configuration | ❌ No |
| GET /v1/models | ❌ No |
| GET /v1/usage | ❌ No |
Only creating new generation jobs counts against your rate limits. Polling for status and fetching configuration is unlimited.
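If you track quota on the client side, the table above reduces to a single counted endpoint. A small sketch of such a counter follows; `LocalQuota` and `COUNTED_ENDPOINTS` are illustrative names, and the server remains the source of truth for the real count.

```python
from dataclasses import dataclass

# Only job creation consumes quota, per the table above.
COUNTED_ENDPOINTS = {("POST", "/v1/book")}
DAILY_LIMIT = 15


@dataclass
class LocalQuota:
    """Client-side estimate of how many counted requests were sent."""
    used: int = 0

    def record(self, method: str, path: str) -> None:
        # Status polls and configuration reads are ignored.
        if (method.upper(), path) in COUNTED_ENDPOINTS:
            self.used += 1

    @property
    def remaining(self) -> int:
        return max(0, DAILY_LIMIT - self.used)


quota = LocalQuota()
quota.record("POST", "/v1/book")
quota.record("GET", "/v1/status/abc123")
quota.record("GET", "/v1/usage")
print(quota.remaining)  # 14
```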
Handling Rate Limits #
Basic Retry Logic #
import time
from nellie_api import Nellie, RateLimitError

def create_with_retry(prompt: str, max_retries: int = 3) -> str:
    client = Nellie()
    for attempt in range(max_retries):
        try:
            book = client.books.create(prompt=prompt)
            return book.request_id
        except RateLimitError:
            if attempt < max_retries - 1:
                # Linear backoff: wait 6s, then 12s, ...
                wait_time = 6 * (attempt + 1)
                print(f"Rate limited. Waiting {wait_time}s...")
                time.sleep(wait_time)
            else:
                raise
    raise Exception("Max retries exceeded")
Exponential Backoff #
import time
import random
from nellie_api import Nellie, RateLimitError

def create_with_backoff(prompt: str, max_retries: int = 5) -> str:
    client = Nellie()
    for attempt in range(max_retries):
        try:
            book = client.books.create(prompt=prompt)
            return book.request_id
        except RateLimitError:
            if attempt < max_retries - 1:
                # Exponential backoff with jitter
                base_wait = 6 * (2 ** attempt)
                jitter = random.uniform(0, base_wait * 0.1)
                wait_time = base_wait + jitter
                print(f"Rate limited. Waiting {wait_time:.1f}s (attempt {attempt + 1})")
                time.sleep(wait_time)
            else:
                raise
    raise Exception("Max retries exceeded")
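To see how long the backoff above can block, it helps to list the waits it produces. The sketch below recomputes the schedule without jitter; `backoff_schedule` is an illustrative helper that mirrors the formula `6 * (2 ** attempt)` and is not part of the SDK.

```python
def backoff_schedule(max_retries: int = 5, base: float = 6.0) -> list[float]:
    """Waits in seconds (without jitter) before each retry.

    The final attempt re-raises instead of sleeping, so there are
    max_retries - 1 waits.
    """
    return [base * (2 ** attempt) for attempt in range(max_retries - 1)]


schedule = backoff_schedule()
print(schedule)       # [6.0, 12.0, 24.0, 48.0]
print(sum(schedule))  # 90.0
```

With the defaults, a call that is rate limited on every attempt sleeps for 90 seconds in total before giving up, or up to 99 seconds once the 10% jitter is added.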
Rate Limiter Class #
import time
from collections import deque
from threading import Lock
from nellie_api import Nellie

class RateLimiter:
    """Sliding-window rate limiter: at most `requests_per_period` calls per period."""
    def __init__(self, requests_per_period: int = 1, period_seconds: float = 6):
        self.rate = requests_per_period
        self.period = period_seconds
        self.timestamps = deque()  # Start times of recent requests
        self.lock = Lock()

    def acquire(self):
        """Block until a request can be made."""
        while True:
            with self.lock:
                now = time.monotonic()
                # Remove timestamps that have left the window
                while self.timestamps and self.timestamps[0] <= now - self.period:
                    self.timestamps.popleft()
                # Under the limit: record this request and proceed
                if len(self.timestamps) < self.rate:
                    self.timestamps.append(now)
                    return
                # At the limit: wait for the oldest timestamp to expire
                wait_time = self.timestamps[0] + self.period - now
            # Sleep outside the lock; Lock is not reentrant, so retrying
            # while still holding it would deadlock
            time.sleep(max(wait_time, 0))

# Usage
client = Nellie()
limiter = RateLimiter(requests_per_period=1, period_seconds=6)

def create_book_limited(prompt: str):
    limiter.acquire()  # Block until allowed
    return client.books.create(prompt=prompt)
Batch Processing with Rate Limiting #
import time
from nellie_api import Nellie

def process_batch(prompts: list[str], rate_limit_seconds: float = 7) -> list[str]:
    """Process multiple prompts while respecting rate limits."""
    client = Nellie()
    request_ids = []
    for i, prompt in enumerate(prompts):
        print(f"[{i+1}/{len(prompts)}] Starting: {prompt[:30]}...")
        book = client.books.create(prompt=prompt)
        request_ids.append(book.request_id)
        # Wait before next request (except for last one)
        if i < len(prompts) - 1:
            time.sleep(rate_limit_seconds)
    return request_ids

# Usage
prompts = ["Book 1", "Book 2", "Book 3"]
ids = process_batch(prompts)
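The batch function above only spaces requests out; it does not account for the daily limit of 15, so a list longer than that will start failing partway through. One way to plan around this is to split the prompts into per-day chunks first. The helpers below are a sketch with illustrative names, not SDK functions.

```python
DAILY_LIMIT = 15  # Requests per day, per API key


def split_into_daily_batches(
    prompts: list[str], daily_limit: int = DAILY_LIMIT
) -> list[list[str]]:
    """Chunk prompts so that each chunk fits within one day's quota."""
    return [prompts[i:i + daily_limit] for i in range(0, len(prompts), daily_limit)]


def batch_duration_seconds(n: int, spacing: float = 7.0) -> float:
    """Time spent sleeping when n requests are sent `spacing` seconds apart."""
    return max(0, n - 1) * spacing


batches = split_into_daily_batches([f"Book {n}" for n in range(1, 41)])
print([len(b) for b in batches])   # [15, 15, 10]
print(batch_duration_seconds(15))  # 98.0
```

A full day's quota submitted at 7-second spacing takes under two minutes, so the daily limit rather than the burst limit is usually the binding constraint for large batches.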
Credits System #
How Credits Work #
Every book generation consumes credits from your account. Credits are pre-purchased and deducted upon successful completion.
Credit Costs #
| Model | Base Cost |
|---|---|
| Nellie 2.0 (Standard) | 250 credits |
| Nellie 3.0 (Premium) | 500 credits |
Factors Affecting Cost #
| Factor | Impact |
|---|---|
| AI Model | 2.0 = 250, 3.0 = 500 credits |
| Images enabled | Additional ~50% |
| Content length | Slight variation |
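The table above can be turned into a rough estimator. This is a sketch only: the 1.5 multiplier is taken from the approximate "~50%" figure, the model identifiers "2.0" and "3.0" follow the examples on this page, and the slight variation with content length is not modelled.

```python
BASE_COST = {"2.0": 250, "3.0": 500}  # From the Credit Costs table
IMAGE_MULTIPLIER = 1.5                # "Additional ~50%" is approximate


def estimate_credits(model: str = "2.0", images: bool = False) -> int:
    """Rough credit estimate for one book; actual charges may differ slightly."""
    cost = BASE_COST[model]
    return round(cost * IMAGE_MULTIPLIER) if images else cost


print(estimate_credits("2.0"))               # 250
print(estimate_credits("2.0", images=True))  # 375
print(estimate_credits("3.0", images=True))  # 750
```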
When Credits Are Charged #
- Credits are reserved when you start a job
- Credits are charged when the job completes successfully
- Credits are refunded if the job fails
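The reserve, charge, and refund steps above can be pictured with a small ledger. This is an illustrative model of the lifecycle as described, not the server's implementation; `CreditLedger` and its methods are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class CreditLedger:
    balance: int       # Credits owned
    reserved: int = 0  # Credits held by jobs still in progress

    @property
    def available(self) -> int:
        return self.balance - self.reserved

    def reserve(self, amount: int) -> None:
        """Hold credits when a job starts."""
        if amount > self.available:
            raise ValueError("Insufficient credits")
        self.reserved += amount

    def settle(self, amount: int, succeeded: bool) -> None:
        """Release the hold; deduct only if the job completed successfully."""
        self.reserved -= amount
        if succeeded:
            self.balance -= amount


ledger = CreditLedger(balance=1000)
ledger.reserve(500)
print(ledger.available)  # 500 while the job is running
ledger.settle(500, succeeded=False)
print(ledger.available)  # 1000, the failed job was refunded
ledger.reserve(250)
ledger.settle(250, succeeded=True)
print(ledger.balance)    # 750
```

The practical consequence is that credits held by running jobs are unavailable for new jobs until those jobs finish.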
Checking Your Balance #
from nellie_api import Nellie
client = Nellie()
# Get usage statistics
usage = client.get_usage()
print(f"Total requests: {usage.total_requests}")
print(f"Credits used: {usage.total_credits_used}")
# Calculate remaining (assuming 10,000 credit plan)
CREDIT_LIMIT = 10000
remaining = CREDIT_LIMIT - usage.total_credits_used
print(f"Credits remaining: {remaining}")
Pre-flight Credit Check #
from nellie_api import Nellie

CREDIT_LIMIT = 10000  # Assuming a 10,000 credit plan

def can_generate(model: str = "2.0", with_images: bool = False) -> bool:
    """Check if there are enough credits for a generation."""
    client = Nellie()
    # Get model costs
    models = client.get_models()
    model_cost = next(
        (m.cost_per_book for m in models if m.id == model),
        500  # Default to higher cost
    )
    # Add image cost estimate
    if with_images:
        model_cost = int(model_cost * 1.5)
    # Check available credits
    usage = client.get_usage()
    available = CREDIT_LIMIT - usage.total_credits_used
    return available >= model_cost

# Usage
client = Nellie()
if can_generate(model="3.0", with_images=True):
    book = client.books.create(
        prompt="...",
        model="3.0",
        images=True
    )
else:
    print("Insufficient credits!")
Insufficient Credits Error #
If you don't have enough credits, the job will fail:
{
  "requestId": "uuid",
  "status": "failed",
  "error": "Insufficient credits",
  "creditsUsed": 0
}
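When polling a job, this failure can be told apart from other failures by its error text. The check below is a minimal sketch against the parsed JSON body shown above; `is_insufficient_credits` is a hypothetical helper, and it assumes the error string matches exactly.

```python
def is_insufficient_credits(job: dict) -> bool:
    """True if a parsed job-status body failed for lack of credits."""
    return job.get("status") == "failed" and job.get("error") == "Insufficient credits"


failed_job = {
    "requestId": "uuid",
    "status": "failed",
    "error": "Insufficient credits",
    "creditsUsed": 0,
}
if is_insufficient_credits(failed_job):
    print("Top up credits before resubmitting; credits used:", failed_job["creditsUsed"])
```

Resubmitting without topping up will fail again, and each attempt still counts against the daily request limit.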
Monitoring Usage #
Track Credit Consumption #
from datetime import datetime, timezone
from nellie_api import Nellie

client = Nellie()

def get_daily_usage():
    """Get usage for the current UTC day."""
    usage = client.get_usage()
    today = datetime.now(timezone.utc).date()
    today_requests = [
        r for r in usage.recent_requests
        if r.created_at and
        datetime.fromisoformat(r.created_at.replace('Z', '+00:00')).date() == today
    ]
    return {
        "request_count": len(today_requests),
        "credits_used": sum(r.credits_used for r in today_requests),
        # Estimate only: the API resets 24 hours after the first request,
        # which may not line up with the UTC calendar day
        "remaining_requests": max(0, 15 - len(today_requests))
    }
Set Up Alerts #
def check_usage_alerts():
    """Check if usage thresholds are exceeded."""
    usage = client.get_usage()
    CREDIT_LIMIT = 10000
    ALERT_THRESHOLD = 0.8  # 80%
    used_percentage = usage.total_credits_used / CREDIT_LIMIT
    if used_percentage >= ALERT_THRESHOLD:
        # send_alert is a placeholder for your own notification function
        send_alert(
            title="Credit Usage Alert",
            message=f"You've used {used_percentage*100:.0f}% of your credits"
        )
    # Daily request limit
    daily = get_daily_usage()
    if daily["remaining_requests"] <= 3:
        send_alert(
            title="Daily Limit Warning",
            message=f"Only {daily['remaining_requests']} requests remaining today"
        )
Usage Dashboard #
CREDIT_LIMIT = 10000  # Assuming a 10,000 credit plan

def print_usage_report():
    """Print a formatted usage report."""
    client = Nellie()
    usage = client.get_usage()
    print("=" * 50)
    print("NELLIE API USAGE REPORT")
    print("=" * 50)
    print(f"Total Requests: {usage.total_requests}")
    print(f"Total Credits Used: {usage.total_credits_used:,}")
    print(f"Credits Remaining: {CREDIT_LIMIT - usage.total_credits_used:,}")
    print()
    print("Recent Requests:")
    print("-" * 50)
    for req in usage.recent_requests[:10]:
        status_icon = "✅" if req.status == "completed" else "❌"
        print(f" {status_icon} {req.request_id[:8]}... | {req.status:10} | {req.credits_used:3} credits")
Best Practices #
Request Spacing #
# ✅ Good: Wait between requests
for prompt in prompts:
    book = client.books.create(prompt=prompt)
    time.sleep(7)  # Slightly over 6 seconds to be safe

# ❌ Bad: Rapid-fire requests
for prompt in prompts:
    book = client.books.create(prompt=prompt)  # Will hit rate limit!
Use Webhooks #
Webhooks don't count against rate limits and are more efficient:
# ✅ Good: Use webhook instead of polling
book = client.books.create(
    prompt="...",
    webhook_url="https://myapp.com/webhook"
)
# No need to poll - webhook notifies you

# Less efficient: Repeated polling
while True:
    status = client.books.retrieve(book.request_id)  # Repeated calls
    if status.is_complete():
        break
    time.sleep(120)
Pre-check Before Submitting #
def smart_generate(prompt: str, model: str = "2.0"):
    """Check limits before generating."""
    # Check daily limit
    daily = get_daily_usage()
    if daily["remaining_requests"] <= 0:
        raise Exception("Daily request limit reached")
    # Check credits
    if not can_generate(model=model):
        raise Exception("Insufficient credits")
    # All checks passed - submit
    return client.books.create(prompt=prompt, model=model)
Queue Management #
For high-volume applications:
import queue
import threading
import time
from nellie_api import Nellie

class BookQueue:
    """Queue that respects rate limits."""
    def __init__(self, rate_limit_seconds: float = 6):
        self.queue = queue.Queue()
        self.rate_limit = rate_limit_seconds
        self.worker = threading.Thread(target=self._process, daemon=True)
        self.worker.start()

    def submit(self, prompt: str, callback):
        """Add a book request to the queue."""
        self.queue.put((prompt, callback))

    def _process(self):
        """Process queue items with rate limiting."""
        client = Nellie()
        while True:
            prompt, callback = self.queue.get()
            try:
                book = client.books.create(prompt=prompt)
                callback(book, None)
            except Exception as e:
                callback(None, e)
            finally:
                self.queue.task_done()  # Lets queue.join() know this item is finished
            time.sleep(self.rate_limit)

# Usage
book_queue = BookQueue()

def on_complete(book, error):
    if error:
        print(f"Error: {error}")
    else:
        print(f"Started: {book.request_id}")

book_queue.submit("Book 1", on_complete)
book_queue.submit("Book 2", on_complete)
book_queue.submit("Book 3", on_complete)
# The worker is a daemon thread, so wait for the queue to drain before exiting
book_queue.queue.join()
Limits FAQ #
Can I increase my rate limits? #
Contact support@buzzleco.com for enterprise plans with higher limits.
What happens to queued jobs if I hit the limit? #
Jobs that are already queued continue processing. Rate limits only affect creating new jobs.
Do failed jobs count against my daily limit? #
Yes, each POST /v1/book request counts against your daily limit, regardless of outcome.
Can I have multiple API keys? #
Yes, but all keys share the same credit pool. Each key has its own rate limits.
How do I know when my daily limit resets? #
The limit resets 24 hours after your first request of the day. Check your recent requests to estimate the reset time.
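As a rough aid, the reset time can be estimated by adding 24 hours to the timestamp of the first request in the current window. The sketch below assumes timestamps are ISO 8601 strings with a trailing Z, as in the usage examples on this page; `estimate_reset` is a hypothetical helper, and the example timestamp is made up.

```python
from datetime import datetime, timedelta


def estimate_reset(first_request_at: str) -> datetime:
    """Estimate when the daily counter resets, given the first request's timestamp."""
    # fromisoformat() on older Python versions does not accept a trailing "Z"
    started = datetime.fromisoformat(first_request_at.replace("Z", "+00:00"))
    return started + timedelta(hours=24)


reset = estimate_reset("2025-01-15T09:30:00Z")
print(reset.isoformat())  # 2025-01-16T09:30:00+00:00
```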
Related Documentation #
- Errors -- Error handling reference
- Authentication -- API key management
- Best Practices -- Recommended patterns
- FAQ -- Frequently asked questions