Rate Limits & Credits #

Understanding Nellie API rate limits and the credit system.

Rate Limits #

The Nellie API enforces rate limits to ensure fair usage and system stability.

Limits Overview #

Limit Type	Value	Scope
Burst limit	1 request per 6 seconds	Per API key
Daily limit	15 requests per day	Per API key
Concurrency	Unlimited queued jobs	Per account

Burst Rate Limit #

You can make at most 1 request every 6 seconds, which works out to a ceiling of 10 requests per minute.

Requests that arrive sooner than that receive a 429 Too Many Requests response:

{
  "success": false,
  "error": "Rate limit exceeded: Please wait a few seconds between requests",
  "errorCode": "RATE_LIMIT_EXCEEDED"
}

Daily Rate Limit #

Each API key is limited to 15 requests per day. The counter resets 24 hours after your first request of the day.

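When the daily limit is exceeded, the response carries the same errorCode as the burst limit, with a different message: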

{
  "success": false,
  "error": "Daily rate limit exceeded (15 requests/day)",
  "errorCode": "RATE_LIMIT_EXCEEDED"
}
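Because both limits share one errorCode, a client that wants to react differently (wait a few seconds versus stop for the day) has to inspect the message text. The following is a minimal sketch, assuming the message wording in the two examples above stays stable, which the API does not promise. `classify_rate_limit` is a hypothetical helper, not part of the SDK.

```python
import json
from typing import Optional

def classify_rate_limit(body: str) -> Optional[str]:
    """Return "daily", "burst", or None for a JSON error body.

    Assumption: the "error" text resembles the examples in this section.
    Treat the result as a hint, not a contract.
    """
    payload = json.loads(body)
    if payload.get("errorCode") != "RATE_LIMIT_EXCEEDED":
        return None
    message = (payload.get("error") or "").lower()
    return "daily" if "daily" in message else "burst"


burst = (
    '{"success": false, '
    '"error": "Rate limit exceeded: Please wait a few seconds between requests", '
    '"errorCode": "RATE_LIMIT_EXCEEDED"}'
)
daily = (
    '{"success": false, '
    '"error": "Daily rate limit exceeded (15 requests/day)", '
    '"errorCode": "RATE_LIMIT_EXCEEDED"}'
)

print(classify_rate_limit(burst))  # burst
print(classify_rate_limit(daily))  # daily
```

A "burst" result is worth retrying after a short pause; a "daily" result is not, since retrying cannot succeed until the counter resets.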

What Counts Against Limits #

Action	Counts?
`POST /v1/book`	✅ Yes
`GET /v1/status/{id}`	❌ No
`GET /v1/configuration`	❌ No
`GET /v1/models`	❌ No
`GET /v1/usage`	❌ No

Only creating new generation jobs counts against your rate limits. Status polling and the configuration, models, and usage endpoints do not count against them.
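If you wrap the API in your own HTTP layer, you may want to throttle only the calls that consume quota. A small sketch of that check, built from the table above; `counts_against_limits` is a hypothetical helper name.

```python
# Endpoints that consume rate limit, taken from the table in this section
RATE_LIMITED = {("POST", "/v1/book")}

def counts_against_limits(method: str, path: str) -> bool:
    """Return True if this call consumes burst and daily quota."""
    return (method.upper(), path.rstrip("/")) in RATE_LIMITED


print(counts_against_limits("POST", "/v1/book"))         # True
print(counts_against_limits("GET", "/v1/status/abc123"))  # False
```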

Handling Rate Limits #

Basic Retry Logic #

import time
from nellie_api import Nellie, RateLimitError

def create_with_retry(prompt: str, max_retries: int = 3) -> str:
    client = Nellie()
    
    for attempt in range(max_retries):
        try:
            book = client.books.create(prompt=prompt)
            return book.request_id
            
        except RateLimitError:
            if attempt < max_retries - 1:
                wait_time = 6 * (attempt + 1)
                print(f"Rate limited. Waiting {wait_time}s...")
                time.sleep(wait_time)
            else:
                raise
    
    # Reached only when max_retries < 1, so the loop body never ran
    raise ValueError("max_retries must be at least 1")

Exponential Backoff #

import time
import random
from nellie_api import Nellie, RateLimitError

def create_with_backoff(prompt: str, max_retries: int = 5) -> str:
    client = Nellie()
    
    for attempt in range(max_retries):
        try:
            book = client.books.create(prompt=prompt)
            return book.request_id
            
        except RateLimitError:
            if attempt < max_retries - 1:
                # Exponential backoff with jitter
                base_wait = 6 * (2 ** attempt)
                jitter = random.uniform(0, base_wait * 0.1)
                wait_time = base_wait + jitter
                
                print(f"Rate limited. Waiting {wait_time:.1f}s (attempt {attempt + 1})")
                time.sleep(wait_time)
            else:
                raise
    
    # Reached only when max_retries < 1, so the loop body never ran
    raise ValueError("max_retries must be at least 1")
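To see how long the backoff loop above can block, it helps to tabulate the waits without jitter. This sketch reproduces the `6 * 2 ** attempt` schedule; the `cap` parameter is an addition that the function above does not have, included only to show how a ceiling would fit in.

```python
def backoff_delays(max_retries: int = 5, base: float = 6.0, cap: float = 120.0) -> list[float]:
    """Jitter-free waits between attempts: base * 2**attempt, limited to cap."""
    # One wait sits between each pair of attempts, so there are max_retries - 1 waits
    return [min(base * (2 ** attempt), cap) for attempt in range(max_retries - 1)]


delays = backoff_delays()
print(delays)       # [6.0, 12.0, 24.0, 48.0]
print(sum(delays))  # 90.0
```

With the defaults, a call that is rate limited on every attempt blocks for about 90 seconds, plus up to 10% jitter, before the final error is raised.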

Rate Limiter Class #

import time
from threading import Lock
from collections import deque

class RateLimiter:
    """Sliding-window rate limiter (thread-safe)."""

    def __init__(self, requests_per_period: int = 1, period_seconds: float = 6):
        self.rate = requests_per_period
        self.period = period_seconds
        self.timestamps = deque()  # Monotonic times of recent requests
        self.lock = Lock()

    def acquire(self):
        """Block until a request can be made."""
        while True:
            with self.lock:
                now = time.monotonic()

                # Drop timestamps that have left the window
                while self.timestamps and self.timestamps[0] <= now - self.period:
                    self.timestamps.popleft()

                # Under the limit: record this request and proceed
                if len(self.timestamps) < self.rate:
                    self.timestamps.append(now)
                    return

                # At the limit: work out when the oldest timestamp expires
                wait_time = self.timestamps[0] + self.period - now

            # Sleep outside the lock (Lock is not reentrant), then retry
            time.sleep(max(wait_time, 0))

# Usage
from nellie_api import Nellie

client = Nellie()
limiter = RateLimiter(requests_per_period=1, period_seconds=6)

def create_book_limited(prompt: str):
    limiter.acquire()  # Block until allowed
    return client.books.create(prompt=prompt)

Batch Processing with Rate Limiting #

import time
from nellie_api import Nellie

def process_batch(prompts: list[str], rate_limit_seconds: float = 7) -> list[str]:
    """Process multiple prompts while respecting rate limits."""
    client = Nellie()
    request_ids = []
    
    for i, prompt in enumerate(prompts):
        print(f"[{i+1}/{len(prompts)}] Starting: {prompt[:30]}...")
        
        book = client.books.create(prompt=prompt)
        request_ids.append(book.request_id)
        
        # Wait before next request (except for last one)
        if i < len(prompts) - 1:
            time.sleep(rate_limit_seconds)
    
    return request_ids

# Usage
prompts = ["Book 1", "Book 2", "Book 3"]
ids = process_batch(prompts)
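Spacing requests handles the burst limit, but a batch larger than 15 prompts also runs into the daily limit. One way to plan for that is to split the work into daily chunks first. A minimal sketch; `split_into_daily_batches` is a hypothetical helper, and scheduling each chunk a day apart is left to your own job runner.

```python
def split_into_daily_batches(prompts: list[str], daily_limit: int = 15) -> list[list[str]]:
    """Split prompts into chunks that each fit within one day's request limit."""
    return [prompts[i:i + daily_limit] for i in range(0, len(prompts), daily_limit)]


batches = split_into_daily_batches([f"Book {n}" for n in range(1, 41)])
print([len(batch) for batch in batches])  # [15, 15, 10]
```

Each chunk can then be passed to a spaced submission loop such as the batch function above.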

Credits System #

How Credits Work #

Every book generation consumes credits from your account. Credits are pre-purchased and deducted upon successful completion.

Credit Costs #

Model	Base Cost
Nellie 2.0 (Standard)	250 credits
Nellie 3.0 (Premium)	500 credits

Factors Affecting Cost #

Factor	Impact
AI Model	2.0 = 250, 3.0 = 500 credits
Images enabled	Additional ~50%
Content length	Slight variation

When Credits Are Charged #

Credits are reserved when you start a job
Credits are charged when the job completes successfully
Credits are refunded if the job fails
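The reserve, charge, and refund steps above can be modelled with a toy ledger. This is a client-side illustration of the lifecycle only; how the service tracks reservations internally is not documented here, and `CreditLedger` is a hypothetical class.

```python
class CreditLedger:
    """Toy model of the reserve / charge / refund lifecycle."""

    def __init__(self, balance: int):
        self.balance = balance  # Credits not yet reserved or spent
        self.reserved = {}      # Job id -> reserved amount

    def reserve(self, job_id: str, amount: int) -> bool:
        """Set credits aside when a job starts; refuse if the balance is too low."""
        if amount > self.balance:
            return False
        self.balance -= amount
        self.reserved[job_id] = amount
        return True

    def complete(self, job_id: str) -> None:
        """Job succeeded: the reservation becomes a permanent charge."""
        self.reserved.pop(job_id)

    def fail(self, job_id: str) -> None:
        """Job failed: the reserved credits return to the balance."""
        self.balance += self.reserved.pop(job_id)


ledger = CreditLedger(balance=1000)
ledger.reserve("job-a", 250)
ledger.reserve("job-b", 500)
ledger.complete("job-a")  # 250 credits charged
ledger.fail("job-b")      # 500 credits refunded
print(ledger.balance)     # 750
```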

Checking Your Balance #

from nellie_api import Nellie

client = Nellie()

# Get usage statistics
usage = client.get_usage()
print(f"Total requests: {usage.total_requests}")
print(f"Credits used: {usage.total_credits_used}")

# Calculate remaining (assuming 10,000 credit plan)
CREDIT_LIMIT = 10000
remaining = CREDIT_LIMIT - usage.total_credits_used
print(f"Credits remaining: {remaining}")

Pre-flight Credit Check #

from nellie_api import Nellie

CREDIT_LIMIT = 10000  # Plan size; assumes the 10,000 credit plan used above

def can_generate(model: str = "2.0", with_images: bool = False) -> bool:
    """Check if there are enough credits for a generation."""
    client = Nellie()

    # Get model costs
    models = client.get_models()
    model_cost = next(
        (m.cost_per_book for m in models if m.id == model),
        500  # Unknown model: assume the higher cost
    )

    # Add image cost estimate (images add roughly 50%)
    if with_images:
        model_cost = int(model_cost * 1.5)

    # Check available credits
    usage = client.get_usage()
    available = CREDIT_LIMIT - usage.total_credits_used

    return available >= model_cost

# Usage
client = Nellie()

if can_generate(model="3.0", with_images=True):
    book = client.books.create(
        prompt="...",
        model="3.0",
        images=True
    )
else:
    print("Insufficient credits!")

Insufficient Credits Error #

If you don't have enough credits, the job will fail:

{
  "requestId": "uuid",
  "status": "failed",
  "error": "Insufficient credits",
  "creditsUsed": 0
}
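When you poll or receive a failed status, it is useful to separate a credit shortfall (top up, then resubmit) from other failures. A minimal sketch, assuming the payload shape and the "Insufficient credits" wording shown above; `is_insufficient_credits` is a hypothetical helper.

```python
import json

def is_insufficient_credits(status_body: str) -> bool:
    """Return True if a status payload reports a failure caused by low credits.

    Assumption: the "error" text resembles the example in this section.
    """
    payload = json.loads(status_body)
    error_text = (payload.get("error") or "").lower()
    return payload.get("status") == "failed" and "insufficient credits" in error_text


sample = '{"requestId": "uuid", "status": "failed", "error": "Insufficient credits", "creditsUsed": 0}'
print(is_insufficient_credits(sample))  # True
```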

Monitoring Usage #

Track Credit Consumption #

from datetime import datetime, timezone
from nellie_api import Nellie

client = Nellie()

def get_daily_usage():
    """Get usage for the current UTC calendar day."""
    usage = client.get_usage()

    today = datetime.now(timezone.utc).date()
    today_requests = [
        r for r in usage.recent_requests
        if r.created_at and
           datetime.fromisoformat(r.created_at.replace('Z', '+00:00')).date() == today
    ]

    return {
        "request_count": len(today_requests),
        "credits_used": sum(r.credits_used for r in today_requests),
        # Approximation: the real counter resets 24 hours after the first
        # request, not at midnight UTC
        "remaining_requests": max(0, 15 - len(today_requests))
    }

Set Up Alerts #

def check_usage_alerts():
    """Check if usage thresholds are exceeded."""
    # send_alert() is a placeholder for your own notification function
    usage = client.get_usage()
    
    CREDIT_LIMIT = 10000
    ALERT_THRESHOLD = 0.8  # 80%
    
    used_percentage = usage.total_credits_used / CREDIT_LIMIT
    
    if used_percentage >= ALERT_THRESHOLD:
        send_alert(
            title="Credit Usage Alert",
            message=f"You've used {used_percentage*100:.0f}% of your credits"
        )
    
    # Daily request limit
    daily = get_daily_usage()
    if daily["remaining_requests"] <= 3:
        send_alert(
            title="Daily Limit Warning",
            message=f"Only {daily['remaining_requests']} requests remaining today"
        )

Usage Dashboard #

def print_usage_report():
    """Print a formatted usage report."""
    # CREDIT_LIMIT is the plan-size constant from "Checking Your Balance"
    client = Nellie()
    usage = client.get_usage()
    
    print("=" * 50)
    print("NELLIE API USAGE REPORT")
    print("=" * 50)
    print(f"Total Requests:     {usage.total_requests}")
    print(f"Total Credits Used: {usage.total_credits_used:,}")
    print(f"Credits Remaining:  {CREDIT_LIMIT - usage.total_credits_used:,}")
    print()
    print("Recent Requests:")
    print("-" * 50)
    
    for req in usage.recent_requests[:10]:
        status_icon = "✅" if req.status == "completed" else "❌"
        print(f"  {status_icon} {req.request_id[:8]}... | {req.status:10} | {req.credits_used:3} credits")

Best Practices #

Request Spacing #

# ✅ Good: Wait between requests
for prompt in prompts:
    book = client.books.create(prompt=prompt)
    time.sleep(7)  # Slightly over 6 seconds to be safe

# ❌ Bad: Rapid-fire requests
for prompt in prompts:
    book = client.books.create(prompt=prompt)  # Will hit rate limit!

Use Webhooks #

Neither webhooks nor status polling count against rate limits, but webhooks are more efficient because you avoid repeated status calls:

# ✅ Good: Use webhook instead of polling
book = client.books.create(
    prompt="...",
    webhook_url="https://myapp.com/webhook"
)
# No need to poll - webhook notifies you

# Less efficient: Repeated polling
while True:
    status = client.books.retrieve(book.request_id)  # Repeated calls
    if status.is_complete():
        break
    time.sleep(120)
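On the receiving side, a webhook handler usually just matches the notification to a job you are waiting on. The sketch below assumes the webhook body is JSON with the same `requestId` and `status` fields as the status payload; this page does not document the webhook format, so verify the field names before relying on it. `handle_webhook` and `pending` are hypothetical names.

```python
import json
from typing import Optional

TERMINAL_STATUSES = {"completed", "failed"}

def handle_webhook(raw_body: bytes, pending: set) -> Optional[str]:
    """Mark a job as finished and return its status, or None if nothing changed.

    Assumption: the body carries "requestId" and "status" fields.
    """
    payload = json.loads(raw_body)
    request_id = payload.get("requestId")
    status = payload.get("status")
    if request_id in pending and status in TERMINAL_STATUSES:
        pending.discard(request_id)  # No longer waiting on this job
        return status
    return None


pending = {"req-1", "req-2"}
print(handle_webhook(b'{"requestId": "req-1", "status": "completed"}', pending))  # completed
print(sorted(pending))  # ['req-2']
```

Unknown request ids and non-terminal statuses are ignored, so duplicate deliveries do no harm.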

Pre-check Before Submitting #

def smart_generate(prompt: str, model: str = "2.0"):
    """Check limits before generating."""
    
    # Check daily limit
    daily = get_daily_usage()
    if daily["remaining_requests"] <= 0:
        raise Exception("Daily request limit reached")
    
    # Check credits
    if not can_generate(model=model):
        raise Exception("Insufficient credits")
    
    # All checks passed - submit
    return client.books.create(prompt=prompt, model=model)

Queue Management #

For high-volume applications:

import queue
import threading
import time
from nellie_api import Nellie

class BookQueue:
    """Queue that respects rate limits."""
    
    def __init__(self, rate_limit_seconds: float = 6):
        self.queue = queue.Queue()
        self.rate_limit = rate_limit_seconds
        self.worker = threading.Thread(target=self._process, daemon=True)
        self.worker.start()
    
    def submit(self, prompt: str, callback):
        """Add a book request to the queue."""
        self.queue.put((prompt, callback))
    
    def _process(self):
        """Process queue items with rate limiting."""
        client = Nellie()
        
        while True:
            prompt, callback = self.queue.get()
            
            try:
                book = client.books.create(prompt=prompt)
                callback(book, None)
            except Exception as e:
                callback(None, e)
            
            time.sleep(self.rate_limit)

# Usage
book_queue = BookQueue()

def on_complete(book, error):
    if error:
        print(f"Error: {error}")
    else:
        print(f"Started: {book.request_id}")

book_queue.submit("Book 1", on_complete)
book_queue.submit("Book 2", on_complete)
book_queue.submit("Book 3", on_complete)

Limits FAQ #

Can I increase my rate limits? #

Contact support@buzzleco.com for enterprise plans with higher limits.

What happens to queued jobs if I hit the limit? #

Jobs that are already queued continue processing. Rate limits only affect creating new jobs.

Do failed jobs count against my daily limit? #

Yes, each POST /v1/book request counts against your daily limit, regardless of outcome.
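Since the outcome does not matter, a local budget should count submissions at the moment they are made, not when they succeed. A minimal sketch of such a counter; `AttemptCounter` is a hypothetical class, and it conservatively counts calls that raise as well, because this page does not say whether rejected requests consume quota.

```python
class AttemptCounter:
    """Counts every submission attempt, whether or not it later succeeds."""

    def __init__(self, daily_limit: int = 15):
        self.daily_limit = daily_limit
        self.attempts = 0

    @property
    def remaining(self) -> int:
        return self.daily_limit - self.attempts

    def record(self, submit):
        """Run submit() and count it, even if it raises."""
        if self.attempts >= self.daily_limit:
            raise RuntimeError("Local daily budget exhausted")
        self.attempts += 1  # Counted before the outcome is known
        return submit()


counter = AttemptCounter(daily_limit=2)

def failing_submit():
    raise ValueError("generation failed")

try:
    counter.record(failing_submit)
except ValueError:
    pass

print(counter.remaining)  # 1
```

In real use, `submit` would be a callable such as `lambda: client.books.create(prompt=prompt)`.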

Can I have multiple API keys? #

Yes, but all keys share the same credit pool. Each key has its own rate limits.

How do I know when my daily limit resets? #

The limit resets 24 hours after your first request of the day. Check your recent requests to estimate the reset time.
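The estimate can be computed from the timestamps of your recent `POST /v1/book` requests. This sketch assumes ISO 8601 timestamps (as in the usage examples above) and reads the rule as: the first request opens a 24-hour window, and the next request after that window closes opens a new one. That reading is an interpretation of the sentence above, and `estimate_reset` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

WINDOW = timedelta(hours=24)

def estimate_reset(request_times: list[str], now: datetime) -> Optional[datetime]:
    """Return when the current daily window ends, or None if no window is open."""
    parsed = sorted(datetime.fromisoformat(t.replace("Z", "+00:00")) for t in request_times)
    window_start = None
    for t in parsed:
        # A request opens a new window if none is open or the last one expired
        if window_start is None or t >= window_start + WINDOW:
            window_start = t
    if window_start is None or now >= window_start + WINDOW:
        return None
    return window_start + WINDOW


times = ["2024-04-30T08:00:00Z", "2024-05-01T09:00:00Z", "2024-05-01T15:30:00Z"]
now = datetime(2024, 5, 1, 20, 0, tzinfo=timezone.utc)
print(estimate_reset(times, now))  # 2024-05-02 09:00:00+00:00
```

If your recent request list is truncated or missing older entries, the estimate may be later than the real reset time.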