Errors & Retries

This page covers how the Python SDK surfaces failures, how they map to HTTP responses, and how to build resilient integrations with idempotency and backoff.

Exception hierarchy

All SDK exceptions derive from ScrapeNestError, so you can catch everything with one except or handle each case precisely.

ScrapeNestError                 # base class — catch-all
├── ScrapeJobFailed             # job ran but ended with status="failed"
├── ScrapeJobTimeout            # job did not finish within the sync wait window
└── ScrapeNestAPIError          # the API returned a non-2xx HTTP response

Exception | Raised when | Useful attributes
ScrapeJobFailed | scrape_sync(...) completes with status="failed" and raise_on_failure=True (the default). | job_id, failure_reason
ScrapeJobTimeout | scrape_sync(...) does not reach a terminal status within timeout seconds. The job is still running server-side. | job_id, timeout_seconds
ScrapeNestAPIError | The API rejects the request (auth, validation, rate limit, server error). | status_code, body

from scrapenest import (
    ScrapeNestError,
    ScrapeJobFailed,
    ScrapeJobTimeout,
    ScrapeNestAPIError,
)
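
Because every exception shares one base class, the order of except clauses matters: list the specific classes first and the catch-all last. The sketch below illustrates this with stand-in classes that only mirror the documented tree, so it runs without the SDK installed; the `classify` helper is hypothetical and not part of scrapenest.

```python
# Stand-in classes mirroring the documented hierarchy (assumption: the real
# classes live in the scrapenest package and carry extra attributes).
class ScrapeNestError(Exception): ...
class ScrapeJobFailed(ScrapeNestError): ...
class ScrapeJobTimeout(ScrapeNestError): ...
class ScrapeNestAPIError(ScrapeNestError): ...


def classify(exc: Exception) -> str:
    """Route an exception the way ordered except clauses would."""
    try:
        raise exc
    except ScrapeJobTimeout:
        return "still running"
    except ScrapeNestAPIError:
        return "request rejected"
    except ScrapeNestError:
        # Catch-all goes last; listed first, it would shadow the subclasses.
        return "other SDK error"


print(classify(ScrapeJobTimeout()))  # still running
print(classify(ScrapeJobFailed()))   # other SDK error
```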

Job failure vs. API error

Two very different things can go wrong — distinguish them:

  • API error (ScrapeNestAPIError) — your request was rejected before a job ran: a bad key, an invalid payload, or a rate limit. Fix the request or back off.
  • Job failure (ScrapeJobFailed) — your request was accepted, a job ran, but it ended in failed (e.g. the target timed out or blocked us). Inspect failure_reason and consider a higher tier or a retry.
try:
    result = client.scrape_sync(job_type="stealth", target_url=url, timeout=60)
except ScrapeNestAPIError as e:
    if e.status_code == 429:
        ...  # rate limited — back off and retry
    elif e.status_code in (401, 403):
        ...  # auth/permission problem — do not retry
    elif e.status_code == 422:
        ...  # payload invalid — fix and do not retry
    else:
        raise
except ScrapeJobFailed as e:
    ...  # job ran but failed — see failure_reason

HTTP status codes

ScrapeNestAPIError.status_code mirrors the REST API:

Status | Meaning | Retry?
400 | Malformed request body (not JSON, missing fields). | No. Fix the request.
401 | API key invalid, revoked, or expired. | No.
403 | Key lacks the required scope, or org/key IP allowlist blocked the call. | No.
422 | Payload failed validation (e.g. invalid_wait_until, invalid_viewport_width). The body lists the failing fields. | No. Fix the parameters.
429 | Rate limit or quota exceeded. | Yes. Honor Retry-After / X-RateLimit-Reset.
5xx | Transient server-side error. | Yes. Use exponential backoff.

See Rate Limits & Quotas for the headers (X-RateLimit-Remaining, X-RateLimit-Reset) and tier limits.
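
The retry guidance for each status can be folded into a single predicate. This is a minimal sketch rather than an SDK feature; the name `is_retryable` is hypothetical.

```python
def is_retryable(status_code: int) -> bool:
    """Return True for the statuses this page marks as safe to retry."""
    # 429 (rate limit or quota) and any 5xx are transient.
    # Every other 4xx means the request itself must change first.
    return status_code == 429 or 500 <= status_code <= 599


for code in (400, 401, 403, 422, 429, 500, 503):
    print(code, is_retryable(code))
```

In an except ScrapeNestAPIError handler, such a check would be applied to e.status_code before deciding whether to sleep and try again.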

Common job failure reasons

When a job ends in failed, failure_reason tells you why. Common values:

failure_reason | What it means | What to do
timeout | The target did not respond within timeout_ms. | Raise timeout_ms, or set wait_until more loosely.
navigation_timeout | A browser job could not finish navigation. | Increase navigation_timeout_ms; try wait_until: "domcontentloaded".
stealth_blocked | The target blocked the request or challenge flow. | Escalate the tier (light → standard → stealth) and tune the visitor profile; see Work with protected targets.
browser_crashed | The browser engine crashed mid-job. | Retry; if persistent, contact support with the job_id.

The full diagnostic context (status code, headers, console logs) is available in the job's metadata artifact — see Observability.
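
One way to act on failure_reason in code is a lookup with a fallback for values not covered here. The mapping and the `suggest_action` helper below are hypothetical illustrations built from the common values on this page; they are not part of the SDK, and the list of reasons is not exhaustive.

```python
# Hypothetical lookup derived from the common failure reasons documented above.
SUGGESTED_ACTION = {
    "timeout": "raise timeout_ms or loosen wait_until",
    "navigation_timeout": "increase navigation_timeout_ms or use wait_until='domcontentloaded'",
    "stealth_blocked": "escalate the tier and tune the visitor profile",
    "browser_crashed": "retry; contact support with the job_id if it persists",
}


def suggest_action(failure_reason: str) -> str:
    """Map a failure_reason to a next step, with a fallback for unknown values."""
    # Only common values are documented, so unknown reasons must not raise KeyError.
    return SUGGESTED_ACTION.get(failure_reason, "inspect the job's metadata artifact")


print(suggest_action("stealth_blocked"))
print(suggest_action("something_else"))
```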

Idempotency

Network blips can make you submit the same job twice. Pass an idempotency_token and ScrapeNest returns the original job instead of creating a duplicate:

client.scrape_async(
    job_type="light",
    target_url="https://example.com/orders/42",
    idempotency_token="orders-42-2026-06-06",
)

The same request over REST:

curl -X POST "https://api.scrapenest.com/api/v1/jobs" \
  -H "X-API-Key: sn_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "job_type": "light",
    "target_url": "https://example.com/orders/42",
    "idempotency_token": "orders-42-2026-06-06"
  }'

Use a token that is stable for the logical unit of work (an order id, a date-scoped key) so retries collapse onto one job.
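
A stable token can be derived from the identifiers of the unit of work rather than from anything random or time-of-call dependent. The helper below is a hypothetical sketch (the name `idempotency_token` and its parameters are assumptions); it simply reproduces the token format used in the example above.

```python
from datetime import date


def idempotency_token(resource: str, resource_id: str, day: date) -> str:
    """Build a token that is identical for every retry of the same logical job."""
    # No timestamps or UUIDs: those change per call and would defeat deduplication.
    return f"{resource}-{resource_id}-{day.isoformat()}"


token = idempotency_token("orders", "42", date(2026, 6, 6))
print(token)  # orders-42-2026-06-06
```

Calling the helper twice with the same arguments yields the same string, which is what lets a retried submission collapse onto the original job.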

Retry strategy

A safe default for production:

import time
from scrapenest import ScrapeNestAPIError, ScrapeJobTimeout

def scrape_with_retry(client, *, attempts=4, **kwargs):
    delay = 1.0
    for attempt in range(1, attempts + 1):
        try:
            return client.scrape_sync(**kwargs)
        except ScrapeJobTimeout:
            # Job is still running: switch to async + webhooks instead of busy-waiting
            raise
        except ScrapeNestAPIError as e:
            # Only retry rate limits and server errors
            retryable = e.status_code == 429 or e.status_code >= 500
            # Re-raise here, while the exception is still active; a bare
            # raise after the except block would fail with RuntimeError
            if not retryable or attempt == attempts:
                raise
        time.sleep(delay)
        delay = min(delay * 2, 30)  # exponential backoff, capped

Guidelines:

  • Retry 429 and 5xx. Do not retry 400, 401, 403, 422 — they will fail again.
  • Always pair retries with an idempotency_token so a retry can never create a second job.
  • For 429, prefer the Retry-After / X-RateLimit-Reset value over a fixed delay.
  • If you frequently hit ScrapeJobTimeout, you are holding connections too long — move to scrape_async + webhooks.
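
To prefer a server-provided delay over a fixed one, the wait time can be computed separately from the retry loop. The sketch below is an assumption-laden illustration: `retry_delay` is a hypothetical helper, it takes response headers as a plain mapping (how the SDK exposes headers is not specified on this page), it handles only the delta-seconds form of Retry-After, and it ignores X-RateLimit-Reset because that header's format is documented elsewhere. When no usable header is present it falls back to exponential backoff with full jitter.

```python
import random
from typing import Mapping, Optional


def retry_delay(attempt: int, headers: Optional[Mapping[str, str]] = None,
                base: float = 1.0, cap: float = 30.0) -> float:
    """Seconds to wait before retry number `attempt` (1-based)."""
    # Assumption: header names arrive with this exact casing.
    retry_after = (headers or {}).get("Retry-After")
    if retry_after is not None:
        try:
            # Delta-seconds form only; the HTTP-date form falls through to backoff.
            return max(float(retry_after), 0.0)
        except ValueError:
            pass
    # Full jitter: uniform between 0 and the capped exponential ceiling.
    ceiling = min(cap, base * 2 ** (attempt - 1))
    return random.uniform(0.0, ceiling)


print(retry_delay(1, {"Retry-After": "7"}))  # 7.0
print(retry_delay(3))  # random value between 0 and 4 seconds
```

Jitter spreads retries from many clients over time so they do not all hit the API again at the same instant.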

Next steps