Errors & Retries
This page covers how the Python SDK surfaces failures, how they map to HTTP responses, and how to build resilient integrations with idempotency and backoff.
Exception hierarchy
All SDK exceptions derive from `ScrapeNestError`, so you can catch everything with a single `except` clause or handle each case individually.
ScrapeNestError # base class — catch-all
├── ScrapeJobFailed # job ran but ended with status="failed"
├── ScrapeJobTimeout # job did not finish within the sync wait window
└── ScrapeNestAPIError # the API returned a non-2xx HTTP response
| Exception | Raised when | Useful attributes |
|---|---|---|
| `ScrapeJobFailed` | `scrape_sync(...)` completes with `status="failed"` and `raise_on_failure=True` (the default). | `job_id`, `failure_reason` |
| `ScrapeJobTimeout` | `scrape_sync(...)` does not reach a terminal status within `timeout` seconds. The job is still running server-side. | `job_id`, `timeout_seconds` |
| `ScrapeNestAPIError` | The API rejects the request (auth, validation, rate limit, server error). | `status_code`, `body` |
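To make the ordering of `except` clauses concrete, here is a minimal, self-contained sketch. The classes are local stand-ins that only mirror the names and attributes listed in the table above; they are not the SDK's actual implementation, and `classify` is a hypothetical helper introduced for illustration.

```python
# Local stand-ins mirroring the documented hierarchy (assumption: the real
# SDK classes carry at least these attributes; constructors are illustrative).
class ScrapeNestError(Exception):
    """Base class: catching this catches every SDK exception."""


class ScrapeJobFailed(ScrapeNestError):
    def __init__(self, job_id, failure_reason):
        super().__init__(f"job {job_id} failed: {failure_reason}")
        self.job_id = job_id
        self.failure_reason = failure_reason


class ScrapeJobTimeout(ScrapeNestError):
    def __init__(self, job_id, timeout_seconds):
        super().__init__(f"job {job_id} not finished after {timeout_seconds}s")
        self.job_id = job_id
        self.timeout_seconds = timeout_seconds


class ScrapeNestAPIError(ScrapeNestError):
    def __init__(self, status_code, body):
        super().__init__(f"API returned HTTP {status_code}")
        self.status_code = status_code
        self.body = body


def classify(exc):
    """Return a short label for an exception, checking subclasses first."""
    try:
        raise exc
    except ScrapeJobFailed as e:
        return f"job-failed:{e.failure_reason}"
    except ScrapeJobTimeout as e:
        return f"job-timeout:{e.job_id}"
    except ScrapeNestAPIError as e:
        return f"api-error:{e.status_code}"
    except ScrapeNestError:
        # Catch-all for any SDK error not handled above.
        return "other-sdk-error"
```

The specific handlers come before the base-class handler; if `except ScrapeNestError` were listed first, it would swallow every subclass and the later clauses would never run.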
Job failure vs. API error
Two very different things can go wrong, so distinguish them:
- API error (`ScrapeNestAPIError`): your request was rejected before a job ran: a bad key, an invalid payload, or a rate limit. Fix the request or back off.
- Job failure (`ScrapeJobFailed`): your request was accepted and a job ran, but it ended in `failed` (e.g. the target timed out or blocked us). Inspect `failure_reason` and consider a higher tier or a retry.
try:
    result = client.scrape_sync(job_type="stealth", target_url=url, timeout=60)
except ScrapeNestAPIError as e:
    if e.status_code == 429:
        ...  # rate limited: back off and retry
    elif e.status_code in (401, 403):
        ...  # auth/permission problem: do not retry
    elif e.status_code == 422:
        ...  # payload invalid: fix and do not retry
    else:
        raise
except ScrapeJobFailed as e:
    ...  # job ran but failed: see failure_reason
HTTP status codes
`ScrapeNestAPIError.status_code` mirrors the REST API:
| Status | Meaning | Retry? |
|---|---|---|
| `400` | Malformed request body (not JSON, missing fields). | No. Fix the request. |
| `401` | API key invalid, revoked, or expired. | No. |
| `403` | Key lacks the required scope, or org/key IP allowlist blocked the call. | No. |
| `422` | Payload failed validation (e.g. `invalid_wait_until`, `invalid_viewport_width`). The body lists the failing fields. | No. Fix the parameters. |
| `429` | Rate limit or quota exceeded. | Yes. Honor `Retry-After` / `X-RateLimit-Reset`. |
| `5xx` | Transient server-side error. | Yes. Use exponential backoff. |
See Rate Limits & Quotas for the headers (X-RateLimit-Remaining, X-RateLimit-Reset) and tier limits.
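The retry column of the status table can be reduced to a single predicate. The sketch below is one way to encode it; `should_retry` is a hypothetical helper, not part of the SDK.

```python
def should_retry(status_code: int) -> bool:
    """Return True only for the statuses the table marks as retryable.

    429 (rate limit or quota) and any 5xx (transient server error) are
    retryable; every other status needs a fix to the request itself.
    """
    return status_code == 429 or 500 <= status_code <= 599


# The client errors from the table are never retried as-is.
non_retryable = [code for code in (400, 401, 403, 422) if not should_retry(code)]
```

Keeping the rule in one function means the retry loop and any logging or metrics code agree on what counts as transient.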
Common job failure reasons
When a job ends in `failed`, `failure_reason` tells you why. Common values:
| `failure_reason` | What it means | What to do |
|---|---|---|
| `timeout` | The target did not respond within `timeout_ms`. | Raise `timeout_ms`, or use a less strict `wait_until` condition. |
| `navigation_timeout` | A browser job could not finish navigation. | Increase `navigation_timeout_ms`; try `wait_until: "domcontentloaded"`. |
| `stealth_blocked` | The target blocked the request or challenge flow. | Escalate the tier (light → standard → stealth) and tune the visitor profile; see Work with protected targets. |
| `browser_crashed` | The browser engine crashed mid-job. | Retry; if persistent, contact support with the `job_id`. |
The full diagnostic context (status code, headers, console logs) is available in the job's metadata artifact — see Observability.
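The remediation advice for each `failure_reason` can be captured in a small lookup. This is a sketch under stated assumptions: it assumes the tier names (`light`, `standard`, `stealth`) are the values passed as `job_type`, and both `next_tier` and `suggest` are hypothetical helpers whose return labels are invented for illustration.

```python
# Tier order taken from the escalation path light -> standard -> stealth.
TIERS = ["light", "standard", "stealth"]


def next_tier(current):
    """Return the next tier up, or None when already at the top tier."""
    idx = TIERS.index(current)  # raises ValueError for an unknown tier
    return TIERS[idx + 1] if idx + 1 < len(TIERS) else None


def suggest(failure_reason, job_type):
    """Map a failure_reason to an (action, job_type) suggestion."""
    if failure_reason == "stealth_blocked":
        tier = next_tier(job_type)
        return ("escalate", tier) if tier else ("give_up", None)
    if failure_reason in ("timeout", "navigation_timeout"):
        return ("raise_timeout", job_type)
    if failure_reason == "browser_crashed":
        return ("retry", job_type)
    # Unknown reason: look at the job's metadata artifact before acting.
    return ("inspect", job_type)
```

Returning a suggestion rather than resubmitting directly leaves the decision (and the choice of new timeout values) with the caller.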
Idempotency
Network blips can make you submit the same job twice. Pass an `idempotency_token` and ScrapeNest returns the original job instead of creating a duplicate.
Use a token that is stable for the logical unit of work (an order id, a date-scoped key) so retries collapse onto one job.
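One way to get a stable token is to derive it deterministically from the values that identify the unit of work. The sketch below hashes those values; the function name `make_idempotency_token`, the 32-character length, and the example order id are all assumptions for illustration, since the accepted token format is not specified here.

```python
import hashlib
from datetime import date


def make_idempotency_token(*parts):
    """Build a deterministic token from the parts that identify one unit of work.

    The same parts always yield the same token, so a retry of the same
    logical request reuses it instead of generating a fresh one.
    """
    # Join with a separator that is unlikely to appear in the parts themselves.
    canonical = "\x1f".join(str(p) for p in parts)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:32]


# Same order and day: identical token. A different day: a different token.
token_a = make_idempotency_token("order-1042", date(2024, 5, 1))
token_b = make_idempotency_token("order-1042", date(2024, 5, 1))
token_c = make_idempotency_token("order-1042", date(2024, 5, 2))
```

The resulting string would then be passed wherever the SDK accepts `idempotency_token`. Avoid random values such as a fresh UUID per attempt, because each retry would then look like a new job.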
Retry strategy
A safe default for production:
import time
from scrapenest import ScrapeNestClient, ScrapeNestAPIError, ScrapeJobTimeout


def scrape_with_retry(client, *, attempts=4, **kwargs):
    """Call scrape_sync, retrying only 429 and 5xx API errors with capped backoff."""
    delay = 1.0
    for attempt in range(1, attempts + 1):
        try:
            return client.scrape_sync(**kwargs)
        except ScrapeNestAPIError as e:
            # Only retry rate limits and server errors
            retryable = e.status_code == 429 or e.status_code >= 500
            # Re-raise inside the handler: a bare `raise` outside an except
            # block has no active exception and fails with RuntimeError
            if not retryable or attempt == attempts:
                raise
        except ScrapeJobTimeout:
            # Job is still running: switch to async + webhooks instead of busy-waiting
            raise
        time.sleep(delay)
        delay = min(delay * 2, 30)  # exponential backoff, capped at 30 s
Guidelines:
- Retry `429` and `5xx`. Do not retry `400`, `401`, `403`, `422`; they will fail again.
- Always pair retries with an `idempotency_token` so a retry can never create a second job.
- For `429`, prefer the `Retry-After` / `X-RateLimit-Reset` value over a fixed delay.
- If you frequently hit `ScrapeJobTimeout`, you are holding connections too long; move to `scrape_async` + webhooks.
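The guideline about preferring the server's hint over a fixed delay can be sketched as a delay calculator. This is an illustration under assumptions: how the SDK exposes response headers is not covered here, so the caller is assumed to pass the raw `Retry-After` string; `parse_retry_after` and `next_delay` are hypothetical helpers, and `X-RateLimit-Reset` is left out because its format is not specified on this page.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value, now=None):
    """Convert a Retry-After value (seconds or HTTP-date) to seconds, else None."""
    if value is None:
        return None
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None  # not a usable date: fall back to backoff
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def next_delay(attempt, retry_after=None, base=1.0, cap=30.0):
    """Use the server hint when present, else capped exponential backoff."""
    hinted = parse_retry_after(retry_after)
    if hinted is not None:
        return hinted
    # attempt is 1-based: 1 -> base, 2 -> 2*base, 3 -> 4*base, ...
    return min(base * 2 ** (attempt - 1), cap)
```

In a retry loop, `time.sleep(next_delay(attempt, retry_after))` would replace the fixed doubling delay. Adding a small random jitter on top is a common refinement when many workers retry at once.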
Next steps
- Python SDK: the full client reference.
- Rate Limits & Quotas: limits, headers, and `enforcement_mode`.
- Webhooks: stop polling; get notified on completion.