# Python SDK

The official scrapenest Python SDK wraps the ScrapeNest API so you can submit jobs, wait for results, inspect artifacts, and download artifact bytes without writing HTTP plumbing. If you can write five lines of Python, you can scrape a page.

!!! note "Prefer raw HTTP?"
    Everything the SDK does maps 1:1 to the REST API. The tabbed examples on this page show the equivalent curl so you can port them to any language. See the API Reference.

## Install

```bash
pip install scrapenest-sdk
```

The package installs as `scrapenest-sdk`; the import name is `scrapenest`.

Requires Python 3.10+. The only runtime dependency is `httpx`.

## Your first scrape

Create a client with your API key, then call `scrape` to submit a job and block until it finishes:

=== "Python"
    ```python
    from scrapenest import ScrapeNestClient

    client = ScrapeNestClient(
        api_key="sn_live_...",
        base_url="https://api.scrapenest.com",
    )

    result = client.scrape(
        job_type="light",
        target_url="https://example.com",
    )

    print(result.status)          # "succeeded"
    print(result.artifact_count)  # 2
    ```

=== "curl"
    ```bash
    # 1. Submit the job
    curl -X POST "https://api.scrapenest.com/v1/jobs" \
      -H "X-API-Key: sn_live_..." \
      -H "Content-Type: application/json" \
      -d '{"job_type": "light", "target_url": "https://example.com"}'

    # 2. Poll until status is "succeeded" or "failed"
    curl "https://api.scrapenest.com/v1/jobs/JOB_ID?include_download_urls=true" \
      -H "X-API-Key: sn_live_..."
    ```

That's it. `scrape` handles submission and polling for you and raises if the job fails.

!!! note "`base_url` is the API host"
    Pass the host only - `https://api.scrapenest.com`. The SDK appends `/v1/...` itself. If you omit `base_url`, it defaults to production.
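
To illustrate why only the host is expected, here is a minimal sketch of how a client could combine a host with a versioned resource path. `build_url` is a hypothetical helper written for this page, not part of the SDK, and the SDK's real joining logic may differ.

```python
def build_url(base_url: str, path: str, version: str = "v1") -> str:
    """Join an API host and a resource path under a version prefix."""
    host = base_url.rstrip("/")    # tolerate a trailing slash on the host
    resource = path.lstrip("/")    # tolerate a leading slash on the path
    return f"{host}/{version}/{resource}"


print(build_url("https://api.scrapenest.com", "jobs"))
# https://api.scrapenest.com/v1/jobs
print(build_url("https://api.scrapenest.com/", "/jobs/JOB_ID"))
# https://api.scrapenest.com/v1/jobs/JOB_ID
```

If the host already contained `/v1`, a join like this would produce a doubled prefix, which is the kind of mistake the note guards against.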

## Configuring the client

```python
client = ScrapeNestClient(
    api_key="sn_live_...",            # required - from Console → Developer → API Keys
    base_url="https://api.scrapenest.com",  # optional; this is the default
    timeout=30.0,                     # per-request HTTP timeout in seconds
    verify=True,                      # TLS verification; keep enabled outside local dev
)
```

The client holds a connection pool. Reuse one instance for the lifetime of your process, and close it when done:

```python
client.close()

# …or use it as a context manager:
with ScrapeNestClient(api_key="sn_live_...", base_url="https://api.scrapenest.com") as client:
    result = client.scrape(job_type="light", target_url="https://example.com")
```

## Two ways to run a job

### `scrape` - submit and wait

Best for scripts and request/response workflows where you want the result inline. It submits the job, polls until completion, and returns a `ScrapeResult`.

```python
result = client.scrape(
    job_type="stealth",
    target_url="https://protected.example.com",
    timeout=60,                 # max seconds to wait (1–120, default 30)
    raise_on_failure=True,      # raise ScrapeJobFailed if the job fails (default)
)
```

`ScrapeResult` fields:

| Field | Type | Description |
| --- | --- | --- |
| `job_id` | `str` | The job identifier - use it to fetch artifacts or correlate logs. |
| `status` | `"succeeded"` \| `"failed"` | Terminal status. |
| `failure_reason` | `str \| None` | Populated when `status == "failed"` (e.g. `timeout`, `stealth_blocked`). |
| `artifact_count` | `int` | Number of artifacts produced. |
| `completed_at` | `datetime` | When the job finished. |

!!! note "Sync waits cap at 120 seconds"
    `scrape(timeout=...)` accepts up to 120 seconds. For long-running stealth jobs or high throughput, use `submit` plus webhooks instead of holding a connection open.
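
Conceptually, submit-and-wait amounts to a polling loop with a deadline. The sketch below shows that pattern in isolation, using canned statuses instead of HTTP calls. `wait_for_job`, `WaitTimeout`, and the `interval` parameter are hypothetical names for illustration; the SDK's internal loop and its polling interval are not documented on this page.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"succeeded", "failed"}


class WaitTimeout(Exception):
    """Raised when the job is still running at the deadline (hypothetical)."""


def wait_for_job(
    fetch_status: Callable[[], str],
    timeout: float = 30.0,
    interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status() until it returns a terminal status or time runs out."""
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        remaining = deadline - clock()
        if remaining <= 0:
            raise WaitTimeout(f"job still {status!r} after {timeout} seconds")
        # Never sleep past the deadline.
        sleep(min(interval, remaining))


# Demo with canned statuses instead of real API calls.
statuses = iter(["queued", "running", "succeeded"])
print(wait_for_job(lambda: next(statuses), timeout=5, interval=0))  # succeeded
```

Injecting `clock` and `sleep` keeps the loop testable without real waiting. It also makes the cost of a long synchronous wait visible: the caller is blocked for the whole duration.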

### `create_job` / `submit` - submit and move on

Best for high throughput, long jobs, or fan-out. `create_job` returns immediately with a `CreateJobResponse` carrying the `job_id`; you collect the result later by polling `jobs.get` or by handling a `job.completed` webhook. `submit` is an alias for the same behavior; it is not an async/await coroutine.

=== "Python"
    ```python
    created = client.create_job(
        job_type="light",
        target_url="https://example.com",
        tags=["batch:nightly"],
    )
    print(created.job_id, created.status)  # "...", "queued"

    # Later - fetch the full job and its artifacts:
    job = client.jobs.get(created.job_id)
    print(job.status)                        # "queued" | "running" | "succeeded" | "failed"
    for artifact in job.artifacts:
        print(artifact.artifact_type, artifact.artifact_id)
    ```

=== "curl"
    ```bash
    curl -X POST "https://api.scrapenest.com/v1/jobs" \
      -H "X-API-Key: sn_live_..." \
      -H "Content-Type: application/json" \
      -d '{"job_type": "light", "target_url": "https://example.com", "tags": ["batch:nightly"]}'

    curl "https://api.scrapenest.com/v1/jobs/JOB_ID" \
      -H "X-API-Key: sn_live_..."
    ```
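
For fan-out, the submit-and-move-on pattern usually means submitting one job per URL and remembering which `job_id` belongs to which target. The sketch below assumes only the `create_job(job_type=..., target_url=..., tags=...)` call and the `job_id` field shown above. `submit_batch`, `FakeClient`, and `FakeCreated` are hypothetical stand-ins so the example runs without network access.

```python
from dataclasses import dataclass
from typing import Iterable, Protocol


class JobSubmitter(Protocol):
    """Anything with a create_job method shaped like the one shown above."""

    def create_job(self, *, job_type: str, target_url: str, tags: list[str]): ...


def submit_batch(client: JobSubmitter, urls: Iterable[str], batch_tag: str) -> dict[str, str]:
    """Submit one job per URL and return a mapping of job_id -> target URL."""
    submitted: dict[str, str] = {}
    for url in urls:
        created = client.create_job(job_type="light", target_url=url, tags=[batch_tag])
        submitted[created.job_id] = url
    return submitted


# Stand-ins for the real client and response (hypothetical, for the demo only).
@dataclass
class FakeCreated:
    job_id: str
    status: str = "queued"


class FakeClient:
    def __init__(self) -> None:
        self.count = 0

    def create_job(self, *, job_type: str, target_url: str, tags: list[str]) -> FakeCreated:
        self.count += 1
        return FakeCreated(job_id=f"job-{self.count}")


jobs = submit_batch(FakeClient(), ["https://example.com/a", "https://example.com/b"], "batch:nightly")
print(jobs)  # {'job-1': 'https://example.com/a', 'job-2': 'https://example.com/b'}
```

With the real client, the returned mapping is what you would consult later when a webhook arrives or when you poll `jobs.get` for each `job_id`.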

## Passing job options

Any keyword argument you pass to `scrape`, `submit`, or `jobs.create` is sent straight through as a job parameter. This is how you reach the full power of the platform - rendering controls, screenshots, proxies, and built-in extraction:

```python
result = client.scrape(
    job_type="stealth",
    target_url="https://example.com/listings",
    os_name="macos",
    wait_until="networkidle",
    viewport={"width": 1920, "height": 1080},
    artifact_options={
        "include_html": True,
        "include_screenshot": True,
    },
    extraction={
        "hooks": [
            {"hook_id": "title", "type": "css", "selector": "h1"},
            {"hook_id": "prices", "type": "css", "selector": ".price", "all_matches": True},
        ]
    },
)
```

See the Job Parameters reference for every accepted option and which worker tier supports it.
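
Because job options are plain keyword arguments, they can be assembled programmatically. Here is a minimal sketch that builds the `extraction` payload from a mapping of hook IDs to CSS selectors, using only the hook fields shown in the example above. `css_hooks` and its `multi` parameter are hypothetical helpers, not SDK functions.

```python
from typing import Any


def css_hooks(selectors: dict[str, str], multi: frozenset[str] = frozenset()) -> dict[str, Any]:
    """Build an extraction payload of CSS hooks; IDs in `multi` collect all matches."""
    hooks = []
    for hook_id, selector in selectors.items():
        hook: dict[str, Any] = {"hook_id": hook_id, "type": "css", "selector": selector}
        if hook_id in multi:
            hook["all_matches"] = True
        hooks.append(hook)
    return {"hooks": hooks}


extraction = css_hooks({"title": "h1", "prices": ".price"}, multi=frozenset({"prices"}))
print(extraction)
```

The result could then be passed as `extraction=extraction` alongside the other keyword arguments to `client.scrape`.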

## Reading artifacts

A finished job produces one or more artifacts (HTML, screenshot, extracted JSON, metadata). Fetch the job to list them, then use the artifact helper to download bytes or text:

=== "Python"
    ```python
    job = client.jobs.get(result.job_id)

    html_artifact = next(a for a in job.artifacts if a.artifact_type == "html")
    html = client.artifacts.download_text(html_artifact.artifact_id)
    ```

=== "curl"
    ```bash
    # 1. Get the presigned URL
    curl "https://api.scrapenest.com/v1/artifacts/ARTIFACT_ID/download" \
      -H "X-API-Key: sn_live_..."
    # → {"download_url": "https://...", "expires_at": "..."}

    # 2. Download the bytes
    curl -L "PRESIGNED_DOWNLOAD_URL" -o result.html
    ```

!!! note "Presigned URLs are short-lived"
    Download URLs expire (default ~15 minutes). Request a fresh one each time you need the bytes; never cache the URL itself. You can also receive a ready-to-use URL on the [`artifact.ready` webhook](../webhooks/events.md).

### Artifact helper methods

```python
download = client.artifacts.get_download_url("ARTIFACT_ID", ttl_seconds=600)
content = client.artifacts.download_bytes("ARTIFACT_ID")
text = client.artifacts.download_text("ARTIFACT_ID")
```

`download_bytes` and `download_text` first request a presigned URL from the API, then fetch the artifact from object storage.
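
When picking an artifact out of `job.artifacts` before calling these helpers, note that a bare `next(...)` raises `StopIteration` if no artifact of that type exists. A minimal sketch of a safer lookup follows. `find_artifact` is a hypothetical helper, the artifacts are `SimpleNamespace` stand-ins, and only the `"html"` type string appears on this page (the others are assumed for illustration).

```python
from types import SimpleNamespace
from typing import Any, Iterable, Optional


def find_artifact(artifacts: Iterable[Any], artifact_type: str) -> Optional[Any]:
    """Return the first artifact of the given type, or None if there is none."""
    return next((a for a in artifacts if a.artifact_type == artifact_type), None)


# Stand-ins for job.artifacts (hypothetical IDs and type strings).
artifacts = [
    SimpleNamespace(artifact_type="html", artifact_id="art_1"),
    SimpleNamespace(artifact_type="metadata", artifact_id="art_2"),
]

html = find_artifact(artifacts, "html")
print(html.artifact_id if html else "no html artifact")        # art_1

shot = find_artifact(artifacts, "screenshot")
print(shot.artifact_id if shot else "no screenshot artifact")  # no screenshot artifact
```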

## Monitors

Use `client.monitors` to run a scrape on a cron cadence. Each fire creates a normal job that bills as usual; see Monitoring for plan limits and skip behavior, and Change Detection to watch a page for changes.

```python
monitor = client.monitors.create(
    name="hourly-homepage",
    cron="0 * * * *",
    timezone="Europe/Paris",
    job_type="light",
    target_url="https://example.com",
    # optional: watch for changes and alert by webhook + email
    detection={"enabled": True, "mode": "selector", "selector": ".price",
               "notify": {"email": ["alerts@acme.eu"]}},
)

for m in client.monitors.iter():
    print(m.name, m.cron, m.status)

client.monitors.pause(monitor.id)
client.monitors.resume(monitor.id)

# Run history - each fire, and whether it minted a job or was skipped.
for run in client.monitors.runs(monitor.id).items:
    print(run.fired_at, run.status, run.job_id)

client.monitors.delete(monitor.id)
```

The async client exposes the same methods under `await client.monitors.*`.
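
The `cron` value in the monitor example has five space-separated fields. Assuming ScrapeNest follows the conventional five-field cron layout (this page does not spell out the dialect), the sketch below names each field and rejects expressions with the wrong field count before you send them. `split_cron` is a hypothetical helper.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_cron(expression: str) -> dict[str, str]:
    """Label the five fields of a conventional cron expression."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 fields, got {len(parts)}: {expression!r}")
    return dict(zip(CRON_FIELDS, parts))


print(split_cron("0 * * * *"))
# {'minute': '0', 'hour': '*', 'day_of_month': '*', 'month': '*', 'day_of_week': '*'}
```

Read this way, `0 * * * *` fires at minute 0 of every hour, which matches the monitor name `hourly-homepage`.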

## Managing your account from code

Everything you can do in the console is also available in the SDK, gated by API key scopes:

```python
# Webhook endpoints (scope: webhooks.manage) - register receivers, debug deliveries
endpoint = client.webhook_endpoints.create(name="prod", url="https://hooks.your-app.com/scrapenest")
for message in client.webhook_endpoints.iter_messages(endpoint.endpoint_id):
    print(message.event_type, message.status)

# Verify inbound webhook signatures without hand-rolling the scheme:
from scrapenest.webhooks import verify_webhook
event = verify_webhook(raw_body, request_headers, secret="whsec_...")  # raises on bad signature

# API keys (scope: api_keys.manage) - least-privilege keys per consumer
key = client.api_keys.create(name="ci", scopes=["jobs.create", "jobs.read", "artifacts.read"])

# Retention (scope: retention.manage) - policy and legal holds
client.retention.create_hold(scope_type="job", scope_ref=job_id, justification="Dispute #4242")

# Org security (scope: org.manage) - IP allowlist
client.org.set_ip_allowlist(["203.0.113.0/24"])

# Audit logs (scope: audit_logs.read) - read the org audit trail, newest first
for entry in client.audit_logs.iter(severity="high"):
    print(entry.occurred_at, entry.event_type, entry.actor_type, entry.result)
```

Tokens and signing secrets are returned exactly once, at creation or rotation - store them in your secret manager immediately.
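
Before sending an IP allowlist like the one above, it can be worth validating the entries locally, since a malformed CIDR could lock out legitimate callers or be rejected. This sketch uses only the standard library's `ipaddress` module. `normalize_cidrs` is a hypothetical helper, and whether the API accepts bare addresses or requires an explicit prefix is not stated here.

```python
import ipaddress


def normalize_cidrs(entries: list[str]) -> list[str]:
    """Validate each entry as an IPv4/IPv6 network and return canonical CIDR strings."""
    normalized = []
    for entry in entries:
        # strict=True rejects entries with host bits set, e.g. "203.0.113.5/24".
        network = ipaddress.ip_network(entry, strict=True)
        normalized.append(str(network))
    return normalized


print(normalize_cidrs(["203.0.113.0/24", "198.51.100.7"]))
# ['203.0.113.0/24', '198.51.100.7/32']
```

An invalid entry raises `ValueError` before any request is made, so a typo never reaches `client.org.set_ip_allowlist`.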

## Error handling

The SDK raises typed exceptions you can catch precisely:

```python
from scrapenest import (
    ScrapeJobFailed,
    ScrapeJobTimeout,
    ScrapeNestAPIError,
)

try:
    result = client.scrape(job_type="stealth", target_url="https://example.com", timeout=60)
except ScrapeJobFailed as e:
    print("Target blocked or job failed:", e.failure_reason)
except ScrapeJobTimeout as e:
    print("Still running - collect later via webhook. job_id:", e.job_id)
except ScrapeNestAPIError as e:
    print("API rejected the request:", e.status_code, e.body)
```

See Errors & Retries for the full exception hierarchy, status-code mapping, idempotency, and retry guidance.
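
As a generic illustration of retrying transient failures, here is a minimal sketch of exponential backoff with jitter. It is not the SDK's retry policy: `retry`, `is_retryable`, and the default attempt count and delays are hypothetical, and the demo uses `ConnectionError` rather than SDK exceptions so it runs on its own.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(
    call: Callable[[], T],
    is_retryable: Callable[[Exception], bool],
    attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying retryable errors with exponential backoff and jitter."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception as exc:
            is_last = attempt == attempts - 1
            if is_last or not is_retryable(exc):
                raise
            # Full jitter: wait a random time in [0, base_delay * 2**attempt].
            sleep(random.uniform(0, base_delay * 2**attempt))
    raise ValueError("attempts must be at least 1")


# Demo: fails twice with a transient error, then succeeds.
calls = {"n": 0}


def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("temporary")
    return "ok"


print(retry(flaky, lambda e: isinstance(e, ConnectionError), sleep=lambda s: None))  # ok
```

With the SDK, `is_retryable` might inspect `ScrapeNestAPIError.status_code`. Which codes are safe to retry is defined by the Errors & Retries page, not by this sketch.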

## Next steps

- Job Parameters - every option you can pass.
- Guides - copy-paste recipes for common tasks.
- Webhooks - the recommended pattern for `submit` at scale.
- Errors & Retries - handle failures and rate limits cleanly.
