Webhooks end-to-end

Goal: stop polling. Submit jobs with scrape_async, and let ScrapeNest call your server when each job finishes — then download the artifact.

This guide stitches the webhooks reference into one runnable flow: register an endpoint → submit jobs → verify the signature → download the result.

1. Register an endpoint

Create a webhook endpoint subscribed to the events you care about. Save the signing secret (whsec_…) right away: it is shown only once, at creation.

import httpx

resp = httpx.post(
    "https://api.scrapenest.com/api/v1/webhook-endpoints",
    headers={"X-API-Key": "sn_live_..."},
    json={
        "url": "https://api.yoursite.com/webhooks/scrapenest",
        "enabled_events": ["job.completed", "artifact.ready"],
    },
)
resp.raise_for_status()          # fail loudly if the endpoint was not created
secret = resp.json()["secret"]   # starts with whsec_; store securely, shown once

The same request with curl:

curl -X POST "https://api.scrapenest.com/api/v1/webhook-endpoints" \
  -H "X-API-Key: sn_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://api.yoursite.com/webhooks/scrapenest",
    "enabled_events": ["job.completed", "artifact.ready"]
  }'

2. Submit jobs asynchronously

Fire jobs and move on — no waiting:

from scrapenest import ScrapeNestClient

client = ScrapeNestClient(api_key="sn_live_...", base_url="https://api.scrapenest.com")

client.scrape_async(
    job_type="standard",
    target_url="https://example.com/app",
    wait_until="networkidle",
    artifact_options={"include_html": True, "include_screenshot": True},
    tags=["batch:nightly"],
)
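
For a batch, the same call can be wrapped in a loop. The sketch below is only an illustration: `submit_batch` is a hypothetical helper, not part of the SDK. It assumes `scrape_async` accepts the arguments shown above and may raise on failure; the return value and exception types are not documented in this guide, so both are collected as-is.

```python
def submit_batch(client, urls, tag="batch:nightly"):
    """Submit one async job per URL; return (submitted, failed) lists."""
    submitted, failed = [], []
    for url in urls:
        try:
            # Same arguments as the single-job example above.
            result = client.scrape_async(
                job_type="standard",
                target_url=url,
                wait_until="networkidle",
                artifact_options={"include_html": True, "include_screenshot": True},
                tags=[tag],
            )
            submitted.append((url, result))
        except Exception as exc:  # assumption: the client signals failure by raising
            failed.append((url, exc))
    return submitted, failed
```

One failed submission does not abort the rest of the batch, and the `failed` list can be retried or logged afterwards.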

3. Receive and verify the webhook

Your endpoint must verify the signature before trusting the payload. Use the raw request body and a constant-time comparison.

import hmac, hashlib, base64, time, json
from flask import Flask, request

app = Flask(__name__)
SECRET = "whsec_..."   # the signing secret from step 1

@app.route("/webhooks/scrapenest", methods=["POST"])
def handle():
    svix_id = request.headers["Svix-Id"]
    svix_ts = request.headers["Svix-Timestamp"]
    svix_sig = request.headers["Svix-Signature"]
    payload = request.get_data(as_text=True)   # RAW body — do not use request.json

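    # Reject replays (and malformed timestamps)
    try:
        age = abs(int(time.time()) - int(svix_ts))
    except ValueError:
        return "bad timestamp", 400
    if age > 300:
        return "expired", 400

    signed = f"{svix_id}.{svix_ts}.{payload}"
    # Strip only the "whsec_" prefix; the remainder is the base64-encoded key
    secret_bytes = base64.b64decode(SECRET.split("_", 1)[1])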
    expected = "v1," + base64.b64encode(
        hmac.new(secret_bytes, signed.encode(), hashlib.sha256).digest()
    ).decode()

    if not any(hmac.compare_digest(expected, s) for s in svix_sig.split(" ")):
        return "bad signature", 400

    event = json.loads(payload)

    # Acknowledge immediately, then process asynchronously
    enqueue(event)
    return "", 202

Full language examples and the signing scheme are in Verifying Signatures.
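
It can help to keep the check in a framework-free function so it can be unit-tested without a running server. The following is a minimal sketch of the scheme used in the handler above (HMAC-SHA256 over `id.timestamp.body`, base64-encoded, `v1,` prefix). `verify_signature` and its parameters are illustrative names, not part of any SDK.

```python
import base64
import hashlib
import hmac
import time


def verify_signature(secret, msg_id, timestamp, body, signature_header,
                     tolerance=300, now=None):
    """Return True if any v1 signature in the header matches the raw body (bytes)."""
    now = time.time() if now is None else now
    try:
        ts = int(timestamp)
    except (TypeError, ValueError):
        return False
    if abs(now - ts) > tolerance:
        return False  # too old, or too far in the future

    # The key is the base64 part of the secret, after the "whsec_" prefix.
    prefix = "whsec_"
    encoded_key = secret[len(prefix):] if secret.startswith(prefix) else secret
    key = base64.b64decode(encoded_key)

    signed = f"{msg_id}.{timestamp}.".encode() + body
    expected = base64.b64encode(hmac.new(key, signed, hashlib.sha256).digest())

    # The header may carry several space-separated signatures.
    for candidate in signature_header.split(" "):
        version, _, value = candidate.partition(",")
        if version == "v1" and hmac.compare_digest(expected, value.encode()):
            return True
    return False
```

In the Flask handler, this would be called with `request.get_data()` (bytes) and the three `Svix-*` header values, returning 400 when it yields False.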

4. Act on the event

The artifact.ready event already carries a ready-to-use download_url, so you can fetch the bytes with no extra API call:

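import httpx

# save_screenshot and alert are your own helpers.
def enqueue(event):
    # In production, push the event onto a queue and run this logic in a worker,
    # so the webhook handler can return 202 right away.
    if event["event_type"] == "artifact.ready":
        data = event["data"]
        if data["artifact_type"] == "screenshot":
            resp = httpx.get(data["download_url"])
            resp.raise_for_status()   # don't save an error page as a screenshot
            save_screenshot(data["job_id"], resp.content)
    elif event["event_type"] == "job.completed":
        data = event["data"]
        if data["status"] == "failed":
            alert(f"Job {data['job_id']} failed: {data.get('failure_reason')}")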

See the Events Reference for every event and payload shape.
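
To keep the request handler fast, the dispatch logic can run on a background worker rather than inside the request. Below is a minimal in-process sketch using only the standard library; `events`, `enqueue`, `worker`, and `process` are illustrative names. An in-memory queue loses pending events on restart, so a durable queue is likely a better fit for production.

```python
import queue
import threading

events = queue.Queue()


def enqueue(event):
    """Called from the request handler: no network or disk I/O."""
    events.put(event)


def worker(process, stop):
    """Drain the queue, calling process(event) for each item until stop is set."""
    while not stop.is_set():
        try:
            event = events.get(timeout=0.1)
        except queue.Empty:
            continue
        try:
            process(event)
        except Exception as exc:  # keep the worker alive on a bad event
            print(f"failed to process event: {exc!r}")
        finally:
            events.task_done()
```

A worker would be started once at boot, for example `threading.Thread(target=worker, args=(handle_event, threading.Event()), daemon=True).start()`, where `handle_event` is a hypothetical function holding your per-event logic.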

Production checklist

  • Acknowledge fast. Return 202 within 10 seconds; do heavy work (downloads, DB writes) on a queue. See delivery semantics.
  • Be idempotent. Deduplicate on event_id — delivery is at-least-once.
  • Allowlist our IPs at your firewall (see overview).
  • Rotate the secret periodically via POST /api/v1/webhook-endpoints/{id}/rotate-secret.
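
The idempotency item can be sketched as a small "seen" set with expiry. This is an in-memory illustration only: `SeenEvents` is a hypothetical helper, the 24-hour retention is an arbitrary choice, and where `event_id` sits in the payload is an assumption to check against the Events Reference. With several server processes, a shared store with a unique constraint on the ID would be needed instead.

```python
import time


class SeenEvents:
    """Remember event IDs for ttl seconds so redeliveries can be skipped."""

    def __init__(self, ttl=24 * 3600):
        self.ttl = ttl
        self._expires = {}  # event_id -> expiry timestamp

    def first_time(self, event_id, now=None):
        """Return True the first time an ID is seen within the TTL window."""
        now = time.time() if now is None else now
        # Drop expired entries so the map does not grow without bound.
        self._expires = {k: v for k, v in self._expires.items() if v > now}
        if event_id in self._expires:
            return False
        self._expires[event_id] = now + self.ttl
        return True
```

A handler would call `first_time(...)` with the event's ID after verifying the signature and acknowledge duplicates without enqueueing them again.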

See also