Scrape a JavaScript SPA¶

Goal: scrape a single-page app where content is rendered client-side after the initial HTML loads.

A light (HTTP) job only sees the empty shell of a SPA. Use a browser tier (standard or stealth) and tell ScrapeNest when the page is ready with wait_until, then optionally drive it with actions.

Minimal example¶

Wait until the network goes idle (all XHR/fetch finished), then capture the rendered HTML:

Pythoncurl

from scrapenest import ScrapeNestClient

client = ScrapeNestClient(api_key="sn_live_...", base_url="https://api.scrapenest.com")

result = client.scrape(
    job_type="standard",
    target_url="https://example.com/app",
    wait_until="networkidle",
)

curl -X POST "https://api.scrapenest.com/v1/jobs" \
  -H "X-API-Key: sn_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "job_type": "standard",
    "target_url": "https://example.com/app",
    "wait_until": "networkidle"
  }'

Choosing `wait_until`¶

Value	Ready when	Use for
`commit`	The server response starts.	Fastest; rarely enough for SPAs.
`domcontentloaded`	Initial HTML parsed.	Static/server-rendered pages.
`load`	All initial resources loaded (default).	Most pages.
`networkidle`	No network activity for a short window.	SPAs and lazy-loaded content.

If networkidle never settles (e.g. polling/websockets keep the network busy), fall back to load and add an explicit wait action.

Interacting before capture¶

Use actions to dismiss banners, expand sections, search, or click through steps before the page is captured:

Pythoncurl

client.scrape(
    job_type="standard",
    target_url="https://example.com/app",
    wait_until="networkidle",
    actions=[
        {"type": "click", "selector": "#accept-cookies"},
        {"type": "fill", "selector": "input[name=q]", "value": "widgets"},
        {"type": "click", "selector": "button[type=submit]"},
        {"type": "wait", "timeout_ms": 1500},
        {"type": "scroll", "direction": "down"},
    ],
    artifact_options={"include_html": True, "include_screenshot": True},
)

curl -X POST "https://api.scrapenest.com/v1/jobs" \
  -H "X-API-Key: sn_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "job_type": "standard",
    "target_url": "https://example.com/app",
    "wait_until": "networkidle",
    "actions": [
      {"type": "click", "selector": "#accept-cookies"},
      {"type": "fill", "selector": "input[name=q]", "value": "widgets"},
      {"type": "click", "selector": "button[type=submit]"},
      {"type": "wait", "timeout_ms": 1500},
      {"type": "scroll", "direction": "down"}
    ],
    "artifact_options": {"include_html": true, "include_screenshot": true}
  }'

Extract directly¶

Combine rendering with extraction hooks to get JSON straight out of a rendered SPA:

client.scrape(
    job_type="standard",
    target_url="https://example.com/app",
    wait_until="networkidle",
    artifact_options={"include_extraction": True},
    extraction={"hooks": [{"hook_id": "items", "type": "css", "selector": ".result", "all_matches": True}]},
)

Troubleshooting¶

Content still missing? The data may load only after interaction - add a click/scroll action and a short wait.
navigation_timeout? Raise navigation_timeout_ms, or loosen wait_until from networkidle to load.
Blocked or challenged? Escalate to stealth - see Work with protected targets.