Scrape a JavaScript SPA¶
Goal: scrape a single-page app where content is rendered client-side after the initial HTML loads.
A light (HTTP) job only sees the empty shell of a SPA. Use a browser tier (standard or stealth) and tell ScrapeNest when the page is ready with wait_until, then optionally drive it with actions.
Minimal example¶
Wait until the network goes idle (all XHR/fetch finished), then capture the rendered HTML:
Choosing wait_until¶
| Value | Ready when | Use for |
|---|---|---|
commit |
The server response starts. | Fastest; rarely enough for SPAs. |
domcontentloaded |
Initial HTML parsed. | Static/server-rendered pages. |
load |
All initial resources loaded (default). | Most pages. |
networkidle |
No network activity for a short window. | SPAs and lazy-loaded content. |
If networkidle never settles (e.g. polling/websockets keep the network busy), fall back to load and add an explicit wait action.
Interacting before capture¶
Use actions to dismiss banners, expand sections, search, or click through steps before the page is captured:
client.scrape_sync(
job_type="standard",
target_url="https://example.com/app",
wait_until="networkidle",
actions=[
{"type": "click", "selector": "#accept-cookies"},
{"type": "fill", "selector": "input[name=q]", "value": "widgets"},
{"type": "click", "selector": "button[type=submit]"},
{"type": "wait", "timeout_ms": 1500},
{"type": "scroll", "direction": "down"},
],
artifact_options={"include_html": True, "include_screenshot": True},
)
curl -X POST "https://api.scrapenest.com/api/v1/jobs" \
-H "X-API-Key: sn_live_..." \
-H "Content-Type: application/json" \
-d '{
"job_type": "standard",
"target_url": "https://example.com/app",
"wait_until": "networkidle",
"actions": [
{"type": "click", "selector": "#accept-cookies"},
{"type": "fill", "selector": "input[name=q]", "value": "widgets"},
{"type": "click", "selector": "button[type=submit]"},
{"type": "wait", "timeout_ms": 1500},
{"type": "scroll", "direction": "down"}
],
"artifact_options": {"include_html": true, "include_screenshot": true}
}'
Extract directly¶
Combine rendering with extraction hooks to get JSON straight out of a rendered SPA:
client.scrape_sync(
job_type="standard",
target_url="https://example.com/app",
wait_until="networkidle",
artifact_options={"include_extraction": True},
extraction={"hooks": [{"hook_id": "items", "type": "css", "selector": ".result", "all_matches": True}]},
)
Troubleshooting¶
- Content still missing? The data may load only after interaction — add a
click/scrollaction and a shortwait. navigation_timeout? Raisenavigation_timeout_ms, or loosenwait_untilfromnetworkidletoload.- Blocked or challenged? Escalate to
stealth— see Work with protected targets.
See also¶
- Job Parameters → Browser options —
wait_until,actions,viewport. - Capture screenshots · Extract structured data