Scheduled Jobs

Run a scrape on a recurring cron schedule. ScrapeNest fires the schedule, creates a normal job each time, and records the outcome so you have a full run history.

Scheduling is built on Temporal-native schedules, so each run fires exactly once per scheduled window with predictable catch-up behavior, rather than on a best-effort timer.

How billing works

A scheduled run is a normal job. It consumes credits by engine weight (Light 1, Standard 5, Stealth 30) exactly like a manual submission, and it is billed only on a successful, delivered result. There is no separate charge for scheduling itself.
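As a rough budgeting aid, the weights above can be turned into an upper bound on what a schedule might consume. This is a minimal sketch, not part of the SDK: the helper name `max_credits` is hypothetical, and the lowercase `"standard"` and `"stealth"` keys are assumed by analogy with the `"light"` job type used elsewhere on this page.

```python
# Engine weights as listed in the billing section above.
# Only "light" appears as a job_type on this page; the other keys are assumed.
ENGINE_CREDITS = {"light": 1, "standard": 5, "stealth": 30}


def max_credits(job_type: str, runs: int) -> int:
    """Upper bound on credits for `runs` fires of one schedule.

    Actual usage can be lower: only successful, delivered runs are
    billed, and skipped runs cost nothing.
    """
    return ENGINE_CREDITS[job_type] * runs


# An hourly Light schedule fires 24 * 30 = 720 times in a 30-day month.
hourly_light = max_credits("light", 24 * 30)

# A Stealth schedule every 15 minutes fires 4 * 24 * 30 = 2880 times.
quarter_hourly_stealth = max_credits("stealth", 4 * 24 * 30)

print(hourly_light, quarter_hourly_stealth)
```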

If a run cannot be created because you are out of credits, or because the engine is no longer included in your plan, the run is skipped - it costs nothing and appears in the run history as skipped_quota or skipped_not_allowed. You can subscribe to the schedule.run_skipped webhook to be notified.

Plan limits

Scheduling is a paid capability. Your plan controls whether you can schedule, how many schedules you can keep, and the minimum interval between runs:

Plan        Scheduling    Max schedules  Minimum interval
Free        Not included  -              -
Starter     Included      5              1 hour
Pro         Included      25             15 minutes
Business    Included      100            5 minutes
Enterprise  Included      Unlimited      60 seconds

The minimum interval is a guardrail: a schedule that fires more often than your plan allows is rejected at creation time, not silently throttled. The engine you schedule must also be included in your plan (for example, Stealth requires Pro or higher).
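Because an over-frequent schedule is rejected at creation time, it can be convenient to check a planned interval locally before calling the API. The sketch below simply transcribes the plan table; `PLAN_LIMITS`, `check_schedule`, and the lowercase plan keys are hypothetical names for illustration, and the server remains the source of truth.

```python
from datetime import timedelta

# Transcribed from the plan table above. A value of None for a plan means
# scheduling is not included; a max_schedules of None means unlimited.
PLAN_LIMITS = {
    "free": None,
    "starter": {"max_schedules": 5, "min_interval": timedelta(hours=1)},
    "pro": {"max_schedules": 25, "min_interval": timedelta(minutes=15)},
    "business": {"max_schedules": 100, "min_interval": timedelta(minutes=5)},
    "enterprise": {"max_schedules": None, "min_interval": timedelta(seconds=60)},
}


def check_schedule(plan: str, interval: timedelta, existing_schedules: int) -> list:
    """Return the reasons a new schedule would not fit the plan (empty if it fits)."""
    limits = PLAN_LIMITS[plan]
    if limits is None:
        return ["scheduling is not included in this plan"]
    problems = []
    if interval < limits["min_interval"]:
        problems.append(
            f"interval {interval} is shorter than the plan minimum {limits['min_interval']}"
        )
    cap = limits["max_schedules"]
    if cap is not None and existing_schedules >= cap:
        problems.append(f"plan allows at most {cap} schedules")
    return problems


# A 15-minute schedule is too frequent for Starter but fits Pro.
print(check_schedule("starter", timedelta(minutes=15), existing_schedules=0))
print(check_schedule("pro", timedelta(minutes=15), existing_schedules=0))
```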

Create a schedule

from scrapenest import ScrapeNestClient

client = ScrapeNestClient(api_key="sn_live_...", base_url="https://api.scrapenest.com")

schedule = client.schedules.create(
    name="hourly-homepage",
    cron="0 * * * *",            # top of every hour
    timezone="Europe/Paris",     # IANA timezone
    job_type="light",
    target_url="https://example.com",
)
print(schedule.id, schedule.status, schedule.next_run_at)

The equivalent request with curl:

curl -X POST "https://api.scrapenest.com/api/v1/schedules" \
  -H "X-API-Key: sn_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "name": "hourly-homepage",
    "cron": "0 * * * *",
    "timezone": "Europe/Paris",
    "job_type": "light",
    "target_url": "https://example.com"
  }'

Cron and timezone

cron is a standard 5-field expression (minute hour day-of-month month day-of-week). It is evaluated in the timezone you provide (any IANA name, e.g. Europe/Paris or UTC), including daylight-saving transitions. A few examples:

Cron          Meaning
0 * * * *     Every hour, on the hour
*/15 * * * *  Every 15 minutes
0 6 * * *     Every day at 06:00
0 8 * * 1     Every Monday at 08:00
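To sanity-check an expression before creating a schedule, a small local evaluator can help. The following is a simplified sketch using only the Python standard library, not ScrapeNest's or Temporal's actual evaluator: it supports `*`, `*/n`, single values, ranges, and comma lists, requires all five fields to match, and does not try to reproduce how the service treats wall-clock times that are skipped or repeated at a daylight-saving transition. All function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone, tzinfo

# (min, max) bounds for minute, hour, day-of-month, month, day-of-week.
BOUNDS = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 6)]


def parse_field(field: str, lo: int, hi: int) -> set:
    """Expand one cron field ('*', '*/n', 'a', 'a-b', 'a,b') into a set of ints."""
    values = set()
    for part in field.split(","):
        base, _, step = part.partition("/")
        if base == "*":
            start, end = lo, hi
        elif "-" in base:
            a, b = base.split("-")
            start, end = int(a), int(b)
        else:
            start = end = int(base)
        values.update(range(start, end + 1, int(step) if step else 1))
    return values


def parse_cron(cron: str) -> list:
    """Parse a 5-field expression into five sets of allowed values."""
    parts = cron.split()
    if len(parts) != 5:
        raise ValueError("expected 5 fields")
    return [parse_field(p, lo, hi) for p, (lo, hi) in zip(parts, BOUNDS)]


def matches(fields: list, local: datetime) -> bool:
    """True if the wall-clock time `local` satisfies all five fields."""
    minute, hour, dom, month, dow = fields
    cron_dow = (local.weekday() + 1) % 7  # cron: 0 = Sunday; Python: 0 = Monday
    return (local.minute in minute and local.hour in hour and local.day in dom
            and local.month in month and cron_dow in dow)


def next_fire(cron: str, after: datetime, tz: tzinfo) -> datetime:
    """First whole minute strictly after `after` (timezone-aware) matching in `tz`."""
    fields = parse_cron(cron)
    t = after.astimezone(timezone.utc).replace(second=0, microsecond=0)
    for _ in range(366 * 24 * 60):  # search at most about one year ahead
        t += timedelta(minutes=1)
        local = t.astimezone(tz)
        if matches(fields, local):
            return local
    raise ValueError("no fire time found within a year")


start = datetime(2024, 1, 3, 12, 7, tzinfo=timezone.utc)  # a Wednesday
print(next_fire("*/15 * * * *", start, timezone.utc))
print(next_fire("0 8 * * 1", start, timezone.utc))
```

Passing `zoneinfo.ZoneInfo("Europe/Paris")` as `tz` evaluates the expression against Paris wall-clock time, so `0 8 * * 1` lands at 07:00 UTC in winter and 06:00 UTC in summer.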

Overlap policy

If a run is still executing when the next fire time arrives, overlap_policy decides what happens:

  • skip (default) - do not start the next run until the current one finishes.
  • buffer_one - queue at most one pending run.
  • allow - start the next run regardless.
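The difference between the three policies is easiest to see on a timeline. The simulation below is a sketch of one plausible reading of the list above, not the service's implementation: it assumes `skip` drops a fire that arrives while a run is in progress, `buffer_one` starts its single queued run as soon as the current one finishes, and every run takes the same fixed duration. The function name `simulate` is hypothetical.

```python
def simulate(fire_times, duration, policy):
    """Return the start times of runs for sorted `fire_times` under a policy.

    Times are plain numbers (for example, minutes). Every run is assumed
    to take exactly `duration`.
    """
    if policy == "allow":
        return list(fire_times)  # every fire starts a run, even concurrently

    starts = []
    busy_until = float("-inf")  # time at which the current run finishes
    pending = False             # buffer_one only: is one run queued?

    for t in fire_times:
        # A queued run starts the moment the previous run finishes.
        if pending and busy_until <= t:
            starts.append(busy_until)
            busy_until += duration
            pending = False

        if t >= busy_until:
            starts.append(t)    # nothing running: start immediately
            busy_until = t + duration
        elif policy == "buffer_one" and not pending:
            pending = True      # queue at most one run
        # otherwise the fire is dropped (skip, or buffer already full)

    if pending:
        starts.append(busy_until)  # flush the last queued run
    return starts


fires = [0, 10, 20, 30]  # a fire every 10 minutes, each run takes 25 minutes
for policy in ("skip", "buffer_one", "allow"):
    print(policy, simulate(fires, 25, policy))
```

With these inputs, `skip` starts runs at 0 and 30, `buffer_one` at 0, 25, and 50, and `allow` at every fire time.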

Manage schedules

# List and iterate
for s in client.schedules.iter():
    print(s.name, s.cron, s.status)

# Pause and resume (keeps the schedule, stops firing)
client.schedules.pause(schedule.id)
client.schedules.resume(schedule.id)

# Update the definition
client.schedules.update(
    schedule.id,
    name="hourly-homepage",
    cron="0 */2 * * *",
    job_type="light",
    target_url="https://example.com",
)

# Delete
client.schedules.delete(schedule.id)

If you downgrade your plan below what a schedule requires (too many schedules, an engine your new plan does not include, or an interval that is now too frequent), the affected schedules are automatically paused and you receive a notification. Re-enable them after upgrading, or edit them to fit the new plan.
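After an upgrade, it may help to list which schedules are currently paused before re-enabling them. This sketch relies only on the `client.schedules.iter()` call and the `status` attribute shown above; the helper name `paused_schedules` and the status value `"paused"` are assumptions, so check the value your account actually reports.

```python
def paused_schedules(client, paused_status="paused"):
    """Return the schedules whose status equals `paused_status`.

    `paused_status` defaults to "paused", which is an assumed value.
    The result includes schedules you paused yourself, not only those
    paused automatically after a downgrade.
    """
    return [s for s in client.schedules.iter() if s.status == paused_status]
```

A typical follow-up would be to review the list and then call `client.schedules.resume(s.id)` for each schedule that should fire again.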

Run history

Every fire is recorded, whether it created a job or was skipped:

runs = client.schedules.runs(schedule.id, limit=20)
for run in runs.items:
    print(run.fired_at, run.status, run.job_id)

Run statuses:

  • minted - a job was created (job_id points to it; follow it with the Jobs API or webhooks).
  • skipped_quota - out of credits this period; nothing was charged.
  • skipped_not_allowed - the engine is not included in your current plan.
  • error - an unexpected failure creating the run.
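A quick tally of these statuses shows how many fires did not produce a job. The sketch below works on plain status strings so it runs on its own; the helper name `summarize` is hypothetical.

```python
from collections import Counter

# Statuses documented above that mean no job was created.
NOT_CREATED = {"skipped_quota", "skipped_not_allowed", "error"}


def summarize(statuses):
    """Count runs per status and how many fires produced no job."""
    counts = Counter(statuses)
    missed = sum(n for status, n in counts.items() if status in NOT_CREATED)
    return counts, missed


counts, missed = summarize(["minted", "minted", "skipped_quota", "error"])
print(dict(counts), missed)
```

With the SDK calls shown earlier, the input could be `(run.status for run in client.schedules.runs(schedule.id, limit=20).items)`.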

In the console

The Schedules section of the console lists your schedules, lets you create and edit them, pause or resume with one click, and drill into the run history for each one.