SmartBrowse - webscrape.ai

SmartBrowse runs a saved recipe in a real Chrome session. Recipes click, type, scroll, and paginate. Dispatch them on demand or on a schedule. The minimum dispatch flow is in the API reference; this page goes deeper — recipe design, drift scores, schedules, and when SmartBrowse beats Scrape or SmartScraper.

When SmartBrowse is the right tool

Use SmartBrowse

Data is gated behind JS loading screens
Pagination is JavaScript-driven (URL doesn’t change)
The page shape varies per session and you need consistent navigation

Skip SmartBrowse

The data is on a plain static page — use Scrape or SmartScraper
You want structured JSON from a URL — use SmartScraper
There’s a real JSON API or RSS feed sitting underneath — just call it directly

Authoring a recipe

Recipes are built visually in the SmartBrowse studio. The API only dispatches and polls — there’s no recipe-creation endpoint. A recipe captures three things:

Actions — the sequence of clicks, types, scrolls, and waits to replay.
Extraction shape — the schema applied to each page reached.
Pagination — how to move from one page to the next (next-button click, URL increment, scroll-to-bottom).

Build the smallest recipe that works. A 3-step recipe (search, click first result, extract) is much more resilient than a 12-step one that pre-clicks five filters. Fewer steps means fewer things to break when the page changes.

Run a recipe

curl -X POST https://api.webscrape.ai/v1/smartbrowse/recipes/m3Yc2tFvN8q/run \
  -H "X-API-Key: wsg_live_..."
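For callers working in Python, the same dispatch can be sketched with the standard library. This is a minimal sketch, not an official client: `build_run_request` is a hypothetical helper name, and the only things taken from this page are the URL pattern, the POST method, and the `X-API-Key` header shown in the curl example. It builds the request without sending it, so it makes no assumptions about the response body.

```python
import urllib.request

# Base URL taken from the curl example above.
API_BASE = "https://api.webscrape.ai/v1"


def build_run_request(recipe_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST that dispatches a saved recipe.

    Hypothetical helper: it mirrors the curl call, a body-less POST
    carrying the X-API-Key header.
    """
    url = f"{API_BASE}/smartbrowse/recipes/{recipe_id}/run"
    return urllib.request.Request(
        url,
        method="POST",
        headers={"X-API-Key": api_key},
    )
```

Passing the result to `urllib.request.urlopen` would perform the dispatch; the response fields are described in the API reference rather than assumed here.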

Polling is fine for one-off runs. For scheduled or high-volume runs, set up a webhook on the recipe — completion will POST to your endpoint instead of leaving you with a poll loop open. See Webhooks.

Drift score

When a run completes, data.result.drift is a number between 0 and 1 measuring how much the page has changed since the recipe was authored. 0 means the page looks identical to what the recipe was authored against; higher values mean more drift. There’s no fixed “too high” cutoff, so read drift together with data.items_extracted:

Low drift + expected item count → all good.
High drift + expected item count → fine, but worth a spot-check.
High drift + zero or far fewer items → time to re-author the recipe in the studio.

data.result.warnings lists non-fatal issues: pagination stalled before the configured limit, a recipe step retried, an action target moved. Worth a glance even on completed runs.
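The three-way reading of drift and item count above can be sketched as a small triage function. Everything numeric here is an assumption: the page gives no fixed drift cutoff, so `high_drift` and `min_item_ratio` are hypothetical knobs to tune per recipe, and `assess_run` and its return labels are illustrative names. The "investigate" branch covers low drift with few items, a combination the list above does not address.

```python
def assess_run(
    drift: float,
    items_extracted: int,
    expected_items: int,
    high_drift: float = 0.5,      # assumed threshold; no fixed cutoff is documented
    min_item_ratio: float = 0.5,  # assumed: below this share counts as "far fewer"
) -> str:
    """Triage a completed run from its drift score and item count."""
    if not 0.0 <= drift <= 1.0:
        raise ValueError("drift is documented as a number between 0 and 1")

    drifted = drift >= high_drift
    enough_items = items_extracted >= expected_items * min_item_ratio

    if drifted and not enough_items:
        return "re-author"    # high drift + zero or far fewer items
    if drifted:
        return "spot-check"   # high drift + expected item count
    if enough_items:
        return "ok"           # low drift + expected item count
    return "investigate"      # low drift + few items: not covered by the guide
```

For example, `assess_run(0.8, 0, 40)` returns `"re-author"`, while `assess_run(0.8, 40, 40)` returns `"spot-check"`.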

Scheduling

Schedules are set per recipe from the SmartBrowse dashboard. Standard cron syntax:

Schedule	Cron
Every 15 minutes	`*/15 * * * *`
Hourly on the hour	`0 * * * *`
Daily at 9 AM UTC	`0 9 * * *`
Weekly Monday 6 AM UTC	`0 6 * * 1`

Each scheduled run still bills at 2 credits per page extracted. Set a reasonable cadence and an expected-pages cap so a runaway schedule can’t drain your balance.
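To size a cadence and pages cap before enabling a schedule, a worst-case estimate helps. The sketch below assumes every run reaches the cap and uses the 2-credits-per-page rate stated on this page; `runs_per_day` and `daily_credits` are hypothetical helper names, and the model only handles fixed-interval schedules.

```python
CREDITS_PER_PAGE = 2  # rate stated on this page


def runs_per_day(interval_minutes: int) -> int:
    """Number of runs a fixed-interval schedule fires in 24 hours."""
    if interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    return (24 * 60) // interval_minutes


def daily_credits(interval_minutes: int, pages_cap: int) -> int:
    """Worst-case daily spend: every run extracts pages_cap pages."""
    return runs_per_day(interval_minutes) * pages_cap * CREDITS_PER_PAGE
```

With a 5-page cap, a 15-minute schedule can spend up to `daily_credits(15, 5)` = 960 credits per day, against 240 for an hourly one.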

What about CAPTCHAs?

SmartBrowse runs in a real Chrome session, so most challenge interstitials clear without a manual step in your recipe. If a recipe keeps failing on a specific challenge, email support with the request_id from the failed run.

Limits

Run timeout: 15 minutes per dispatch. Split long crawls across multiple recipes.
Pages per run and runs per 30 days are set by your plan. GET /v1/smartbrowse/usage (free to poll) or the dashboard tells you where you stand.

If a run would push you past your monthly quota, dispatch returns 429 with error.details.reason: sb_runs_per_month. If your balance can’t cover the per-page cost of even one page, dispatch returns 402 with error.code: insufficient_credits.
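A caller can tell these two dispatch failures apart before deciding whether to retry. The sketch below assumes the dotted paths above (`error.details.reason`, `error.code`) map onto nested JSON objects in the response body; `dispatch_failure_reason` and its return labels are hypothetical names, not part of the API.

```python
def dispatch_failure_reason(status: int, body: dict) -> str:
    """Map a failed dispatch response to a coarse reason label.

    Assumes the body nests as {"error": {"code": ..., "details": {"reason": ...}}}.
    """
    error = body.get("error") or {}
    details = error.get("details") or {}

    if status == 429 and details.get("reason") == "sb_runs_per_month":
        # Monthly run quota reached: retrying will not help until the window resets.
        return "monthly_quota_exhausted"
    if status == 402 and error.get("code") == "insufficient_credits":
        # Balance cannot cover even one page: top up before dispatching again.
        return "insufficient_credits"
    return "other"
```

Neither case is worth an automatic retry loop, since both need a change to the account (quota or balance) first.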

Cost

2 credits per page extracted (initial load + each pagination hop). Billed on completion. Cancelled or timed-out runs cost 0. No stealth surcharge — SmartBrowse already runs real Chrome. See Credits for the full table.
