Introducing webscrape.ai: the web, as an API
Send a URL and a description of the fields you want, and get clean, schema-validated JSON back in one call, with the proxies and browsers handled for you.
Scraping the modern web is mostly plumbing. Sites render their content with JavaScript, change their markup on every deploy, and sit behind bot detection that keeps getting better. So teams spend weeks on the parts that have nothing to do with the data they actually want: renting proxies, writing CSS selectors by hand, and patching crawlers that break when someone renames a class.
webscrape.ai does that plumbing for you. You call one endpoint, and the proxies, browsers, and extraction all run behind it.
What you get in one call
Every request runs through the pipeline you'd otherwise have to build and maintain yourself:
- Fetching that gets through. A tiered stack starts with a plain HTTP request and escalates to a stealthed browser only when a site blocks the cheaper option. You pay for the heavy tier only when a page actually forces it.
- Content cleaning. We strip out boilerplate like navigation and ads before extraction runs, so the extractor sees the content and not the chrome around it. Less noise means faster, cheaper extraction.
- AI extraction. Describe the fields you want in plain language, or send a JSON schema. You get back data shaped to match and validated against it, with one automatic repair pass if the first result doesn't conform.
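To make the "escalate only when blocked" idea from the fetching step concrete, here is a minimal sketch in Python. It is not webscrape.ai's implementation: the names (`fetch_with_escalation`, `looks_blocked`, `FetchResult`), the list of block signals, and the stub fetchers are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

# (status_code, body) as returned by any fetcher; a hypothetical shape.
Response = Tuple[int, str]
Fetcher = Callable[[str], Response]

# Assumed block signals; a real detector would look at much more than this.
BLOCKED_STATUSES = {403, 429, 503}


@dataclass
class FetchResult:
    tier: str    # name of the tier that succeeded
    status: int
    body: str


def looks_blocked(status: int, body: str) -> bool:
    """Treat known block statuses and empty bodies as a failed attempt."""
    return status in BLOCKED_STATUSES or not body.strip()


def fetch_with_escalation(
    url: str, tiers: Sequence[Tuple[str, Fetcher]]
) -> Optional[FetchResult]:
    """Try tiers cheapest-first; return the first result that isn't blocked."""
    for name, fetch in tiers:
        status, body = fetch(url)
        if not looks_blocked(status, body):
            return FetchResult(name, status, body)
    return None  # every tier was blocked


# Stub fetchers standing in for a real HTTP client and a real browser.
def plain_http(url: str) -> Response:
    return 403, "Access denied"


def stealth_browser(url: str) -> Response:
    return 200, "<html><h1>Aeron Chair</h1></html>"


result = fetch_with_escalation(
    "https://example.com/product/42",
    [("http", plain_http), ("browser", stealth_browser)],
)
print(result.tier)  # browser
```

Because the tiers are ordered cheapest-first and the loop stops at the first success, the expensive browser fetcher is never called for a page that the plain request can already read.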
Your first request
Point it at a product page and ask for the fields you care about:
curl https://api.webscrape.ai/v1/smartscraper \
  -H "X-API-Key: wsg_live_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/product/42",
    "prompt": "Extract the product name, price, and whether it is in stock"
  }'
You get back what you asked for, ready to use:
{
  "status": "completed",
  "data": {
    "name": "Aeron Chair",
    "price": 1395.00,
    "in_stock": true
  },
  "credits_used": 5,
  "credits_remaining": 495,
  "request_id": "req_..."
}
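If you would rather call the endpoint from code than from curl, the same request might look like the Python sketch below, using only the standard library. The URL, headers, and body fields are taken from the curl example above. The helper names (`build_request`, `parse_response`, `scrape`), the 60-second timeout, and the choice to treat any status other than "completed" as an error are assumptions, since only the "completed" response is shown here.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://api.webscrape.ai/v1/smartscraper"


def build_request(api_key: str, url: str, prompt: str) -> urllib.request.Request:
    """Build the same POST the curl example sends."""
    payload = json.dumps({"url": url, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def parse_response(raw: str) -> dict:
    """Return the extracted fields, or raise if the scrape did not complete.

    Assumption: any status other than "completed" is treated as a failure.
    """
    body = json.loads(raw)
    if body.get("status") != "completed":
        raise RuntimeError(f"scrape did not complete: {body.get('status')!r}")
    return body["data"]


def scrape(api_key: str, url: str, prompt: str, timeout: float = 60.0) -> dict:
    """Send the request and return the `data` object from the response."""
    req = build_request(api_key, url, prompt)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return parse_response(resp.read().decode("utf-8"))
```

Calling `scrape("wsg_live_your_key_here", "https://example.com/product/42", "Extract the product name, price, and whether it is in stock")` with a real key would then return the `data` object as a plain dict. Keeping `build_request` and `parse_response` separate from the network call makes both easy to test offline.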
One wallet, no surprises
Proxies, browser time, and model inference all come out of one credit balance. No separate proxy invoice. No per-gigabyte bandwidth meter. No "browser minutes" line item. And credits are charged on success only, so a fetch that fails costs you nothing.
Grab a key and point it at something you actually need scraped. The free credits are enough to test it on real pages before you commit to anything. Over the next few weeks we'll get into how the fetch tiers decide when to escalate, and why we trust described fields over hand-written selectors.