Skip to main content
POST
/
scrape
Fetch a URL as HTML, markdown, or links
curl --request POST \
  --url https://api.webscrape.ai/v1/scrape \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key: <api-key>' \
  --data '
{
  "website_url": "https://example.com",
  "clean": false,
  "parse_mode": "accurate",
  "tag_truncate": true,
  "extract_links": false,
  "include_tags": [
    "<string>"
  ],
  "exclude_tags": [
    "<string>"
  ],
  "headers": {},
  "max_age": 1,
  "stealth": false
}
'
{
  "status": "completed",
  "request_id": "req_aB3xY9Kp",
  "data": {
    "request_id": "<string>",
    "html": "<string>",
    "cleaned": true,
    "links": [
      {
        "url": "<string>",
        "text": "<string>"
      }
    ],
    "metadata": {
      "title": "<string>",
      "description": "<string>",
      "language": "<string>"
    },
    "latency_ms": 123
  },
  "credits_used": 1,
  "credits_remaining": 499
}

Authorizations

X-API-Key
string
header
required

Generate from the dashboard. Format: wsg_live_<32 base62 chars>.

Body

application/json
website_url
string<uri>
required
Example:

"https://example.com"

clean
boolean
default:false

Convert HTML to cleaned markdown.

parse_mode
enum<string>
default:accurate

Cleaner mode when clean: true. accurate is more forgiving on malformed pages; speed is faster.

Available options:
accurate,
speed
tag_truncate
boolean
default:true

When clean: true, replace inline images with their alt text to reduce noise. Disable to keep the original image tags.

Include a deduplicated list of outbound links.

include_tags
string[]

Whitelist of HTML tags to keep when clean: true.

exclude_tags
string[]

Blacklist of HTML tags to drop when clean: true.

headers
object

Custom request headers forwarded to the fetcher (e.g. a specific User-Agent or Accept-Language). Providing any headers disables URL caching for this request.

max_age
integer

URL-cache opt-in (three-state).

  • Omitted — no cache: every call fetches fresh.
  • >0 — return a cached entry if it's fresher than max_age seconds, otherwise fetch and write.

Stealth requests, custom-headered requests, and URLs with query strings or fragments are never cached.

Required range: x >= 0
stealth
boolean
default:false

Use stealth mode. +2 credits.

Response

Successfully fetched.

status
enum<string>
required

Discriminator for the envelope variant.

Available options:
completed
request_id
string
required

Per-request id. Also returned as the X-Request-ID response header. Include it when reporting issues.

Example:

"req_aB3xY9Kp"

data
object
required
credits_used
integer
required
Example:

1

credits_remaining
integer
required
Example:

499