#extraction

7 posts tagged #extraction.

How to get LLM-ready data (markdown or JSON) from any URL

How to get LLM-ready data from any URL: pull specific fields as JSON, or strip a page to its clean content, with PDFs converted to markdown automatically.

guide · 5 min

What is structured data extraction?

Structured data extraction turns a web page into schema-validated JSON by naming the fields you want, not the CSS selectors that break when a site redesigns.

guide · 5 min

How to turn a product page into JSON in one call

How to scrape a product page into JSON in one API call: send the URL plus the fields you want, get schema-validated data back, and pay only on success.

guide · 5 min

How to pull structured data out of an HTML table

How to pull structured data out of an HTML table in one API call: describe the columns you want and get each row back as schema-validated JSON, no selectors.

guide · 5 min

Why your scraper returns null after a redesign

Why your scraper returns null after a redesign: a catalog of the silent failure modes behind an empty result, and how to turn page drift into a warning instead.

engineering · 6 min

JSON Schema or a plain-language prompt: which to hand the extractor

Plain-language prompt or JSON Schema for AI extraction? You always send a prompt; an output_schema optionally pins field names and types. When to add one.

guide · 4 min

Why structured extraction beats CSS selectors

Structured extraction vs CSS selectors: hand-written selectors break on a redesign; describing the data survives it. How we keep the AI version repeatable.

engineering · 4 min