Extraction modes
The extract field is available on Fetch, Crawl, and Interact. Always pair it with "output": ["json"]; the extracted data is returned in outputs.json.
mode: "schema" — LLM-based extraction
Describe the fields you want in plain English. The API uses an LLM to locate and extract them from the page.
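As a rough illustration, a schema-mode request body might be assembled like this in Python. Only extract, mode, and "output": ["json"] come from this page; the url and fields keys, the helper name, and the example field names are assumptions made for the sketch, so check the API reference for the exact shape.

```python
import json


def build_schema_request(url: str, fields: dict) -> dict:
    """Build a request body for mode "schema".

    NOTE: the "url" and "fields" key names are hypothetical placeholders.
    """
    return {
        "url": url,  # assumed key name for the target page
        "output": ["json"],  # needed so results are returned in outputs.json
        "extract": {
            "mode": "schema",
            # assumed key: field name -> plain-English description
            "fields": fields,
        },
    }


schema_body = build_schema_request(
    "https://example.com/product/123",
    {
        "title": "The product name",
        "price": "The current price, including the currency symbol",
    },
)
print(json.dumps(schema_body, indent=2))
```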
mode: "selectors" — CSS selector extraction
Use CSS selectors when you know the DOM structure. This is deterministic and does not use an LLM.
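A minimal Python sketch of a selectors-mode request body follows. The mode value and the type values text, html, and attr are the documented ones; the url, selectors, selector, and attribute key names, along with the helper itself, are assumptions for illustration only.

```python
from typing import Optional

VALID_TYPES = {"text", "html", "attr"}  # the documented type options


def selector_rule(css: str, type_: str = "text", attribute: Optional[str] = None) -> dict:
    """Describe one field to extract with a CSS selector.

    NOTE: the "selector" and "attribute" key names are hypothetical.
    """
    if type_ not in VALID_TYPES:
        raise ValueError(f"unknown type: {type_!r}")
    if type_ == "attr" and not attribute:
        raise ValueError('type "attr" needs an attribute name')
    rule = {"selector": css, "type": type_}
    if type_ == "attr":
        rule["attribute"] = attribute
    return rule


selectors_body = {
    "url": "https://example.com/product/123",  # assumed key name
    "output": ["json"],  # results are returned in outputs.json
    "extract": {
        "mode": "selectors",
        "selectors": {  # assumed key: field name -> rule
            "title": selector_rule("h1.product-title"),
            "description": selector_rule("div.description", "html"),
            "buy_link": selector_rule("a.buy", "attr", "href"),
        },
    },
}
```

Validating the type locally is a design choice of the sketch: it catches a typo before a request is sent and a credit is spent.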
type options:
"text": inner text content
"html": inner HTML
"attr": value of attribute
mode: "instruction" — free-form LLM instruction
Give the LLM an open-ended instruction when the schema isn’t predictable in advance.
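An instruction-mode request body could look like the following Python sketch. The instruction key inside extract and the url key are assumed names, not confirmed by this page; only mode: "instruction" and "output": ["json"] are taken from the text.

```python
import json


def build_instruction_request(url: str, instruction: str) -> dict:
    """Build a request body for mode "instruction".

    NOTE: the "url" and "instruction" key names are hypothetical.
    """
    if not instruction.strip():
        raise ValueError("instruction must not be empty")
    return {
        "url": url,  # assumed key name
        "output": ["json"],  # results are returned in outputs.json
        "extract": {
            "mode": "instruction",
            "instruction": instruction,  # assumed key name
        },
    }


instruction_body = build_instruction_request(
    "https://example.com/blog/some-post",
    "List every person quoted in the article with their job title, if given.",
)
print(json.dumps(instruction_body, indent=2))
```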
mode: "page" — raw page extraction
Returns the full page as a single field without any structuring. Useful when you want the LLM in your own application to do the structuring.
Extraction errors
If extraction fails or the LLM can’t find the requested fields, the API returns status 422 with code extraction_validation_error. The response includes a diagnostics object explaining which fields failed.
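One way a client might handle this case is sketched below in Python. It assumes that code and diagnostics sit at the top level of the error body and that a successful body nests the data under outputs and then json; the internal shape of diagnostics is not specified here, so the sample value is made up. The function works on an already-decoded response, so no HTTP library is involved.

```python
class ExtractionError(Exception):
    """Raised when the API reports extraction_validation_error."""

    def __init__(self, diagnostics: dict):
        super().__init__("extraction failed validation")
        self.diagnostics = diagnostics


def read_extraction(status: int, body: dict) -> dict:
    """Return the extracted data, or raise ExtractionError on a 422.

    NOTE: the top-level placement of "code" and "diagnostics" is assumed.
    """
    if status == 422 and body.get("code") == "extraction_validation_error":
        raise ExtractionError(body.get("diagnostics", {}))
    return body["outputs"]["json"]


# Hypothetical decoded responses, for illustration only.
ok_body = {"outputs": {"json": {"title": "Blue Widget"}}}
failed_body = {
    "code": "extraction_validation_error",
    "diagnostics": {"price": "not found on page"},  # made-up shape
}

extracted = read_extraction(200, ok_body)

try:
    read_extraction(422, failed_body)
except ExtractionError as err:
    failed_fields = sorted(err.diagnostics)
```

Keeping diagnostics on the exception lets the caller decide whether to retry with a different mode or drop the failing fields.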
Cost
LLM-based extraction (schema, instruction, page) adds 1–3 credits depending on page length. Selector-based extraction (selectors) adds 1 credit.
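For budgeting, the figures above can be turned into a simple range estimate. The Python sketch below covers only the extra credits that extraction adds; the base cost of the request itself is not stated on this page and is left out. The function names are illustrative.

```python
LLM_MODES = {"schema", "instruction", "page"}


def extraction_credit_range(mode: str) -> tuple:
    """Return (min, max) extra credits added by one extraction."""
    if mode in LLM_MODES:
        return (1, 3)  # depends on page length
    if mode == "selectors":
        return (1, 1)
    raise ValueError(f"unknown extraction mode: {mode!r}")


def batch_credit_range(mode: str, pages: int) -> tuple:
    """Return (min, max) extra credits for extracting from several pages."""
    low, high = extraction_credit_range(mode)
    return (low * pages, high * pages)


print(batch_credit_range("schema", 100))     # (100, 300)
print(batch_credit_range("selectors", 100))  # (100, 100)
```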