Question 1

Am I billed for failed requests?

Accepted Answer

No. You are not billed for failed requests or for requests where the crawler is blocked by the target site (this is rare). Credits are consumed only on successful responses.

Question 2

How does the Website Crawler API work?

Accepted Answer

POST a starting URL to /v1/web/crawl. The crawler discovers same-domain links, follows them up to your maxDepth, filters them by your urlRegex pattern, fetches each page, and converts the HTML to clean Markdown. The response is an array of pages, each with its Markdown plus per-page metadata (URL, title, status, depth).
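
As a rough sketch of the request described above, the snippet below builds the JSON body and reads pages back out of a response. Only the path /v1/web/crawl and the parameter names maxDepth and urlRegex come from this FAQ; the `url` field name, the response keys (`url`, `title`, `status`, `depth`, `markdown`), and the sample response are assumptions for illustration, so check the API reference before relying on them. No HTTP call is made here, and the host and authentication are deliberately left out.

```python
import json

CRAWL_PATH = "/v1/web/crawl"  # path from the FAQ; host and auth are not shown here


def build_crawl_request(start_url, max_depth=None, url_regex=None):
    """Return (path, json_body) for a crawl request.

    The "url" key is an assumed field name; maxDepth and urlRegex are the
    parameter names used in this FAQ.
    """
    body = {"url": start_url}
    if max_depth is not None:
        body["maxDepth"] = max_depth
    if url_regex is not None:
        body["urlRegex"] = url_regex
    return CRAWL_PATH, json.dumps(body)


def summarize_pages(pages):
    """Map each page's URL to the length of its Markdown (assumed keys)."""
    return {page["url"]: len(page["markdown"]) for page in pages}


# Hypothetical response: an array of pages with Markdown and metadata.
sample_response = [
    {"url": "https://example.com/", "title": "Home", "status": 200,
     "depth": 0, "markdown": "# Home"},
    {"url": "https://example.com/docs", "title": "Docs", "status": 200,
     "depth": 1, "markdown": "# Docs\n\nHello"},
]

path, body = build_crawl_request(
    "https://example.com/", max_depth=2, url_regex="/docs/.*"
)
summary = summarize_pages(sample_response)
```

To send the request, pass `body` to the HTTP client of your choice along with whatever authentication the service requires.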

Question 3

What's the difference between the Crawler API and the Sitemap API?

Accepted Answer

The Sitemap Extractor reads the URLs the site declares in sitemap.xml: it is fast, does no rendering, and returns only a URL list. The Crawler walks links from the start URL to discover pages (whether or not they're in the sitemap) and also extracts page content as Markdown. Use Sitemap when you only need URLs; use Crawler when you need URLs plus body text.

Question 4

How much does it cost to crawl a website?

Accepted Answer

1 credit per successfully crawled page. Failed pages don't consume credits. Cap costs by setting maxPages and maxDepth. For example, maxDepth=2 and maxPages=50 crawls at most the first 50 pages within 2 hops of the start URL.
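
The billing rule above amounts to counting successful pages, with maxPages as the ceiling. A minimal sketch, assuming each returned page carries an HTTP-style `status` field and that "successful" means a 2xx status (both are assumptions, not confirmed by this FAQ):

```python
CREDITS_PER_PAGE = 1  # from the FAQ: 1 credit per successfully crawled page


def credits_used(pages):
    """Count billed pages, assuming a 2xx status marks a successful crawl."""
    successes = sum(1 for page in pages if 200 <= page["status"] < 300)
    return successes * CREDITS_PER_PAGE


def max_credits(max_pages):
    """Worst-case cost of one crawl: every page up to maxPages succeeds."""
    return max_pages * CREDITS_PER_PAGE


# Hypothetical crawl result: two successes and one failure.
pages = [
    {"url": "https://example.com/", "status": 200},
    {"url": "https://example.com/a", "status": 200},
    {"url": "https://example.com/missing", "status": 404},
]
used = credits_used(pages)
ceiling = max_credits(50)
```

With maxPages=50, `ceiling` is the most a single crawl can cost under this rule, however many links the site has.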

Question 5

Can I filter which pages get crawled?

Accepted Answer

Yes, there are three knobs: (1) urlRegex matches the URLs you want to follow (e.g. "/blog/.*" to scope to a blog), (2) maxDepth caps how many hops from the start URL the crawler will follow, and (3) followSubdomains controls whether subdomains are in scope.
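
To preview what a pattern such as "/blog/.*" would keep, you can run it locally against candidate URLs. This sketch assumes the pattern is searched anywhere in the URL (Python's `re.search`); the service may anchor or match differently, so treat the result as an approximation.

```python
import re


def filter_urls(urls, url_regex):
    """Keep URLs where the pattern matches somewhere (assumed search semantics)."""
    pattern = re.compile(url_regex)
    return [url for url in urls if pattern.search(url)]


# Hypothetical candidate links discovered on a site.
candidates = [
    "https://example.com/blog/launch",
    "https://example.com/pricing",
    "https://example.com/blog/2024/recap",
]
kept = filter_urls(candidates, "/blog/.*")
```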

Question 6

Does it follow subdomains?

Accepted Answer

By default it stays on the apex domain (and treats www as equivalent). Pass followSubdomains: true to also crawl subdomains like docs.example.com or blog.example.com from a start URL on example.com.
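
The scoping rule can be approximated locally to predict which links a crawl would consider. This is a sketch under stated assumptions: it treats the start URL's host (minus a leading "www.") as the apex domain, which is a simplification, and `in_scope` is a hypothetical helper, not part of the API.

```python
from urllib.parse import urlparse


def _strip_www(host):
    """Treat www.example.com and example.com as the same host."""
    return host[4:] if host.startswith("www.") else host


def in_scope(start_url, candidate_url, follow_subdomains=False):
    """Approximate the rule: same domain always, subdomains only on request."""
    base = _strip_www(urlparse(start_url).hostname or "")
    host = _strip_www(urlparse(candidate_url).hostname or "")
    if not base or not host:
        return False  # URL without a host cannot be matched
    if host == base:
        return True
    # A subdomain must end with ".<base>", so notexample.com does not match.
    return follow_subdomains and host.endswith("." + base)
```

For example, with a start URL on example.com, `https://docs.example.com/` is out of scope by default and in scope once `follow_subdomains=True` is passed.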

Question 7

Can I use this for RAG or LLM ingestion?

Accepted Answer

Yes, that's the most common use case. Crawl a docs site or knowledge base, store the returned Markdown in a vector DB, and your LLM can retrieve from the full indexed corpus. The output is already LLM-ready: clean text, no HTML noise, no boilerplate.
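
One way to prepare the returned Markdown for a vector DB is to split each page at its headings and keep the source URL with every chunk. A minimal sketch, assuming pages carry `url` and `markdown` keys (assumed names); embedding and storage are left out because they depend on your stack.

```python
import re

# Zero-width split point before any line starting with 1-6 '#' and a space.
# Splitting on empty matches requires Python 3.7 or later.
HEADING_BOUNDARY = re.compile(r"(?m)^(?=#{1,6} )")


def chunk_markdown(page):
    """Split one page's Markdown into heading-delimited chunks with its URL."""
    parts = HEADING_BOUNDARY.split(page["markdown"])
    return [
        {"url": page["url"], "text": part.strip()}
        for part in parts
        if part.strip()
    ]


# Hypothetical crawled page.
page = {
    "url": "https://example.com/docs",
    "markdown": "# Intro\nHello\n\n## Setup\nRun it",
}
chunks = chunk_markdown(page)
```

This simple splitter also breaks on lines starting with `# ` inside fenced code blocks, so a production pipeline would likely want a real Markdown parser.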

Question 8

Is there a free tier for the website crawler?

Accepted Answer

Yes — the free tier covers thousands of monthly page crawls. A single API key also unlocks single-page web scraping (HTML, Markdown, images), sitemaps, brand data, and the rest of the Context.dev stack.

Website Crawler API

What You Get

Multi-page Markdown extraction

Depth & page limits

URL regex filtering

Subdomain following

How It Works

POST a starting URL

Links are discovered

Pages converted to Markdown

Results returned
