Web Scraping & Crawling
What is Playwright?
A Microsoft-maintained library for driving Chromium, Firefox, and WebKit, headless or headed, through a unified API.
Playwright is an open-source browser automation framework released by Microsoft in 2020. It exposes a single API, available in JavaScript/TypeScript, Python, Java, and .NET, that controls Chromium, Firefox, and WebKit. Many teams have adopted it in place of Puppeteer because of its cross-browser support and its built-in auto-waiting, which removes much of the flakiness that hand-written waits and sleeps introduce into browser tests.
For scraping, Playwright is the default choice when JavaScript rendering is unavoidable. It exposes network interception, request mocking, and waitForResponse hooks that let a scraper grab JSON from an XHR call directly instead of parsing the rendered DOM.
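The XHR-capture approach can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: `captureJson`, `apiPathFragment`, and the `PageLike`/`ResponseLike` interfaces are hypothetical names introduced here. The interfaces mirror only the Playwright methods the sketch calls (`page.goto`, `page.waitForResponse`, `response.url`, `response.ok`, `response.json`) so the block type-checks without the library installed; a real Playwright `Page` should be accepted wherever `PageLike` is expected.

```typescript
// Local stand-ins for the small slice of the Playwright API used below.
// These are assumptions for illustration, not Playwright's own types.
interface ResponseLike {
  url(): string;
  ok(): boolean;
  json(): Promise<unknown>;
}

interface PageLike {
  goto(url: string): Promise<unknown>;
  waitForResponse(
    predicate: (response: ResponseLike) => boolean,
    options?: { timeout?: number },
  ): Promise<ResponseLike>;
}

// Navigate to pageUrl and return the parsed JSON body of the first
// successful response whose URL contains apiPathFragment.
async function captureJson(
  page: PageLike,
  pageUrl: string,
  apiPathFragment: string,
  timeoutMs = 15_000,
): Promise<unknown> {
  // Register the wait BEFORE navigating so a fast response is not missed;
  // Promise.all also surfaces a navigation failure instead of hanging.
  const [response] = await Promise.all([
    page.waitForResponse(
      (r) => r.url().includes(apiPathFragment) && r.ok(),
      { timeout: timeoutMs },
    ),
    page.goto(pageUrl),
  ]);
  // Parse the API payload directly; the rendered DOM is never touched.
  return response.json();
}
```

With the real library, the call would look something like `captureJson(await browser.newPage(), 'https://example.com/products', '/api/products')`, where the URL and path fragment are placeholders for whatever endpoint the target page actually calls.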
For testing, the Playwright Test runner (the @playwright/test package) offers parallel execution, a trace viewer, and visual regression testing through screenshot comparison. It is the closest thing to a one-stop browser-automation library the JS ecosystem has.
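A Playwright Test setup along those lines might be configured as in the sketch below. The values are illustrative assumptions (the `./tests` directory, the retry count, and the trace policy are not from any particular project); `defineConfig`, `devices`, `fullyParallel`, `trace`, and `projects` are options provided by `@playwright/test`.

```typescript
// playwright.config.ts (illustrative values; tune for your project)
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  testDir: './tests',        // assumed location of the spec files
  fullyParallel: true,       // also parallelize tests within a single file
  retries: 1,                // one retry, so 'on-first-retry' traces can be captured
  use: {
    trace: 'on-first-retry', // record a trace only when a test is retried
  },
  // One project per browser engine; the whole suite runs once for each.
  projects: [
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
    { name: 'firefox', use: { ...devices['Desktop Firefox'] } },
    { name: 'webkit', use: { ...devices['Desktop Safari'] } },
  ],
});
```

Running `npx playwright test` then executes every spec against all three projects, and a recorded trace can be opened with `npx playwright show-trace` followed by the path to the trace file.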
In the wild
- →await page.goto(url) followed by page.locator('h1').textContent()
- →Recording a network HAR while a page loads to inspect every request
- →Running a test suite across Chromium, Firefox, and WebKit in parallel
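The patterns above can be combined in one rough sketch: open a page, read its h1, record a HAR, and repeat across browser engines in parallel. The function names (`headingWithHar`, `headingsAcrossBrowsers`), the HAR file naming, and the `*Like` interfaces are hypothetical and introduced only for illustration. The interfaces mirror the Playwright calls involved (`browserType.launch`, `browser.newContext` with `recordHar`, `context.newPage`, `page.goto`, `page.locator`, `locator.textContent`) so the block runs without the library installed.

```typescript
// Local stand-ins for the slice of the Playwright API used below (assumptions).
interface LocatorLike { textContent(): Promise<string | null>; }
interface PageLike {
  goto(url: string): Promise<unknown>;
  locator(selector: string): LocatorLike;
}
interface ContextLike { newPage(): Promise<PageLike>; close(): Promise<void>; }
interface BrowserLike {
  newContext(options?: { recordHar?: { path: string } }): Promise<ContextLike>;
  close(): Promise<void>;
}
interface BrowserTypeLike { name(): string; launch(): Promise<BrowserLike>; }

// Load url in one browser engine, record a HAR, and return the trimmed h1 text.
async function headingWithHar(
  browserType: BrowserTypeLike,
  url: string,
): Promise<string | null> {
  const browser = await browserType.launch();
  try {
    // Hypothetical naming scheme: one HAR file per engine in the working directory.
    const context = await browser.newContext({
      recordHar: { path: `${browserType.name()}.har` },
    });
    const page = await context.newPage();
    await page.goto(url);
    // Locators are strict: this throws if more than one h1 matches.
    const heading = await page.locator('h1').textContent();
    // The HAR is only written to disk once the context is closed.
    await context.close();
    return heading?.trim() ?? null;
  } finally {
    await browser.close();
  }
}

// Run the same check against several engines concurrently.
async function headingsAcrossBrowsers(
  browserTypes: BrowserTypeLike[],
  url: string,
): Promise<Record<string, string | null>> {
  const results: Record<string, string | null> = {};
  await Promise.all(
    browserTypes.map(async (browserType) => {
      results[browserType.name()] = await headingWithHar(browserType, url);
    }),
  );
  return results;
}
```

With the real library, passing `[chromium, firefox, webkit]` imported from `playwright` as `browserTypes` would be the expected usage. For a full test suite, the Playwright Test runner handles this fan-out itself, so a hand-rolled loop like this is mostly useful in scraping scripts.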
How Brand.dev uses Playwright
Endpoints in the Brand.dev API where this concept comes up directly.
FAQ
Playwright vs Puppeteer?
Playwright is cross-browser, has better waiting semantics, and is actively developed by a Microsoft team that includes the original Puppeteer authors. Puppeteer is maintained by Google's Chrome team and remains focused on Chrome and Chromium; it can also drive Firefox but has no WebKit support.
Playwright vs Selenium?
Playwright is generally faster, has a friendlier API, and ships modern testing tools. Selenium has 15+ years of ecosystem and works with browsers Playwright does not (e.g., the real Safari application through Apple's safaridriver, where Playwright drives its own WebKit build instead).
Can Playwright bypass bot detection?
Not on its own. You need stealth plugins, residential proxies, and human-like timing. Even then, well-protected sites detect automation reliably.
Related terms
Headless browser: A real browser engine running without a visible UI, controlled programmatically through an automation API.
Web scraping: Programmatically extracting structured data from websites that were designed to be read by humans.
DOM: The Document Object Model, a tree of objects that represents an HTML document in memory and lets JavaScript manipulate it.