6 min read · advanced · updated just now

Puppeteer headless scrape → grid

When Firecrawl is not enough: scrape with a full browser that handles login and JS rendering, then push the rows to an Instadash grid.

#scraping #puppeteer #browser

What it builds

  • A headless Chromium session with form-based login
  • DOM extraction via page.evaluate()
  • Failure-snapshot capture for selector drift
  • Push to a private grid — credentials never leave the runtime
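The failure-snapshot bullet above could look roughly like the following. This is a minimal sketch, not the recipe's actual code: `withFailureSnapshot`, the `SnapshotPage` type, and the default output directory are hypothetical names chosen for illustration. The only Puppeteer calls it relies on are `page.screenshot()` and `page.content()`.

```typescript
import { mkdir, writeFile } from 'node:fs/promises'
import { tmpdir } from 'node:os'
import { join } from 'node:path'

// Minimal structural type: only the two Page methods this helper calls.
// A real Puppeteer Page should satisfy it.
interface SnapshotPage {
  screenshot(options: { path: string; fullPage?: boolean }): Promise<unknown>
  content(): Promise<string>
}

// Hypothetical helper: run `action`; if it throws (for example because a
// selector no longer matches), save a screenshot and the page HTML, then
// rethrow the original error so the caller still sees the failure.
async function withFailureSnapshot<T>(
  page: SnapshotPage,
  label: string,
  action: () => Promise<T>,
  dir: string = join(tmpdir(), 'scrape-snapshots'), // assumed location
): Promise<T> {
  try {
    return await action()
  } catch (err) {
    await mkdir(dir, { recursive: true })
    const base = join(dir, `${label}-${Date.now()}`)
    await page.screenshot({ path: `${base}.png`, fullPage: true })
    await writeFile(`${base}.html`, await page.content(), 'utf8')
    console.error(`snapshot saved: ${base}.png / ${base}.html`)
    throw err
  }
}
```

Wrapping the extraction step, for instance `withFailureSnapshot(page, 'extract', () => page.evaluate(...))`, leaves a screenshot and an HTML dump to compare against the selectors when the page layout changes.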

The key step

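import puppeteer from 'puppeteer'

// Credentials come from the environment (assumed to be loaded from .env)
const { HN_USER = '', HN_PASS = '' } = process.env

const browser = await puppeteer.launch({ headless: true })
const page    = await browser.newPage()

await page.goto('https://news.ycombinator.com/login')
await page.type('input[name="acct"]', HN_USER)
await page.type('input[name="pw"]',   HN_PASS)
// Click and wait together so the navigation that follows the submit is not missed
await Promise.all([
  page.click('input[type="submit"]'),
  page.waitForNavigation(),
])

// The callback runs inside the page, so it can read the DOM directly
const rows = await page.evaluate(() => { /* scrape DOM */ })
await fetch('https://instadash.io/ingest', { /* … */ })

await browser.close()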

Note: This is the core of the recipe. The full file (including setup, error handling, and the surrounding scaffolding) lives in the GitHub folder linked below; clone or copy it directly.

Run it

git clone https://github.com/instadashio/instadash-recipes
cd instadash-recipes/puppeteer-headless-scrape
npm install
cp .env.example .env       # fill in your keys
npm start
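
Since the run depends on values filled into `.env`, a startup check can catch a missing or blank key before the browser launches. The sketch below is an illustration only: `requireEnv` is a hypothetical helper, and it assumes the variables have already been loaded into `process.env` by whatever mechanism the repo uses (for example a dotenv-style loader or Node's `--env-file` flag).

```typescript
// Hypothetical helper: read required variables from an env-like object and
// fail fast with one message that lists everything missing or blank.
function requireEnv<K extends string>(
  env: Record<string, string | undefined>,
  names: readonly K[],
): Record<K, string> {
  const missing = names.filter((name) => !env[name]?.trim())
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`)
  }
  const out = {} as Record<K, string>
  for (const name of names) out[name] = env[name] as string
  return out
}

// Usage. HN_USER and HN_PASS are the names used in the key step; whether the
// repo's .env.example uses the same names is an assumption.
// const { HN_USER, HN_PASS } = requireEnv(process.env, ['HN_USER', 'HN_PASS'])
```

Passing the env object as a parameter, rather than reading `process.env` inside the helper, keeps the check easy to exercise with plain objects.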

Stack

puppeteer · node · typescript
Full source on GitHub

README, runnable code, .env.example, dependencies — all in one folder.

↗ view on github